.like() and .contains_any() Not Working on Vectorized fileContent Text

Hello Weaviate Community,

I’m trying to filter my collection based on a text property fileContent using the Python client. I’ve tried both .like() and .contains_any(), but neither seems to work. Here are my attempts:
**Using .like()**

condition = Filter.by_property("fileContent").like(f"{v}") if isinstance(v, str) else None

**Using .contains_any()**

condition = Filter.by_property("fileContent").contains_any([v]) if isinstance(v, str) else None

My schema for fileContent is:

{
  "dataType": ["text"],
  "indexFilterable": true,
  "indexRangeFilters": false,
  "indexSearchable": true,
  "moduleConfig": {
    "text2vec-openai": {
      "skip": false,
      "vectorizePropertyName": true
    }
  },
  "name": "fileContent",
  "tokenization": "whitespace"
}

Even though indexFilterable is set to true, both .like() and .contains_any() return no results.

I suspect this is because fileContent is vectorized with text2vec-openai, so string-based search operators don’t work.

Questions:

  1. Can .like() or .contains_any() be used on a vectorized text property?

  2. If not, what’s the recommended way to perform substring search on fileContent without semantic search (near_text)?

  3. Do I need to change the schema (disable vectorization) for these operators to work?

Any guidance or examples would be greatly appreciated!

Thank you!

Good morning,

I think tokenization is the key point here. You can use both operators on this property, as long as it is indexed as filterable and searchable.
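To illustrate why tokenization matters, here is a rough plain-Python model of how I understand these filters to behave with "tokenization": "whitespace". It is only a sketch, not Weaviate's implementation: the helper names (whitespace_tokens, like_matches, contains_any_matches) and the sample text are made up for this example. The assumptions are that the text is split on whitespace with case preserved, that Like patterns are compared against individual tokens (with * matching zero or more characters and ? exactly one), and that contains_any with single-token values looks for exact token matches. As far as I know, vectorization with text2vec-openai does not affect any of this, since filters run against the inverted index.

```python
import re


def whitespace_tokens(text: str) -> list[str]:
    # Model of "whitespace" tokenization: split on whitespace, keep case as-is.
    return text.split()


def like_to_regex(pattern: str) -> re.Pattern:
    # Translate Like wildcards: * = zero or more characters, ? = exactly one.
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def like_matches(pattern: str, text: str) -> bool:
    # Assumption: the pattern must match a whole token, not the whole field.
    regex = like_to_regex(pattern)
    return any(regex.fullmatch(tok) for tok in whitespace_tokens(text))


def contains_any_matches(values: list[str], text: str) -> bool:
    # Assumption: each single-token value must equal some token exactly.
    tokens = set(whitespace_tokens(text))
    return any(v in tokens for v in values)


# Hypothetical fileContent value, for illustration only.
doc = "Quarterly Invoice for ACME-Corp, due 2024-05-01"

print(like_matches("invoice", doc))      # False: case differs from "Invoice"
print(like_matches("Invoice", doc))      # True: exact token
print(like_matches("*voic*", doc))       # True: wildcards give a substring match
print(like_matches("Invoice for", doc))  # False: no single token contains a space
print(contains_any_matches(["ACME"], doc))        # False: the token is "ACME-Corp,"
print(contains_any_matches(["ACME-Corp,"], doc))  # True
```

If this model is right, .like(f"{v}") without wildcards only matches when v is exactly one of the tokens, including case and attached punctuation. For a substring-style match inside a token, the wildcard form Filter.by_property("fileContent").like(f"*{v}*") is probably closer to what you want. If you need case-insensitive matching, a tokenization such as lowercase or word may be a better fit than whitespace.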

Best regards,

Mohamed Shahin
Weaviate Support Engineer
(Ireland, UTC±00:00/+01:00)