Getting “nested query: nested clause at pos 1: invalid search term, only stopwords provided. Stopwords can be configured in class.invertedIndexConfig.stopwords” when conditional filter has an input of single or double quotes

Description

A conditional filter returns the following error when its input is a single or double quote: “nested query: nested clause at pos 1: invalid search term, only stopwords provided. Stopwords can be configured in class.invertedIndexConfig.stopwords”

Server Setup Information

  • Weaviate Server Version: 1.23.7
  • Deployment Method: docker
  • Multi Node? Number of Running Nodes: 1
  • Client Language and Version:
  • Multitenancy?: No

Any additional Information

Basically, we have one free-text input, “skills”, which is filtered with the containsAny operator. Since it is free text, users can enter anything. We noticed that if a user enters a single quote (') or a double quote ("), the vector database returns the error mentioned above.

One quick fix we are considering is to add the single and double quotes to the stopwords additions. Can you give me a sample REST/cURL command that modifies invertedIndexConfig at runtime, without recreating the collection? If that is not possible, is there a better solution?

Please see the sample schema snippet of the Profile class below:

      "class": "Profile",
      "vectorizer": "text2vec-transformers",
      "invertedIndexConfig": {
        "indexNullState": true,
        "indexTimestamps": true,
        "stopwords": {
          "preset": "en",
          "additions": null,
          "removals": ["it"]
        }
      },
      "properties": [
        ...
        ... REMOVED FOR BREVITY
          {
            "name": "skills",
            "description": "Profile tags to showcase their skills, tech tools, etc.",
            "dataType": [
              "text[]"
            ],
            "moduleConfig": {
              "text2vec-transformers": {
                "skip": false,
                "vectorizePropertyName": false
              }
            }
          }
        },
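For reference, the change asked about above amounts to the same class definition with the stopwords "additions" field filled in. The sketch below only builds that payload in Python. The helper name with_stopword_additions is hypothetical, and the trimmed-down profile dict stands in for the full schema. It is an assumption, not confirmed in this thread, that the server accepts this as a runtime update (for example via PUT /v1/schema/Profile) or that quote characters as stopwords would prevent the error.

```python
import copy
import json


def with_stopword_additions(class_config, additions):
    """Return a copy of class_config with extra stopwords merged into additions.

    Hypothetical helper for illustration; it does not talk to Weaviate.
    """
    updated = copy.deepcopy(class_config)
    inverted = updated.setdefault("invertedIndexConfig", {})
    stopwords = inverted.setdefault("stopwords", {})
    # "additions" is null in the original schema, so treat None as an empty list
    existing = stopwords.get("additions") or []
    stopwords["additions"] = sorted(set(existing) | set(additions))
    return updated


# Trimmed-down copy of the Profile class from the snippet above
profile = {
    "class": "Profile",
    "invertedIndexConfig": {
        "indexNullState": True,
        "indexTimestamps": True,
        "stopwords": {"preset": "en", "additions": None, "removals": ["it"]},
    },
}

payload = with_stopword_additions(profile, ["'", '"'])
print(json.dumps(payload["invertedIndexConfig"]["stopwords"]))
```

The original dict is left untouched, so the payload can be compared against the current schema before anything is sent.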

Hi @junbetterway!! Long time no see :slight_smile:

I was not able to reproduce this on 1.26.1 or on 1.23.7.

Please check whether my code matches what you are doing:

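import weaviate
import weaviate.classes as wvc

# Assumes a local Weaviate instance; adjust the connection to your setup
client = weaviate.connect_to_local()

# Start from a clean collection
client.collections.delete("Test")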
collection = client.collections.create(
    "Test",
    vectorizer_config=wvc.config.Configure.Vectorizer.none(),
    properties=[
        wvc.config.Property(
            name="name", data_type=wvc.config.DataType.TEXT
        ),        
        wvc.config.Property(
            name="skills", data_type=wvc.config.DataType.TEXT_ARRAY
        ),
    ],
    inverted_index_config=wvc.config.Configure.inverted_index(
        stopwords_removals=["it"],
        stopwords_preset=wvc.config.StopwordsPreset.EN
    )
)
collection.data.insert({"name": "skill set 1", "skills": ["account", "hr", "sales"]})
collection.data.insert({"name": "skill set 2", "skills": ["engineering", "sales", "finance"]})
collection.data.insert({"name": "skill set 3", "skills": ["it", "training", "other"]})
collection.data.insert({"name": "skill set 4", "skills": ["'it'", "training", "other"]})

Now I can search:

for o in collection.query.fetch_objects(
    filters=wvc.query.Filter.by_property("skills").contains_any(["it"])
).objects:
    print(o.properties)

These are the versions used:

print(weaviate.__version__, "/", client.get_meta().get("version"))
# client version / server version
>>> 4.7.1 / 1.26.1

Let me know what we can do in this code to try replicating this issue.

Thanks!

Yeah long time no post indeed! :smiley:

Anyway, yes, that looks like it would work, but what if you search for a single or double quote? (I'm not a Python dev, so I'm not sure how you would search for a single quote in the code below.)

for o in collection.query.fetch_objects(
    filters=wvc.query.Filter.by_property("skills").contains_any([" \' "])
).objects:
    print(o.properties)
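
One possible client-side workaround is to drop quote-only input before it reaches the filter. The sketch below assumes that the error comes from terms that leave nothing searchable once punctuation is stripped, which this thread has not confirmed. The helper name searchable_terms is hypothetical.

```python
def searchable_terms(raw_terms):
    """Keep only terms that contain at least one letter or digit.

    Hypothetical helper: quote-only or whitespace-only input is discarded,
    everything else is passed through with surrounding whitespace trimmed.
    """
    return [t.strip() for t in raw_terms if any(ch.isalnum() for ch in t)]


user_input = [" ' ", '"', "sales", "'it'"]
terms = searchable_terms(user_input)
print(terms)
```

If terms comes back empty, the skills filter can be skipped entirely; otherwise the list can be passed to contains_any as in the query above.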