Hi! I just started using the Weaviate Python client (version 4.10.4). I have a collection movies that includes the following properties: movie_description (text, used for vector-based search) and movie_tags (text array). movie_tags can have zero, one, or more tags, such as ‘blockbuster’, ‘high IMDb rating’, ‘crowd favorite’, etc.
I would like to define a query Filter to pass to near_text during a semantic search, using weaviate.classes.query.Filter, to do the following:
- filter to search from only objects with an EMPTY movie_tags array
- filter to search from only objects with a NON-EMPTY movie_tags array
Could you provide instructions on correctly building the filter? I’m hoping there’s a better way than using Filter.by_property("movie_tags").contains_any(all_unique_movie_tags)
Pointers to the python client documentation on filtering text arrays based on array size will also be greatly appreciated. Thank you very much!
hi @violin1443 !!
Welcome to our community 
If you want to filter by property length or null state, you first need to create a collection with those options enabled in inverted_index_config, like so:
import weaviate
from weaviate import classes as wvc

# Assumption: a local Weaviate instance; use the connect helper that matches your deployment.
# text2vec_openai also needs an OpenAI API key passed via the connection headers.
client = weaviate.connect_to_local()

client.collections.delete("Test")
collection = client.collections.create(
    name="Test",
    vectorizer_config=[
        wvc.config.Configure.NamedVectors.text2vec_openai(name="default"),
    ],
    # index_null_state enables is_none filters; index_property_length enables length filters
    inverted_index_config=wvc.config.Configure.inverted_index(
        index_null_state=True,
        index_property_length=True
    ),
    properties=[
        wvc.config.Property(name="movie_description", data_type=wvc.config.DataType.TEXT),
        wvc.config.Property(name="movie_tags", data_type=wvc.config.DataType.TEXT_ARRAY),
    ]
)
collection = client.collections.get("Test")
collection.data.insert_many([
    {"movie_description": "Movie desc 1. No tag"},
    {"movie_description": "Movie desc 2. One Tag", "movie_tags": ["tag1"]},
    {"movie_description": "Movie desc 3. Two Tags", "movie_tags": ["tag1", "tag2"]},
    {"movie_description": "Movie desc 4. OverLap tags", "movie_tags": ["tag2", "tag3"]},
])
Now you can perform different searches and filters:
# movies with no tags
filters = wvc.query.Filter.by_property("movie_tags").is_none(True) # change to False if you want movies with tags
# movies with any of the given tags
filters = wvc.query.Filter.by_property("movie_tags").contains_any(["tag3", "tag2"])
# movies with all of the given tags
filters = wvc.query.Filter.by_property("movie_tags").contains_all(["tag3", "tag2"])
# movies with tags count >= 2
# https://weaviate.io/developers/weaviate/search/filters#by-object-property-length
filters = wvc.query.Filter.by_property("movie_tags", length=True).greater_or_equal(2)
query = collection.query.near_text(
    query="some movie",
    filters=filters
)
for o in query.objects:
    print(o.properties)
Let me know if this helps!
Thanks!
This is very helpful! Thank you so much 
The current collection I am working on doesn’t have inverted_index_config specified. Do I have to delete and rebuild the index? Or is there any way to add this index config to the existing collection, to avoid having to vectorize all the texts again?
Thanks a lot!
Hi!
Yes, you will need to create a new collection and copy over the data.
Not all collection configuration is mutable. Here is a list of the ones you can change:
And here is a fairly simple guide on how to migrate your data over:
Thanks!
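For anyone landing here with the same re-vectorization worry, below is a minimal sketch of copying objects into a new collection while reusing the stored vectors. It is not taken from the guide mentioned above. The function name copy_with_vectors is hypothetical, and the sketch assumes that source and target are v4 client collection objects (for example from client.collections.get(...)), that the target was already created with inverted_index_config set, and that both collections use the same named vector configuration.

```python
def copy_with_vectors(source, target):
    """Copy every object from `source` into `target`, reusing stored vectors.

    Assumption: `source` and `target` are Weaviate Python client v4
    collection objects, and `target` already exists with the desired
    inverted_index_config. Returns the number of objects queued for import.
    """
    copied = 0
    # The dynamic batch sends objects in chunks and flushes when the block exits.
    with target.batch.dynamic() as batch:
        # include_vector=True makes the iterator return each object's stored vectors.
        for obj in source.iterator(include_vector=True):
            batch.add_object(
                properties=obj.properties,
                vector=obj.vector,  # supplying the vector means it does not have to be recomputed
                uuid=obj.uuid,      # keep the original IDs
            )
            copied += 1
    return copied
```

Usage would look like copy_with_vectors(client.collections.get("Test"), client.collections.get("TestV2")), where "TestV2" is a hypothetical name for the new collection. After it returns, it is worth checking target.batch.failed_objects for any objects that were rejected.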