Struggling to dynamically build a variable filter. WARNING long post.
I have a FastAPI Python application with a route that receives a request body such as:
{
  "query_text": "Orban Meloni elezioni avversari",
  "lang_model_name": "intfloat/multilingual-e5-large",
  "result_limit": 30,
  "alpha": 0.2
}
In this simple case I can just run a Weaviate hybrid query using query_text (which I vectorize with the model "intfloat/multilingual-e5-large") and the alpha value.
But the user can also fill in up to four more fields in the frontend, and in that case I want to add filtering to the hybrid query. For example, a full request body could be:
{
  "query_text": "Cina",
  "lang_model_name": "intfloat/multilingual-e5-large",
  "result_limit": 10,
  "alpha": 0,
  "author": "Gariazzo",
  "fromIsoDate": "2024-05-11",
  "toIsoDate": "2024-05-29",
  "category": "Alias"
}
In this case I can build the corresponding composite filter object as follows:
from weaviate.classes.query import Filter

# All four conditions are ANDed together; every request field must be set.
the_filter = (
    Filter.by_property("isoEditionDate").greater_or_equal(
        request.fromIsoDate.isoformat()
    )
    & Filter.by_property("isoEditionDate").less_or_equal(
        request.toIsoDate.isoformat()
    )
    & Filter.by_property("author").like(request.author)
    & Filter.by_property("category").like(request.category)
)
The hybrid search is then formulated as follows:
response = wv_artcoll.query.hybrid(
    query=query_string,
    query_properties=[f"{WV_ENTITIES_PROPERTY}^2", WV_VECTOR_PROPERTY],
    vector=query_vector,
    target_vector=graphql_model_name,
    limit=request.result_limit,
    alpha=request.alpha,
    return_metadata=MetadataQuery(score=True, explain_score=True),
    filters=the_filter,
)
So far so good, but here's the catch: any of the four filter values can be missing, so a valid request could look like this:
{
  "query_text": "Cina",
  "lang_model_name": "intfloat/multilingual-e5-large",
  "result_limit": 10,
  "alpha": 0,
  "fromIsoDate": "2024-05-11",
  "toIsoDate": "2024-05-29",
  "category": "Alias"
}
This is very similar to the previous one, but "author" is missing, so request.author is None. The resulting the_filter is therefore invalid, and understandably the query fails with an error such as:
ERROR - Failed to perform hybrid query: Query call with protocol GRPC search failed with message unknown value type <nil>.
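For context, the request bodies above correspond to a model in which the four filter fields are optional. The sketch below uses a plain dataclass as a stand-in for the FastAPI request model; the class name `SearchRequest` and the `from_payload` helper are hypothetical, and it assumes the two date fields are parsed into `datetime.date` objects (which would explain the `.isoformat()` calls in the filter code).

```python
from dataclasses import dataclass
from datetime import date
from typing import Any, Mapping, Optional


@dataclass
class SearchRequest:
    """Hypothetical stand-in for the route's request model."""

    query_text: str
    lang_model_name: str
    result_limit: int
    alpha: float
    # The four optional filter fields: a missing key is left as None.
    author: Optional[str] = None
    fromIsoDate: Optional[date] = None
    toIsoDate: Optional[date] = None
    category: Optional[str] = None

    @classmethod
    def from_payload(cls, payload: Mapping[str, Any]) -> "SearchRequest":
        """Build a request from a decoded JSON body, parsing ISO dates."""
        data = dict(payload)
        for key in ("fromIsoDate", "toIsoDate"):
            if data.get(key) is not None:
                data[key] = date.fromisoformat(data[key])
        return cls(**data)
```

With the last request body shown above, `author` would stay `None`, and passing that `None` into `.like(...)` is what produces the nil-value error.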
So I tried to dynamically build the filter query as follows:
my_filter = None
if request.fromIsoDate is not None:
    from_date_filter = Filter.by_property("isoEditionDate").greater_or_equal(
        request.fromIsoDate.isoformat()
    )
    my_filter = (
        from_date_filter if my_filter is None else my_filter & from_date_filter
    )
if request.toIsoDate is not None:
    to_date_filter = Filter.by_property("isoEditionDate").less_or_equal(
        request.toIsoDate
    )
    my_filter = (
        to_date_filter if my_filter is None else my_filter & to_date_filter
    )
if request.author is not None:
    author_filter = Filter.by_property("author").like(request.author)
    my_filter = (
        author_filter if my_filter is None else my_filter & author_filter
    )
if request.category is not None:
    category_filter = Filter.by_property("category").equal(request.category)
    my_filter = (
        category_filter if my_filter is None else my_filter & category_filter
    )
As you can see, for every request property that is present (not None) I build a corresponding filter object and AND it into the combined filter.
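The same idea can also be written by collecting the conditions that are present in a list and ANDing them in one step. The sketch below is library-agnostic: `combine_all` is a hypothetical helper that works on any objects supporting `&`, and the sets in the example only stand in for filter objects. The v4 client also documents `Filter.all_of([...])`, which should be usable in the same way for a non-empty list of conditions.

```python
from functools import reduce
from operator import and_
from typing import Iterable, Optional, TypeVar

T = TypeVar("T")


def combine_all(conditions: Iterable[Optional[T]]) -> Optional[T]:
    """AND together every condition that is not None.

    Returns None when no condition is present, so the result can be
    handed to a `filters=` argument that accepts None (assumed here).
    """
    present = [c for c in conditions if c is not None]
    if not present:
        return None
    # reduce(and_, [a, b, c]) evaluates (a & b) & c, left to right.
    return reduce(and_, present)


# Stand-in example: sets support `&` (intersection), as filters support AND.
dates_ok = {"doc1", "doc2", "doc3"}
author_ok = None  # author was not supplied in the request
category_ok = {"doc2", "doc3", "doc4"}

matching = combine_all([dates_ok, author_ok, category_ok])
print(sorted(matching))  # ['doc2', 'doc3']
```

In the route, each list entry would be a `Filter.by_property(...)` condition when the corresponding request field is set, and `None` otherwise.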
The filtered hybrid request is identical to the previous one, except that it uses my_filter instead of the_filter, with the same values in the request object.
The problem is that this search never returns anything, so I'm probably not building the filter correctly.
Inspecting a filter object is a bit cumbersome, and an equality method has probably not been implemented, but my_filter and the_filter do not appear to be the same.
Any ideas on how to solve this use case, which looks pretty common (filtering on a variable number of properties/operators/values), with the Python v4 library?
Is there maybe a problem in the implementation of the operator overloading?
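On the operator question, it may help to note that Python parses `a & b & c` as `(a & b) & c`, so a single chained expression and a step-by-step build call `__and__` with the same operands in the same order. The toy classes below (`Cond` and `And` are hypothetical and are not the client's own classes) illustrate this. If the two approaches behave differently, the operands are worth checking first: in the snippets above, the dynamic version passes `request.toIsoDate` without `.isoformat()` and uses `.equal(...)` for the category where the first version uses `.like(...)`.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Cond:
    """Toy leaf condition (hypothetical, not a weaviate class)."""

    name: str

    def __and__(self, other: "Node") -> "And":
        return And(self, other)


@dataclass(frozen=True)
class And:
    """Toy AND node recording its left and right operands."""

    left: "Node"
    right: "Node"

    def __and__(self, other: "Node") -> "And":
        return And(self, other)


Node = Union[Cond, And]

a, b, c = Cond("from_date"), Cond("to_date"), Cond("category")

# Built in a single chained expression.
one_shot = a & b & c

# Built incrementally, mirroring the `my_filter` pattern.
step_by_step: Optional[Node] = None
for cond in (a, b, c):
    step_by_step = cond if step_by_step is None else step_by_step & cond

# Dataclasses compare by value, so the two trees can be compared directly.
print(one_shot == step_by_step)  # True
```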
The Python client version is 4.5.6.
Thanks in advance.