where_filter = {'operator': 'Or', 'operands': [{'path': ['source'], 'operator': 'Equal', 'valueText': 'https://sampleurl.com/data/OR.pdf'}]}
search_results = (
    client.query.get("Data", ['content', 'source'])
    .with_where(where_filter)
    .with_near_text({"concepts": ques})
    .with_limit(4)
    .do()
)
I am running the query above, but the results include a source with the URL https://sampleurl.com/data/OR_mietrecht_A4.pdf
The where filter checks for exact matches, right? Then why is it returning sources with different URLs?
Hi @ananthan-123! Welcome back.
Your property source probably has WORD as its tokenization setting (the default one).
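To illustrate why that would explain the result, here is a rough sketch in plain Python. It does not call Weaviate. The helper word_tokens is a hypothetical stand-in that assumes WORD tokenization lowercases the value and splits it on non-alphanumeric characters, and that an Equal filter on such a property matches when all tokens of the filter value are present. Under those assumptions, your filter URL matches the longer URL as well:

```python
import re


def word_tokens(text: str) -> set[str]:
    """Approximation of WORD tokenization (assumption, not Weaviate's code):
    lowercase the text and split on anything that is not a letter or digit."""
    return {t for t in re.split(r"[^0-9a-z]+", text.lower()) if t}


def field_tokens(text: str) -> set[str]:
    """Approximation of FIELD tokenization: the whole trimmed value is one token."""
    return {text.strip()}


query = "https://sampleurl.com/data/OR.pdf"
stored = "https://sampleurl.com/data/OR_mietrecht_A4.pdf"

# WORD: every token of the filter value also occurs in the stored value.
word_match = word_tokens(query) <= word_tokens(stored)

# FIELD: the single token must be identical, so the two URLs differ.
field_match = field_tokens(query) == field_tokens(stored)

print(word_match, field_match)
```

The stored URL only adds the tokens mietrecht and a4, so a token-based comparison still succeeds, while a whole-field comparison does not.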
This is how you can check that using the new python v4 client:
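import weaviate

client = weaviate.connect_to_local()
try:
    # list_all() returns a dict mapping collection names to their configs
    collections = client.collections.list_all()
    for name, config in collections.items():
        for prop in config.properties:
            print(name, prop.name, prop.tokenization)
finally:
    client.close()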
You will want to set the tokenization of that property to FIELD so that it indexes the whole field as a single token.
Here is more info on tokenization:
Unfortunately, this configuration is not mutable:
So you need to reindex your data into a new class with this property changed. For that, check this migration guide:
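As a minimal sketch of what the new class could look like with the python v4 client: the collection name DataV2 and the function name below are hypothetical, and no vectorizer is configured here, so you would need to add the same vectorizer settings your current Data class uses for near_text to keep working. The weaviate imports sit inside the function only so that the snippet can be loaded without a running instance.

```python
def create_collection_with_field_source(client, name="DataV2"):
    """Create a new collection whose `source` property uses FIELD tokenization.

    `name` is a placeholder; vectorizer settings are intentionally left out
    and should be copied from the existing collection.
    """
    from weaviate.classes.config import DataType, Property, Tokenization

    return client.collections.create(
        name,
        properties=[
            # Free text that is searched semantically keeps the default tokenization.
            Property(name="content", data_type=DataType.TEXT),
            # FIELD indexes the whole URL as one token, so Equal is an exact match.
            Property(
                name="source",
                data_type=DataType.TEXT,
                tokenization=Tokenization.FIELD,
            ),
        ],
    )


# Usage, assuming a local instance is running:
#   import weaviate
#   client = weaviate.connect_to_local()
#   try:
#       create_collection_with_field_source(client)
#   finally:
#       client.close()
```

After creating it, the objects from the old class still have to be copied over, which is what the migration guide covers.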
Let me know if this helps