Issue with Vector Search Accuracy – Struggling with Negative Expressions

Rohini_vaidya · March 10, 2025, 5:34am

Hello everyone,

I am performing vector search on structured data using Weaviate and OpenAI’s Ada model for generating embeddings. However, I am facing an issue with accuracy, particularly when handling negative expressions in queries.

Issue:

For example, the queries:

“Give the name of the person who likes football”
“Give the name of the person who doesn’t like football”

return the same results, even though they should ideally be different. It seems that the model is not properly interpreting negation in queries.

What I’ve Tried:

I referred to the Weaviate HNSW tuning guide and adjusted parameters like efConstruction, ef, and maxConnections, but the issue persists.

Question:

How can I improve the accuracy of vector search to correctly handle negative expressions in queries? Are there any specific strategies, preprocessing steps, or alternative approaches to enhance the understanding of negation?

Any suggestions or insights would be greatly appreciated!

Thanks in advance.

Mohamed_Shahin · March 11, 2025, 1:46pm

Hi @Rohini_vaidya,

This is indeed a challenge in vector search. I would suggest considering Hybrid Search to improve accuracy, especially for queries involving negation. Weaviate supports combining vector search with keyword-based search, which can help capture specific terms like “doesn’t” or “not”. You can adjust the balance between vector and keyword search using the alpha parameter.
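
A minimal sketch of the hybrid search suggestion above, assuming the Weaviate Python client v4, where a collection object exposes `query.hybrid(query=..., alpha=..., limit=...)`. The helper name `hybrid_search` and the collection name `Person` are hypothetical and used only for illustration. An alpha of 0 weights the keyword (BM25) side only and an alpha of 1 weights the vector side only.

```python
from typing import Any


def hybrid_search(collection: Any, query: str, alpha: float = 0.5, limit: int = 5) -> Any:
    """Run a hybrid query against a Weaviate collection (hypothetical helper).

    alpha=0.0 -> keyword (BM25) only, alpha=1.0 -> vector only.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    # Delegates to the client's hybrid query; lower alpha gives exact terms
    # such as "doesn't" more influence on the ranking.
    return collection.query.hybrid(query=query, alpha=alpha, limit=limit)


# Usage (requires a running Weaviate instance; names are assumptions):
# import weaviate
# client = weaviate.connect_to_local()
# people = client.collections.get("Person")
# response = hybrid_search(people, "person who doesn't like football", alpha=0.25)
# for obj in response.objects:
#     print(obj.properties)
# client.close()
```

Trying a few alpha values (for example 0.25, 0.5, 0.75) on the same pair of positive and negated queries is a quick way to see whether the keyword side changes the ranking at all.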

Additionally, have a look at the tokenization config for your properties. The default is WORD, but you could use FIELD if that works better for your use case.

While you’ve already tuned HNSW parameters, it’s worth noting that the ef parameter is crucial for balancing search speed and quality. A higher ef value results in a more extensive search, enhancing accuracy but potentially slowing down the query.

Rohini_vaidya · March 16, 2025, 5:20am

Thank you @Mohamed_Shahin
I have tried both solutions that you suggested, but unfortunately they are not working.

I am still not able to achieve the desired accuracy with hybrid search.

from weaviate.classes.config import DataType, Property, Tokenization

Property(
    name="ABC",
    data_type=DataType.TEXT,
    vectorize_property_name=True,
    tokenization=Tokenization.WORD,
    index_filterable=True,
    index_searchable=True,
)

Despite this configuration, I am still unable to achieve the desired accuracy.

Am I missing something? Are there any alternative approaches or tweaks I should try?

Any guidance would be greatly appreciated.

Thanks in advance!

Mohamed_Shahin · March 17, 2025, 12:33pm

Hey @Rohini_vaidya

It’s definitely a challenging issue. I did some digging and found that some models, such as the Snowflake models, are trained more on “hard negatives.” While that is not exactly the same as handling negation, they might perform better than the current Ada model. I would give that a try as an option.

There’s also a general issue with negations in search. Amazon published a paper on this challenge and how fine-tuning can improve performance:

Another option, which could be tried first, is to add a final reranker or RAG stage to adjust the results after retrieval.
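
A rough sketch of what such a post-retrieval reranking stage could look like, written in plain Python so it runs without a model. The function names and the toy scorer are hypothetical. In practice `score_fn` would more likely be a cross-encoder reranker or an LLM judgment, and the word list below is only a stand-in to show where negation-aware scoring plugs in.

```python
from typing import Callable, List, Tuple

# Toy list of negation cues (an assumption for this demo, far from complete).
NEGATION_WORDS = {"not", "no", "never", "doesn't", "don't", "didn't"}


def is_negated(text: str) -> bool:
    """Return True if the text contains one of the toy negation cues."""
    tokens = text.lower().replace("’", "'").split()
    return any(token in NEGATION_WORDS for token in tokens)


def toy_negation_score(query: str, doc: str) -> float:
    """Score 1.0 when query and document agree on negation, else 0.0."""
    return 1.0 if is_negated(query) == is_negated(doc) else 0.0


def rerank(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Re-order retrieved candidates by score_fn, highest score first."""
    scored = [(doc, score_fn(query, doc)) for doc in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


retrieved = ["Alice likes football", "Bob doesn't like football"]
top = rerank("person who doesn't like football", retrieved, toy_negation_score, top_k=1)
print(top[0][0])
```

The point of the design is that retrieval can stay broad (both sentences come back for either query) while the second stage decides which candidates actually match the polarity of the question.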

Hope this gives you a few ideas to explore!
