Possible bug with relativeScoreFusion

arielmoraes · July 17, 2024, 1:47am

Description

When executing hybrid searches with relativeScoreFusion, documents without any BM25 score can rank above the last BM25 result, which receives a score of 0 after normalization.

IMHO the documents retrieved only via vector search should be assigned a value of 0 for the BM25 part before applying the fusion. That way the last BM25 result would rank higher, because its normalized score would no longer be zero.

Is that a bug or is it by design?

Server Setup Information

Weaviate Server Version: 1.24.8
Deployment: docker
Multi Node? Number of Running Nodes: 1
Client Language and Version: api directly
Multitenancy?: no

Dirk · July 18, 2024, 4:47am

IMHO the documents retrieved only via vector search should be assigned a value of 0 for the BM25 part before applying the fusion. That way the last BM25 result would rank higher, because its normalized score would no longer be zero.

relativeScoreFusion just normalizes linearly from [worst_score, best_score] to [0, 1]. The problem is that if a document does not have a BM25 score, we simply cannot scale it.
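To make the effect concrete, here is a minimal Python sketch of that linear normalization followed by a weighted sum. It is not Weaviate's actual implementation: the function names, the sample scores, the alpha weighting and the handling of tied scores are all assumptions made for illustration.

```python
def min_max_normalize(scores):
    """Linearly rescale {doc_id: score} from [worst, best] to [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # Assumption: if every score is identical, treat them all as best.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def relative_score_fusion(bm25, vector, alpha=0.5):
    """Weighted sum of normalized scores; a missing score contributes nothing."""
    bm25_n = min_max_normalize(bm25)
    vec_n = min_max_normalize(vector)
    docs = set(bm25) | set(vector)
    return {
        doc: (1 - alpha) * bm25_n.get(doc, 0.0) + alpha * vec_n.get(doc, 0.0)
        for doc in docs
    }


# Hypothetical result sets: "c" is the last hit in both searches,
# "d" was found by vector search only and has no BM25 score at all.
bm25 = {"a": 4.0, "b": 2.5, "c": 1.0}
vector = {"a": 0.9, "d": 0.7, "c": 0.5}

fused = relative_score_fusion(bm25, vector)
for doc, score in sorted(fused.items(), key=lambda kv: kv[1], reverse=True):
    print(doc, round(score, 3))
```

With these sample numbers, "c" matched both searches but ends up with a fused score of 0, while "d", which has no BM25 score, ends up above it.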

Is that a bug or is it by design?

It is more a limitation of the current design. In principle, you could compute the missing vector/BM25 scores before fusion, but it is not trivial.

arielmoraes · July 18, 2024, 10:40am

I know the algorithm expects only one worst case, but before the final normalization we could assume a value of 0 for all the documents missing a BM25 score. I don’t know if it’s an oversight, but it could be done right after querying the vector results.

Dirk · July 18, 2024, 11:43am

BM25 scores can be negative, so this won’t work.
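A small Python sketch of the zero-padding idea shows why the sign of the scores matters. The helper name and the sample scores are hypothetical, and this is only an illustration of the argument, not code from Weaviate.

```python
def normalize_with_zero_padding(bm25, all_docs):
    """Assign 0 to docs lacking a BM25 score, then min-max normalize."""
    padded = {doc: bm25.get(doc, 0.0) for doc in all_docs}
    lo, hi = min(padded.values()), max(padded.values())
    if hi == lo:
        # Assumption: with no spread in the scores, nobody gets keyword credit.
        return {doc: 0.0 for doc in padded}
    return {doc: (s - lo) / (hi - lo) for doc, s in padded.items()}


docs = ["a", "b", "c", "d"]  # "d" was found by vector search only

# All BM25 scores positive: the padded 0 becomes the worst case,
# so the last real BM25 hit ("c") keeps a non-zero normalized score.
positive = normalize_with_zero_padding({"a": 4.0, "b": 2.5, "c": 1.0}, docs)

# Some or all BM25 scores negative: the padded 0 is now the best score,
# so the document with no keyword match gets the top keyword score.
negative = normalize_with_zero_padding({"a": -1.0, "b": -2.0, "c": -4.0}, docs)

print(positive)
print(negative)
```

In the positive case the padding behaves as proposed. In the negative case, "d" normalizes to 1.0 and outranks every document that actually matched the keyword query, which is the opposite of the intent.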
