Hybrid search explanation explanation :)

rjalex · May 9, 2024, 7:45am

I am testing my Weaviate collection with hybrid searches run against several different embedding models (using named vectors) and varying alpha values, through a cool new interface:

Can anyone help me better understand the explanation that comes back along with the results? Especially the first one seems confusing.

PS: This frontend was developed by a super cool guy whom I cannot recommend enough. If you need front-end development for your Weaviate-based project, you can contact him at cesarnml@outlook.com

DudaNogueira · May 9, 2024, 7:22pm

Hi, my friend! How are you?

edit:
By default, Weaviate uses ~~Ranked Fusion~~ Relative Score Fusion (see here).

We have a nice blog article on this, for instance:

The rankedFusion algorithm is the original hybrid fusion algorithm that has been available since the launch of hybrid search in Weaviate.

In this algorithm, each object is scored according to its position in the results for the given search, starting from the highest score for the top-ranked object and decreasing down the order. The total score is calculated by adding these rank-based scores from the vector and keyword searches.
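To make the rank-based idea concrete, here is a minimal Python sketch. It is not Weaviate's actual code: the function name `ranked_fusion`, the reciprocal-rank formula `weight / (k + rank)`, and the constant `k = 60` are assumptions chosen for illustration. The only thing taken from the quote above is that each object gets a score based on its position in each result list, and that the two scores are added.

```python
def ranked_fusion(vector_ids, keyword_ids, alpha=0.5, k=60):
    """Combine two ranked lists of object ids into one fused ranking.

    vector_ids / keyword_ids: ids ordered best-first for each search.
    alpha: weight of the vector search (1 - alpha goes to keyword search).
    k: smoothing constant (assumed value, for illustration only).
    """
    scores = {}
    for weight, ranked_ids in ((alpha, vector_ids), (1 - alpha, keyword_ids)):
        for rank, obj_id in enumerate(ranked_ids):
            # Higher positions (smaller rank) contribute a larger score.
            scores[obj_id] = scores.get(obj_id, 0.0) + weight / (k + rank)
    # Sort by fused score, best first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


fused = ranked_fusion(["a", "b", "c"], ["b", "d"], alpha=0.5)
for obj_id, score in fused:
    print(obj_id, round(score, 5))
```

Note that only the positions matter here: an object that barely made first place scores the same as one that won by a wide margin.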

There is also Relative Score Fusion. There is a nice explanation in the source code:

// FusionRelativeScore uses the relative differences in the scores from keyword and vector search to combine the
// results. This method retains more information than ranked fusion and should result in better results.
//
// The scores from each result are normalized between 0 and 1, e.g. the maximum score becomes 1 and the minimum 0 and the
// other scores are in between, keeping their relative distance to the other scores.
// Example:
//
// Input score = [1, 8, 6, 11] => [0, 0.7, 0.5, 1]
//
// The normalized scores are then combined using their respective weight and the combined scores are sorted

Also from that blog article:

In contrast to rankedFusion, however, relativeScoreFusion derives each object's score by normalizing the metrics output by the vector search and keyword search respectively. The highest value becomes 1, the lowest value becomes 0, and others end up in between according to this scale. The total score is thus calculated by a scaled sum of normalized vector similarity and normalized BM25 score.
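As a rough illustration of the normalization and weighted sum described above, here is a small Python sketch. It is not the Weaviate implementation: the helper names `normalize` and `relative_score_fusion`, the dict-based inputs, and the handling of the case where all scores are equal are assumptions. The min-max scaling itself matches the `[1, 8, 6, 11] => [0, 0.7, 0.5, 1]` example from the source comment.

```python
def normalize(scores):
    """Min-max scale a list of scores so the max becomes 1 and the min 0."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # Assumption: when all scores are equal, treat them all as 1.
        return [1.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def relative_score_fusion(vector_scores, keyword_scores, alpha=0.5):
    """Fuse two {object_id: raw_score} dicts (higher raw score = better).

    alpha weights the normalized vector score, (1 - alpha) the keyword score.
    """
    fused = {}
    for weight, raw in ((alpha, vector_scores), (1 - alpha, keyword_scores)):
        if not raw:
            continue  # e.g. no keyword match at all
        ids = list(raw)
        for obj_id, norm in zip(ids, normalize([raw[i] for i in ids])):
            fused[obj_id] = fused.get(obj_id, 0.0) + weight * norm
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


print(normalize([1, 8, 6, 11]))
print(relative_score_fusion({"a": 0.9, "b": 0.5, "c": 0.1},
                            {"b": 4.0, "d": 1.0}, alpha=0.5))
```

Because the raw score gaps survive normalization, an object that is far ahead in one search keeps that advantage in the fused score, which is the extra information ranked fusion throws away.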

Also notice that the "Result Set keyword" part of the explanation will only appear if there is a keyword match against your objects.

Let me know if this helps

Thanks!

Dirk · May 9, 2024, 10:52pm

Since 1.24 the default is relative score fusion

rjalex · May 10, 2024, 6:11am

Thanks a lot my friends.

DudaNogueira · May 10, 2024, 6:09pm

Great! I have created a PR in our docs here to reflect that

Thanks @Dirk !!

Edit: PR merged!
