After applying the reranker, you can access the rerank_score, like so:
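import weaviate.classes as wvc

# `collection` is assumed to be an existing collection with a reranker module enabled
response = collection.query.bm25(
    query="Animals",
    rerank=wvc.query.Rerank(
        prop="question",  # property whose text is passed to the reranker
        query="big"       # query the reranker scores each object against
    ),
    return_metadata=wvc.query.MetadataQuery(certainty=True, score=True)
)
for obj in response.objects:
    print(obj.metadata.rerank_score, obj.properties.get("question"))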
0.01665704 The gavial looks very much like a crocodile except for this bodily feature
0.01395571 Weighing around a ton, the eland is the largest species of this animal in Africa
0.005099818 Heaviest of all poisonous snakes is this North American rattlesnake
0.0005442133 It's the only living mammal in the order Proboseidea
One quick follow-up: does the reranker model (e.g. rerank-english-v2.0) need to be linked to our embedder (e.g. embed-english-v3.0), or is that completely separate?
If I retrieve 10 documents using vector/hybrid/keyword search (with limit = 10), how do I specify that I want to rerank these documents with the cross-encoder (reranker) and then keep only the top 5?
From my understanding, you can still do post-retrieval tweaking using the vector parameter, even with reranking present. This applies at the post-retrieval, pre-rerank stage, yes?
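One way to get "retrieve 10, rerank, keep 5" is to trim the results on the client side. This is only a minimal sketch, assuming the reranker just reorders the `limit` results and that each returned object exposes `metadata.rerank_score` as in the snippet above. The helper name `top_k_by_rerank_score` and the `fake` stand-in objects are made up for illustration and are not part of the Weaviate client.

```python
from types import SimpleNamespace


def top_k_by_rerank_score(objects, k):
    """Return the k objects with the highest metadata.rerank_score."""

    def score(obj):
        value = obj.metadata.rerank_score
        # Objects without a rerank score are ranked last.
        return value if value is not None else float("-inf")

    return sorted(objects, key=score, reverse=True)[:k]


def fake(question, rerank_score):
    # Hypothetical stand-in for one item of `response.objects`.
    return SimpleNamespace(
        properties={"question": question},
        metadata=SimpleNamespace(rerank_score=rerank_score),
    )


results = [fake("a", 0.2), fake("b", 0.9), fake("c", None), fake("d", 0.5)]
top = top_k_by_rerank_score(results, 2)
print([o.properties["question"] for o in top])  # ['b', 'd']
```

With a real query you would pass `response.objects` (fetched with `limit=10`) and `k=5` instead of the stand-ins.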
However, I was not able to find an option to use a generator with this chain, since we can’t specify how many of the most relevant documents to pick after reranking. Is there a way to do that?
E.g. Search → Pick 25 → Rerank → Pick Top 10 → Generate based on top 10?
Currently, I have to move out of the Weaviate framework and query my language model independently with the reranked context, even though I am using the same language model as the generator in my Weaviate pipeline.
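For reference, the workaround described here (Search → Pick 25 → Rerank → Pick Top 10 → Generate) can be sketched roughly as below. This is not a Weaviate feature, just plain Python glue. `search_and_rerank` and `generate` are hypothetical callables you would supply yourself (for example a wrapper around a reranked query and a wrapper around your language model), and the prompt wording is an assumption.

```python
def rerank_then_generate(search_and_rerank, generate, query, fetch=25, keep=10):
    """Fetch `fetch` reranked candidates, keep the best `keep`, then generate."""
    # search_and_rerank(query, limit) is assumed to return (rerank_score, text) pairs.
    candidates = search_and_rerank(query, fetch)
    best = sorted(candidates, key=lambda pair: pair[0], reverse=True)[:keep]
    # Number the kept documents so the model can refer to them.
    context = "\n".join(
        f"[{i}] {text}" for i, (_, text) in enumerate(best, start=1)
    )
    prompt = (
        "Answer using only the context below.\n\n"
        f"{context}\n\nQuestion: {query}"
    )
    return generate(prompt)


def fake_search(query, limit):
    # Stand-in for a reranked search: higher score for lower document index.
    return [(1.0 / (n + 1), f"doc {n}") for n in range(limit)]


def echo_generate(prompt):
    # Stand-in for a language model call: returns the prompt unchanged.
    return prompt


out = rerank_then_generate(fake_search, echo_generate, "big animals")
print(out)
```

Passing the search and the generator in as callables keeps the sketch independent of any particular client version or model provider.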