Description
Hey everyone, I’ve indexed around 1000 documents in Weaviate, and my pipeline works as follows:
- Perform a hybrid search to retrieve the top 100 chunks (not full documents).
- Rerank these 100 chunks using Cohere.
- Send the top 15 reranked chunks to an LLM for generation.
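A minimal sketch of the three steps above, to make the data flow concrete. It is not the actual production code: `retrieve` and `rerank` are hypothetical stand-ins for the Weaviate hybrid query and the Cohere rerank call, and the function name and the defaults (100 and 15) are taken from the description only for illustration.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-ins (assumptions, not real client APIs):
#   Retriever(query, limit) -> chunk texts, e.g. wrapping a Weaviate hybrid query
#   Reranker(query, chunks) -> one relevance score per chunk, e.g. wrapping Cohere rerank
Retriever = Callable[[str, int], List[str]]
Reranker = Callable[[str, List[str]], List[float]]


def retrieve_and_rerank(
    query: str,
    retrieve: Retriever,
    rerank: Reranker,
    retrieve_k: int = 100,
    final_k: int = 15,
) -> List[Tuple[str, float]]:
    """Return the final_k highest-scoring (chunk, score) pairs for the LLM prompt."""
    # Step 1: hybrid search returns the top retrieve_k chunks (not full documents).
    candidates = retrieve(query, retrieve_k)
    # Step 2: the reranker scores every candidate against the query.
    scores = rerank(query, candidates)
    # Step 3: keep only the best final_k chunks; everything else is dropped here.
    ranked = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:final_k]
```

The cut in step 3 is purely score-based, so any chunk that the reranker scores below the 15th position is lost regardless of how it ranked in the initial retrieval.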
The issue I’m facing is this: If a document is very small (say, a one-liner), it often gets excluded from the final 15 chunks, even though it is part of the initial 100 retrieved by Weaviate. On the other hand, when the same piece of information exists as part of a larger document, it more reliably appears in the final top 15.
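One way to make this symptom measurable is to log where short chunks land in the reranked order. The helper below is only a sketch: `short_chunk_ranks` is a hypothetical name, the 12-word threshold for "short" is an arbitrary assumption, and `ranked` is assumed to be the full reranked list of (chunk text, score) pairs sorted best-first, before truncation to the top 15.

```python
from typing import List, Tuple


def short_chunk_ranks(
    ranked: List[Tuple[str, float]],
    max_words: int = 12,   # assumed cutoff for what counts as a "short" chunk
    final_k: int = 15,
) -> List[Tuple[int, int, float, bool]]:
    """Return (rank, word_count, score, kept) for every short chunk.

    `ranked` must already be sorted by reranker score, best first.
    `kept` is True when the chunk would survive the final_k cut.
    """
    report = []
    for rank, (text, score) in enumerate(ranked, start=1):
        n_words = len(text.split())
        if n_words <= max_words:
            report.append((rank, n_words, score, rank <= final_k))
    return report
```

Running this over a handful of queries shows whether the one-liners fall just outside the cutoff (for example, ranks 16 to 25) or far down the list, which points to different fixes.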
This is problematic because some of these short documents contain high-signal content that should ideally be prioritized in generation. I’m using a hybrid reranker setup (Cohere for semantic relevance and BM25 for keyword overlap), and both are supposed to be independent of document length during reranking.
How can I mitigate this issue and ensure that short, high-signal chunks aren’t overlooked in the reranking phase? Any suggestions or best practices would be greatly appreciated!
Server Setup Information
- Weaviate Server Version: 1.29.0
- Deployment Method: WCS
- Multi Node? Number of Running Nodes: High Availability Cluster
- Client Language and Version: weaviate-client==4.11.3
- Multitenancy: Enabled