Storing multiple vectors per doc for hybrid search

Hi,

In my database (a few million legal docs) we have short documents (a single paragraph) and very long ones (the equivalent of 10 pages). Each of them has a title that describes its content.

I want to set up hybrid search with Weaviate. For embeddings I need to split long docs into short chunks and append the title to each chunk.
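To make the chunking step concrete, here is a minimal sketch, assuming a plain word-count splitter. The function name `chunk_with_title`, the `max_words` limit, and the choice to put the title at the front of each chunk are all illustrative assumptions, not anything Weaviate prescribes.

```python
def chunk_with_title(title: str, body: str, max_words: int = 200) -> list[str]:
    """Split `body` into chunks of at most `max_words` words.

    Every chunk carries the title so its embedding keeps the document context.
    Hypothetical helper: a real pipeline would likely split on sentences or
    paragraphs rather than on a raw word count.
    """
    words = body.split()
    if not words:
        # Nothing to split: the title is the only text available to embed.
        return [title]
    chunks = []
    for start in range(0, len(words), max_words):
        piece = " ".join(words[start:start + max_words])
        chunks.append(f"{title}\n{piece}")
    return chunks
```

A short document yields a single chunk, while a 10-page one yields many, each repeating the title. That repetition is exactly what I don't want to leak into the BM25 statistics.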

For BM25 I want to keep all documents intact, for two reasons. First, the very purpose of BM25 (vs. TF-IDF) is to take document length into account in the relevance statistics.
Second, repeating the title (which we usually put in a dedicated field) many times for some documents and not for others will modify the token statistics, and we expect that to make the search less relevant (we plan a real experiment with some measurements; for now it's just an expectation).
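For reference, the length effect can be seen in the textbook Okapi BM25 term score. This is a sketch of the standard formula, not Weaviate's actual implementation; the defaults `k1=1.2` and `b=0.75` are common choices and are assumptions here.

```python
import math

def bm25_term_score(tf: float, doc_len: float, avg_doc_len: float,
                    n_docs: int, doc_freq: int,
                    k1: float = 1.2, b: float = 0.75) -> float:
    """Score of one query term for one document (textbook Okapi BM25).

    `b` controls length normalization: b=0 ignores document length,
    b=1 normalizes fully by doc_len / avg_doc_len.
    """
    # Smoothed inverse document frequency (the non-negative variant).
    idf = math.log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # Documents longer than average get a larger denominator, so a lower score.
    length_norm = 1 - b + b * doc_len / avg_doc_len
    return idf * tf * (k1 + 1) / (tf + k1 * length_norm)

# Same term frequency, different lengths: the short document scores higher.
short_score = bm25_term_score(tf=2, doc_len=50, avg_doc_len=500, n_docs=1000, doc_freq=10)
long_score = bm25_term_score(tf=2, doc_len=5000, avg_doc_len=500, n_docs=1000, doc_freq=10)
```

Since both `tf` and `doc_len` enter the score, indexing title-augmented chunks instead of whole documents would change both inputs, which is the concern above.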

The question is: is there a way to have several vectors for each document and still perform a hybrid search? Or are we forced to have two different indexes, run two searches, and reconcile the results after the retrieval step?
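For clarity, the two-index fallback I have in mind would look roughly like the sketch below: collapse chunk hits from the vector index to their parent documents, then merge with the BM25 ranking using reciprocal rank fusion. The function names, the `(chunk_id, parent_doc_id)` hit format, and the constant `k=60` are assumptions for illustration; none of this is a Weaviate API.

```python
from collections import defaultdict

def collapse_chunks(chunk_hits: list[tuple[str, str]]) -> list[str]:
    """Turn a ranked list of (chunk_id, parent_doc_id) into ranked parent ids.

    Each document keeps the position of its best-ranked chunk.
    """
    seen = set()
    docs = []
    for _chunk_id, doc_id in chunk_hits:
        if doc_id not in seen:
            seen.add(doc_id)
            docs.append(doc_id)
    return docs

def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of doc ids; each list adds 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id to keep the order stable.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical results: vector search over chunks, BM25 over whole documents.
vector_hits = [("d2#0", "d2"), ("d1#3", "d1"), ("d2#1", "d2"), ("d3#0", "d3")]
bm25_hits = ["d2", "d4", "d1"]
fused = reciprocal_rank_fusion([collapse_chunks(vector_hits), bm25_hits])
```

This works, but it means maintaining two indexes and doing the fusion client-side, which is what I would like to avoid.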

My understanding is that BM25 doesn't work with vectors at all… but I'm not sure how it's implemented in Weaviate. IMO BM25 is similar to a keyword search (SELECT * from text WHERE …)

There is a working hybrid search feature with support for two fusion strategies. Still, it only works with 1 doc == 1 vector and does not cover more complex situations. We are testing Vespa, which supports it, FWIW…
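The thread does not name the two fusion strategies. As a generic illustration of score-based (as opposed to rank-based) fusion, here is a minimal sketch that min-max normalizes each result set and blends them with a weight `alpha`. The function names and the `alpha` semantics are assumptions, and this is not claimed to be how Weaviate computes it.

```python
def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale scores to [0, 1] so vector and keyword scores are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every hit as a top hit.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}

def score_fusion(vector_scores: dict[str, float],
                 keyword_scores: dict[str, float],
                 alpha: float = 0.5) -> list[tuple[str, float]]:
    """Blend normalized scores: alpha=1 is pure vector, alpha=0 pure keyword."""
    v, kw = min_max(vector_scores), min_max(keyword_scores)
    fused = {
        doc_id: alpha * v.get(doc_id, 0.0) + (1 - alpha) * kw.get(doc_id, 0.0)
        for doc_id in set(v) | set(kw)
    }
    # Highest fused score first; ties broken by id to keep the order stable.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))
```

Either kind of fusion assumes both result lists refer to the same unit (one document), which is why the 1 doc == 1 vector limitation matters when the vector side is indexed per chunk.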