SPLADE: Any plans on Integration?

Hello Weaviate team,

I’ve been using Weaviate for almost a year now, running hybrid search that combines BM25 with a Voyage embedding model.

I was recently looking at SPLADE and was wondering whether there are any plans to integrate it into keyword search, either as a replacement for BM25 or alongside it.
My dream scenario is a hybrid search over dense vectors + SPLADE + BM25, but dense vectors + SPLADE alone would also make me happy.
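
To make the dream scenario concrete, here is a minimal sketch of how three ranked result lists could be merged using reciprocal rank fusion. It is a generic illustration, not Weaviate's fusion implementation; the function name, the document IDs, and the constant `k=60` are all assumptions made up for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranked list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending), breaking ties by document ID.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical top-3 results from three retrievers for the same query.
dense_hits = ["doc_a", "doc_b", "doc_c"]
splade_hits = ["doc_b", "doc_a", "doc_d"]
bm25_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([dense_hits, splade_hits, bm25_hits])
print(fused)
```

Documents that rank well in several lists rise to the top, which is the behaviour I would hope for from a three-way hybrid.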

From the knowledge cards, I saw you guys were already aware of SPLADE: Sparse Vectors - Weaviate Knowledge Cards

SPLADE-v3 on Huggingface: naver/splade-v3 · Hugging Face

Thanks!

Hi @LiamVDB,

We haven’t prioritized SPLADE, or sparse models more generally, for the following reasons:

  1. There is still not a huge variety of sparse models available, and none of the major embedding providers offer one. Sparse models also score much lower on retrieval benchmarks. For example, compare the BEIR scores reported for the SPLADEv3 model with the current leaderboard: MTEB Leaderboard - a Hugging Face Space by mteb.
  2. A sparse model like SPLADEv3 can beat BM25 on its own, but there is a lack of research on what happens when it is combined with good dense models in a hybrid setting.
  3. In practice, we find that BM25 pairs very well with dense vector search: BM25 handles out-of-distribution tokens, keywords, and identifiers well, while dense vector search handles semantic queries. Part of this comes from how BM25 adapts to each collection's own document frequencies. Conversely, alternative solutions that implement BM25 on top of sparse indexes have had problems handling non-static document frequencies.
  4. Adding a sparse model to an existing dense + BM25 index will necessarily add latency and complexity.
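
To illustrate the document-frequency point in item 3, here is a minimal sketch of the IDF component of BM25, using the common `ln(1 + (N - n + 0.5) / (n + 0.5))` variant. The two toy collections and the helper names are hypothetical, and this is not Weaviate's actual implementation.

```python
import math


def bm25_idf(num_docs, doc_freq):
    """IDF term of BM25 (the non-negative variant used by Lucene-style scorers)."""
    return math.log(1.0 + (num_docs - doc_freq + 0.5) / (doc_freq + 0.5))


def doc_freq(collection, term):
    """Number of documents in the collection containing the term."""
    return sum(1 for doc in collection if term in doc.split())


# Two hypothetical collections: "contract" is ubiquitous in one, rare in the other.
legal = ["contract clause liability", "contract breach remedy", "contract term notice"]
mixed = ["contract clause liability", "gpu kernel launch", "sourdough starter recipe"]

idf_legal = bm25_idf(len(legal), doc_freq(legal, "contract"))
idf_mixed = bm25_idf(len(mixed), doc_freq(mixed, "contract"))

# The same query term gets a low weight where it is common and a high weight
# where it is rare, and both weights shift as documents are added or removed.
print(f"legal: {idf_legal:.3f}  mixed: {idf_mixed:.3f}")
```

Because the weight depends on the current state of the collection, an index that stores precomputed term weights would need those weights refreshed as the collection changes, which is the kind of difficulty with non-static document frequencies mentioned above.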

For the above reasons, we have focused more recently on multi-vector support and on making BM25-style queries faster; see BlockMax WAND: How Weaviate Achieved 10x Faster Keyword Search | Weaviate.

That said, we are open to adding support for SPLADE-type models to Weaviate in the future, but I can’t give exact dates or plans.
