What it says on the tin. Love these models, but the lack of support means I’m tempted to go around Weaviate, which defeats part of the promise of the product for me!
3 models on my wishlist:
Hi @Baptiste_Cumin1!
Welcome to our community!
Is this Voyage embedding the voyage-multilingual-2 one? If that’s the case, good news: it is already supported.
Here is how:
import os

import weaviate
from weaviate import classes as wvc

# Pass the VoyageAI API key so Weaviate can call the embedding API
client = weaviate.connect_to_local(
    headers={
        "X-VoyageAI-Api-Key": os.getenv("VOYAGEAI_APIKEY"),
    }
)

# Remove any previous version of the collection so the example starts clean
client.collections.delete("Voyage")

collection = client.collections.create(
    "Voyage",
    vectorizer_config=wvc.config.Configure.Vectorizer.text2vec_voyageai(
        model="voyage-multilingual-2"
    ),
    properties=[
        wvc.config.Property(name="some_text", data_type=wvc.config.DataType.TEXT)
    ],
)

# The key must match the property name defined above
collection.data.insert(
    {
        "some_text": "This is an example"
    }
)

client.close()
As for Jina, indeed, we only have support for text2vec:
I have raised this with our team and will keep you posted.
Thanks for using Weaviate!
Also, there is a PR pending review for the reranker from JinaAI:
weaviate:main
← MudNam:jinaai-reranker
opened 11:21PM - 21 Jun 24 UTC
### What's being changed:
This PR adds the possibility of using the JinaAI Reranker with different RAG setups:
Research was done to find out which reranker and embedding model combinations work best.
First, many seed-funded gen AI startups are confused about which embeddings, reranker, or RAG setup to use, so let's simplify their problem, if we can, by giving them some data.
Embedding Models: OpenAI, CohereAI (v2.0/v3.0), Jina (small/base), BAAI/bge-large-en, Google PaLM, Voyage
Rerankers: CohereAI, bge-reranker-base, bge-reranker-large.
Hit Rate: Measures the fraction of queries where the correct answer is in the top-k retrieved documents.
Mean Reciprocal Rank (MRR): Evaluates accuracy based on the rank of the highest-placed relevant document.
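To make the two metrics concrete, here is a minimal sketch of how they could be computed. It is not taken from the PR: the function names and the toy data are made up for illustration, and it assumes exactly one relevant document per query.

```python
from typing import Sequence


def hit_rate(ranked: Sequence[Sequence[str]], relevant: Sequence[str], k: int) -> float:
    """Fraction of queries whose relevant document appears in the top-k results."""
    hits = sum(1 for docs, rel in zip(ranked, relevant) if rel in docs[:k])
    return hits / len(relevant)


def mean_reciprocal_rank(ranked: Sequence[Sequence[str]], relevant: Sequence[str]) -> float:
    """Average of 1/rank of the relevant document; a miss contributes 0."""
    total = 0.0
    for docs, rel in zip(ranked, relevant):
        if rel in docs:
            total += 1.0 / (list(docs).index(rel) + 1)
    return total / len(relevant)


# Hypothetical retrieval results: one ranked list of document ids per query
ranked_results = [
    ["d1", "d2", "d3"],     # relevant document at rank 1
    ["d4", "d5", "d6"],     # relevant document at rank 2
    ["d7", "d8", "d9"],     # relevant document at rank 3
    ["d10", "d11", "d12"],  # relevant document not retrieved
]
relevant_docs = ["d1", "d5", "d9", "d0"]

print(hit_rate(ranked_results, relevant_docs, k=2))         # 0.5
print(mean_reciprocal_rank(ranked_results, relevant_docs))  # (1 + 1/2 + 1/3 + 0) / 4
```

A reranker only reorders the retrieved list, so it can raise MRR even when the relevant document was already present, and it raises hit rate when it moves that document into the top k.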
These rerankers greatly refine search results and improve hit rates and MRRs across embeddings.
Key findings were: (1) OpenAI works best with the Cohere and bge rerankers
(2) JinaAI works great with bge-reranker-large
But since JinaAI launched its own Jina Reranker (jina-reranker-v1-base-en), it reaches an average hit rate of 0.8553 and the highest average MRR among its peers at 0.7091. The difference is not large, but customers still need the best for their companies.
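| Embedding model | No Reranker: Hit Rate | No Reranker: MRR | jina-reranker: Hit Rate | jina-reranker: MRR | bge--base: Hit Rate | bge--base: MRR | bce-rer-base_v1: Hit Rate | bce-rer-base_v1: MRR | cohere-reranker: Hit Rate | cohere-reranker: MRR |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| jina-v2-base-en | 0.8053 | 0.5156 | 0.8737 | 0.7229 | 0.8368 | 0.6568 | 0.8737 | 0.7007 | 0.8842 | 0.7008 |
| bge-base-en-v1.5 | 0.7842 | 0.5183 | 0.8368 | 0.6895 | 0.8158 | 0.6586 | 0.8316 | 0.6843 | 0.8368 | 0.6739 |
| bce-embedding-base_v1 | 0.8526 | 0.5988 | 0.8895 | 0.7346 | 0.8684 | 0.6927 | 0.9157 | 0.7379 | 0.9158 | 0.7296 |
| CohereV3-en | 0.7211 | 0.4900 | 0.8211 | 0.6894 | 0.8000 | 0.6285 | 0.8263 | 0.6855 | 0.8316 | 0.6710 |
| Average | 0.7908 | 0.5307 | 0.8553 | 0.7091 | 0.8303 | 0.6592 | 0.8618 | 0.7021 | 0.8671 | 0.6938 |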
![1_tCBbIjV_jLZP1AKLTX7rAw](https://github.com/weaviate/weaviate/assets/112192260/b954e859-a89c-42d3-b44d-3b5057d6595d)
### Review checklist
- [x] Documentation has been updated, if necessary. Link to changed documentation:
- [x] Chaos pipeline run or not necessary. Link to pipeline:
- [x] All new code is covered by tests where it is reasonable.
- [x] Performance tests have been run or not necessary.
Hi!
We have a follow up!
weaviate:main
← wregen:reranker-jina
opened 03:37AM - 22 Jul 24 UTC
### What's being changed:
This PR adds the possibility of using the JinaAI Reranker API at https://api.jina.ai/v1/rerank.
### Review checklist
- [x] Documentation has been updated, if necessary. Link to changed documentation:
- [x] Chaos pipeline run or not necessary. Link to pipeline:
- [x] All new code is covered by tests where it is reasonable.
- [x] Performance tests have been run or not necessary.
The Jina Reranker was added in release v1.26.1 (Hybrid search performance Fix, Tenants create API Fix, New JinaAI Reranker module).
Thanks for using Weaviate!