Description
Can we run multiple container instances of the t2v-transformers module for embedding, so that we can process a large dataset more easily and quickly? I am using the regular Weaviate Docker script with a t2v transformer and a reranker. Does anybody know how to deal with this situation, or has anyone worked on it before?
Hey @Dev_Choudhary,

Yes, you can simply run multiple t2v-transformers inference containers and let Weaviate round-robin across them.
Docker Compose sketch:

```yaml
services:
  weaviate:
    image: semitechnologies/weaviate:1.32.5
    environment:
      ENABLE_MODULES: "text2vec-transformers"
      TRANSFORMERS_INFERENCE_API: "http://t2v-a:8080,http://t2v-b:8080"
  t2v-a:
    image: semitechnologies/transformers-inference:sentence-transformers-all-MiniLM-L6-v2
  t2v-b:
    image: semitechnologies/transformers-inference:sentence-transformers-all-MiniLM-L6-v2
```
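An alternative sketch, if you prefer not to enumerate endpoints by hand: keep a single `t2v` service and scale it to N replicas, letting Docker's embedded DNS spread resolution of the service name across the replicas. This assumes your Docker Compose version honors the `deploy.replicas` key outside Swarm mode; the service names here are placeholders.

```yaml
services:
  weaviate:
    image: semitechnologies/weaviate:1.32.5
    environment:
      ENABLE_MODULES: "text2vec-transformers"
      # One URL; Docker DNS resolves "t2v" across all replicas.
      TRANSFORMERS_INFERENCE_API: "http://t2v:8080"
  t2v:
    image: semitechnologies/transformers-inference:sentence-transformers-all-MiniLM-L6-v2
    deploy:
      replicas: 3
```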
- Add more `t2v-*` services as needed.
- On Kubernetes, scale a Deployment and point `TRANSFORMERS_INFERENCE_API` at the Service URL.
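To make the round-robin behavior concrete, here is a minimal Python sketch of dispatching calls across a list of inference endpoints in rotation. The URLs are placeholders for illustration, not a real deployment, and this is only a model of the idea, not Weaviate's internal implementation.

```python
from itertools import cycle

# Placeholder endpoint URLs for two t2v-transformers containers.
_endpoints = cycle([
    "http://t2v-a:8080",
    "http://t2v-b:8080",
])

def next_endpoint() -> str:
    """Return the next inference endpoint in round-robin order."""
    return next(_endpoints)
```

Each vectorization request would go to `next_endpoint()`, so the load alternates evenly between the containers.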
Weaviate load-balances the vectorization calls, so bulk imports speed up roughly linearly until network or disk becomes the bottleneck.
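To actually keep several inference containers busy during a bulk import, it helps to send batches from multiple client threads. A minimal sketch, assuming a hypothetical `import_chunk` helper that stands in for your real Weaviate batch insert (here it just counts objects so the example is self-contained):

```python
from concurrent.futures import ThreadPoolExecutor

def import_chunk(chunk):
    # Placeholder: replace with a real Weaviate batch insert of `chunk`.
    return len(chunk)

def parallel_import(objects, chunk_size=64, workers=4):
    """Split objects into chunks and import them concurrently.

    Returns the total number of objects imported.
    """
    chunks = [objects[i:i + chunk_size]
              for i in range(0, len(objects), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(import_chunk, chunks))
```

With several in-flight batches, Weaviate has concurrent vectorization requests to spread across the containers, which is what lets the extra replicas pay off.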