Can we have multiple container instances of t2v-transformers for embedding so that we can process a large dataset?

Description

Can we have multiple container instances of t2v-transformers for embedding so that we can process a large dataset (more easily and faster)? I am using the regular Weaviate Docker script with the t2v transformer and reranker. Does anybody know how to deal with this situation, or has anybody worked on this before?

Hey @Dev_Choudhary

Yes — you can run multiple t2v-transformers inference containers behind a single endpoint and let the network layer (Docker's embedded DNS, or a Kubernetes Service) spread requests across them. Weaviate itself just calls the one URL configured in `TRANSFORMERS_INFERENCE_API`.

Docker-Compose sketch:

```yaml
services:
  weaviate:
    image: semitechnologies/weaviate:1.32.5
    environment:
      ENABLE_MODULES: "text2vec-transformers"
      DEFAULT_VECTORIZER_MODULE: "text2vec-transformers"
      # Single URL; Docker DNS resolves "t2v" to all replicas
      TRANSFORMERS_INFERENCE_API: "http://t2v:8080"
  t2v:
    image: semitechnologies/transformers-inference:sentence-transformers-all-MiniLM-L6-v2
    deploy:
      replicas: 2   # or: docker compose up -d --scale t2v=4
```

  • Scale the inference tier (raise the replica count) as the dataset grows.

  • On Kubernetes, scale the inference Deployment and point `TRANSFORMERS_INFERENCE_API` at the Service URL; the Service load-balances across pods.
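For the Kubernetes route, a minimal sketch of the Service in front of the scaled Deployment — the name `t2v-transformers` and the `default` namespace are illustrative assumptions, not anything fixed by Weaviate:

```yaml
# Scale the inference Deployment (name is illustrative):
#   kubectl scale deployment t2v-transformers --replicas=4
apiVersion: v1
kind: Service
metadata:
  name: t2v-transformers
spec:
  selector:
    app: t2v-transformers   # must match the Deployment's pod labels
  ports:
    - port: 8080
      targetPort: 8080
# Then in Weaviate's environment:
#   TRANSFORMERS_INFERENCE_API: "http://t2v-transformers.default.svc.cluster.local:8080"
```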

With requests spread across the inference replicas, bulk imports speed up roughly linearly until inference CPU/GPU, network, or disk becomes the bottleneck.
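If you would rather pre-compute vectors yourself (e.g. a one-off bulk load with bring-your-own vectors), you can round-robin requests across the inference containers directly, since each one serves `POST /vectors`. A minimal sketch — the endpoint URLs are placeholders for your own containers:

```python
from itertools import cycle

# Placeholder URLs for your t2v-transformers containers.
ENDPOINTS = ["http://t2v-a:8080", "http://t2v-b:8080"]

def assign_round_robin(texts, endpoints):
    """Pair each text with an inference endpoint, cycling through
    the endpoint list so load is spread evenly."""
    rr = cycle(endpoints)
    return [(text, next(rr)) for text in texts]

pairs = assign_round_robin(["doc1", "doc2", "doc3", "doc4"], ENDPOINTS)
# Each pair can then be sent as POST {endpoint}/vectors with {"text": ...}
```

From here you would fan the pairs out with a thread pool and send the resulting vectors to Weaviate via batch import.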