Description
How do I configure a collection to use a transformers vectorizer, specifically for a bi-encoder setup? I used the Configurator to create a docker-compose.yml file with separate encoders for queries and passages, which is standard for Dense Passage Retrievers. So far I haven’t found any example that defines a transformers vectorizer, and I haven’t been able to find out what the available options are. All the examples I’ve seen just use wc.Configure.Vectorizer.text2vec_openai().
Relevant pieces of the docker-compose are:
    environment:
      TRANSFORMERS_PASSAGE_INFERENCE_API: 'http://t2v-transformers-passage:8080'
      TRANSFORMERS_QUERY_INFERENCE_API: 'http://t2v-transformers-query:8080'
      ...
      DEFAULT_VECTORIZER_MODULE: 'text2vec-transformers'
      ENABLE_MODULES: 'text2vec-transformers,reranker-transformers'
  t2v-transformers-passage:
    image: cr.weaviate.io/semitechnologies/transformers-inference:facebook-dpr-ctx_encoder-single-nq-base
    ...
  t2v-transformers-query:
    image: cr.weaviate.io/semitechnologies/transformers-inference:facebook-dpr-question_encoder-single-nq-base
    ...
import weaviate
import weaviate.classes as wvc

# Connect to the local Weaviate instance started by docker compose.
# Ports are passed as integers rather than strings.
client = weaviate.connect_to_custom(
    http_host="localhost",
    http_port=8080,
    http_secure=False,
    grpc_host="localhost",
    grpc_port=50051,
    grpc_secure=False,
)

try:
    collection_spacecom = client.collections.create(
        name="mycollection",
        vectorizer_config=wvc.config.Configure.Vectorizer.?????
    )
finally:
    client.close()  # Ensure the connection is closed
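A minimal sketch of what the missing call might look like, assuming the Python client v4 exposes a `text2vec_transformers` helper under `Configure.Vectorizer`. The function name `create_dpr_collection` is hypothetical, and the sketch has not been tested against the setup above. It also assumes that the choice between the passage and query encoders is made by the server from the `TRANSFORMERS_PASSAGE_INFERENCE_API` and `TRANSFORMERS_QUERY_INFERENCE_API` variables, not by anything in the collection definition.

```python
def create_dpr_collection(client, name="mycollection"):
    """Create a collection vectorized by the text2vec-transformers module.

    `client` is expected to be an already-connected Weaviate v4 client.
    """
    # Imported lazily so the sketch can be loaded without the client installed.
    import weaviate.classes as wvc

    return client.collections.create(
        name=name,
        # Assumption: called without arguments, the helper uses the module
        # defaults, and the server sends objects to the passage encoder and
        # queries to the query encoder based on its environment variables.
        vectorizer_config=wvc.config.Configure.Vectorizer.text2vec_transformers(),
    )
```

This would be called as `create_dpr_collection(client)` inside the same `try`/`finally` block, so the connection is still closed if creation fails.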
Server Setup Information
- Weaviate Server Version: Docker image: cr.weaviate.io/semitechnologies/weaviate:1.24.2
- Deployment Method: Docker compose
- Multi Node? Number of Running Nodes:
- Client Language and Version: Python client v4