Description
I have a local Docker image of a specific transformers embedding model, and I am spinning up a Docker container from it. I specify the necessary configuration in a docker-compose.yml file. I am following the official documentation:
https://docs.weaviate.io/weaviate/model-providers/transformers/embeddings
https://docs.weaviate.io/weaviate/model-providers/transformers/embeddings-custom-image#build-a-custom-transformers-model-image
Here is my docker-compose.yml file:
---
services:
  weaviate:
    command:
    - --host
    - 0.0.0.0
    - --port
    - '8080'
    - --scheme
    - http
    image: cr.weaviate.io/semitechnologies/weaviate:1.32.9
    ports:
    - 8080:8080
    - 50051:50051
    volumes:
    - weaviate_data:/var/lib/weaviate
    restart: on-failure:0
    environment:
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      DEFAULT_VECTORIZER_MODULE: text2vec-transformers
      ENABLE_MODULES: text2vec-transformers
      TRANSFORMERS_INFERENCE_API: http://text2vec-transformers:8080
      CLUSTER_HOSTNAME: 'node1'
  text2vec-transformers:
    image: medembed-inference
    ports:
    - 8000:8080
    deploy:
      resources:
        reservations:
          devices:
          - driver: nvidia
            capabilities: [gpu]
    environment:
      ENABLE_CUDA: 1
      NVIDIA_VISIBLE_DEVICES: all
      NVIDIA_DRIVER_CAPABILITIES: all
volumes:
  weaviate_data:
...
After I run docker-compose up -d on the above docker-compose.yml file, I get two running Docker containers: one runs the Weaviate server, and the other runs my local embedding model.
I can test the Weaviate server by running this command:
curl localhost:8080
and the embedding model by running this command:
curl localhost:8000/vectors -H 'Content-Type: application/json' -d '{"text": "foo bar"}'
The second command does return a vector representation of the text.
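For completeness, the same checks from the Python side look roughly like this (a minimal sketch assuming weaviate-client v4 and the requests library; "Articles" is a placeholder for my actual collection name):

import requests
import weaviate

# Call the embedder directly on the host-mapped port, same as the curl test.
# I'm assuming the response JSON carries the embedding under the "vector" key,
# which is what the transformers inference container returns.
resp = requests.post("http://localhost:8000/vectors", json={"text": "foo bar"})
direct_vector = resp.json()["vector"]

# Connect to Weaviate on the host-mapped ports (8080 HTTP, 50051 gRPC).
client = weaviate.connect_to_local()
assert client.is_ready()

# Fetch one stored object together with its vector to compare against the
# direct call; "default" is the unnamed-vector key in the v4 client.
collection = client.collections.get("Articles")  # placeholder name
obj = collection.query.fetch_objects(limit=1, include_vector=True).objects[0]
stored_vector = obj.vector["default"]

print(direct_vector[:5], stored_vector[:5])  # these are what differ for me
client.close()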
However, my question is:
When I create a collection and insert objects into it, the vectors I get back are different from the ones returned by the curl command against localhost:8000, where my embedder is running. This makes me wonder whether I am actually using my local embedder. That is, am I using the correct value for the TRANSFORMERS_INFERENCE_API environment variable? Since I am hosting the embedder locally, should it be localhost:8000 or host.docker.internal:8000? I have tried both, and each time I get the error "Connection reset by peer" and the Weaviate server does not start. I also tried overriding TRANSFORMERS_INFERENCE_API by setting the inference_url parameter in vector_config to localhost:8000 and to host.docker.internal:8000, but then I get a "connection refused" error. A rough sketch of what I am doing follows.
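The collection creation looks roughly like this (a minimal sketch assuming weaviate-client 4.16; "Articles" is again a placeholder name, and I am assuming Configure.Vectors.text2vec_transformers is the right helper to pass inference_url through):

import weaviate
from weaviate.classes.config import Configure

client = weaviate.connect_to_local()

# Create a collection vectorized by the text2vec-transformers module.
# The inference_url below is the override described above; I tried
# "http://localhost:8000" and "http://host.docker.internal:8000",
# and both fail with "connection refused".
client.collections.create(
    "Articles",  # placeholder name
    vector_config=Configure.Vectors.text2vec_transformers(
        inference_url="http://host.docker.internal:8000",
    ),
)

client.close()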
Any help would be greatly appreciated. Thanks.
Server Setup Information
- Weaviate Server Version: 1.32.9
- Deployment Method: Docker
- Multi Node? Number of Running Nodes: 1
- Client Language and Version: Python, weaviate-client 4.16.10
- Multitenancy?: No