Best way to set up multiple Weaviate databases on a single machine?

Description

I am trying to set up multiple Weaviate instances with docker-compose on one machine, each with persistent data storage. My plan was to modify the data storage directories and the ports, but I was wondering whether there is a more robust way to run multiple databases per machine. This is necessary for a few reasons: different projects have different vectorizer requirements and need their data kept in separate databases.

Server Setup Information

  • Weaviate Server Version: latest
  • Deployment Method: docker
  • Multi Node? Number of Running Nodes: 1
  • Client Language and Version: python

hi @blevlabs !

Welcome to our community! :hugs:

You can do that by:

1 - mapping different host ports for each docker-compose Weaviate instance
2 - running all instances behind a reverse proxy (like Traefik) and mapping each server to a subdomain.
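
For option 1, a second compose file can keep the container ports at their defaults and remap only the host side. A minimal sketch (the file name, host ports, and volume name below are assumptions for illustration, not from the thread):

```yaml
# docker-compose.creator.yml (hypothetical name): a second Weaviate
# instance on the same host, remapping only the host-side ports
version: '3.4'
services:
  weaviate:
    image: cr.weaviate.io/semitechnologies/weaviate:1.24.10
    ports:
    - 8081:8080    # host 8081 -> container 8080 (HTTP)
    - 50052:50051  # host 50052 -> container 50051 (gRPC)
    volumes:
    - weaviate_creator_data:/var/lib/weaviate
    environment:
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
volumes:
  weaviate_creator_data:
```

Running each file under its own project name (e.g. `docker compose -p creator -f docker-compose.creator.yml up -d`) also keeps the volumes and default networks of the two instances separate.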

Let me know if this helps :slight_smile:

I see, so I would just need something like this, configured with different ports and file paths for saving:
(8080->8081)
(50051->50052)
(/var/lib/weaviate->/var/lib/weaviate_creator)

---
version: '3.4'
services:
  weaviate:
    command:
    - --host
    - 0.0.0.0
    - --port
    - '8081'
    - --scheme
    - http
    image: cr.weaviate.io/semitechnologies/weaviate:1.24.10
    ports:
    - 8081:8081
    - 50052:50052
    volumes:
    - weaviate_data:/var/lib/weaviate_creator
    restart: on-failure:0
    environment:
      TRANSFORMERS_INFERENCE_API: 'http://t2v-transformers:8081'
      IMAGE_INFERENCE_API: 'http://i2v-neural:8081'
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate_creator'
      DEFAULT_VECTORIZER_MODULE: 'text2vec-transformers'
      ENABLE_MODULES: 'text2vec-transformers,img2vec-neural'
      CLUSTER_HOSTNAME: 'node1'
  t2v-transformers:
    image: cr.weaviate.io/semitechnologies/transformers-inference:mixedbread-ai-mxbai-embed-large-v1
    environment:
      ENABLE_CUDA: '1'
      NVIDIA_VISIBLE_DEVICES: 'all'
    deploy:
      resources:
        reservations:
          devices:
          - capabilities: 
            - 'gpu'
  i2v-neural:
    image: cr.weaviate.io/semitechnologies/img2vec-pytorch:resnet50
    environment:
      ENABLE_CUDA: '1'
      NVIDIA_VISIBLE_DEVICES: 'all'
    deploy:
      resources:
        reservations:
          devices:
          - capabilities: 
            - 'gpu'
volumes:
  weaviate_data:
...

Actually, that's strange. Even after changing the ports, it still shows 8080:

INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8080 (Press CTRL+C to quit)

In a port mapping, the first port (the host side) is the one that needs to be changed.

So

8081:8080

will map port 8081 on your host machine to port 8080 inside the container.

Also, you can run your models in a separate docker compose file and create a network just for those containers.

Every container that needs those models can then be added to that network.
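
A minimal sketch of that pattern, assuming a shared network named `inference-net` (the name is an assumption): the models' compose file declares the named network, and any consumer compose file joins it as external.

```yaml
# models compose file: declare a shared, named network
services:
  t2v-transformers:
    image: cr.weaviate.io/semitechnologies/transformers-inference:mixedbread-ai-mxbai-embed-large-v1
    networks:
    - inference
networks:
  inference:
    name: inference-net

# consumer compose file (separate file, shown here as comments):
# services:
#   weaviate:
#     networks:
#     - inference
#     environment:
#       TRANSFORMERS_INFERENCE_API: 'http://t2v-transformers:8080'
# networks:
#   inference:
#     external: true
#     name: inference-net
```

This way several Weaviate instances can share one set of GPU-backed model containers instead of each compose file starting its own.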

Regarding the log: no problem there.

It only indicates that the container is listening on port 8080.

When you change the ports in the docker compose file, you are only changing the mapping between the host port and the container port.
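
Concretely, for the compose file above that means leaving the right-hand (container) ports at their defaults and editing only the left-hand (host) side:

```yaml
    ports:
    - 8081:8080    # container still listens on 8080 internally (HTTP)
    - 50052:50051  # container still listens on 50051 internally (gRPC)
```

The Uvicorn log will keep reporting 8080 because that is the port inside the container; clients on the host reach it via 8081.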

I see, so what would happen if I need to host multiple copies of the same Weaviate database on the same machine? How can I ensure the container URLs do not cross over between instances?

hi! Not sure I understood :thinking:

You can use the same machine to host different Weaviate servers.

In that case, each server will listen on different ports for HTTP and gRPC. With that, you can have multiple Weaviate instances running.

Docker isolates each service on its own network, so the only part that could overlap is if you map the same host port to more than one container.

Let me know if this clarifies :slight_smile:

Thanks!