[Question] Starting Multiple local instances

Hi there,

I would like to start multiple independent local Weaviate DB instances on a machine. The context is that I’m building a web app using Flask where each user should have an independent Weaviate DB instance. There are also some resources shared across all users. It looks like the options I have are:

  1. Start multiple docker containers and use connect_to_local to connect to those containers. The drawback of this option is that I will have to manually configure the docker containers in docker-compose.yml. Ideally, I’d like to programmatically start those instances. Plus, I am not sure connect_to_local can automatically find the right host and port.
  2. Start multiple embedded instances. It’s unclear from the docs what the difference is between local and embedded instances, so I’m not sure if embedded instances can work for me.

Please advise on the recommended solution.

Thanks!

Hi @Variable,

It depends on why you need independent Weaviate instances.

Is this:
A) for learning purposes, so that each user could run their own functions?
B) to separate user data, where each user’s data will be separated from others?

(A) Separate Databases

If A, then you should probably go the Docker Compose route.
You will need to provide the port and grpc_port for each instance, like this:

import weaviate

# Use the host ports that this instance's container publishes
client = weaviate.connect_to_local(
    port=8080,
    grpc_port=50051,
)
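
If you run several containers, each one needs its own pair of host ports. Here is a minimal sketch of how you could map an instance number to its ports, assuming instance i is published on host ports 8080 + i and 50051 + i. That numbering scheme and the helper names `ports_for_instance` and `connect_to_instance` are hypothetical, just for illustration; use whatever ports your containers actually publish.

```python
def ports_for_instance(index, base_http=8080, base_grpc=50051):
    """Return (http_port, grpc_port) for the index-th local instance.

    Assumes instance `index` is published on base + index. This layout is
    a convention chosen for this example, not something Weaviate requires.
    """
    if index < 0:
        raise ValueError("index must be non-negative")
    return base_http + index, base_grpc + index


def connect_to_instance(index):
    """Connect to the index-th local instance (its container must be running)."""
    # Imported here so the port helper can be used without the client installed
    import weaviate

    http_port, grpc_port = ports_for_instance(index)
    return weaviate.connect_to_local(port=http_port, grpc_port=grpc_port)
```

With this, `connect_to_instance(2)` would try ports 8082 and 50053. Remember to call `client.close()` when you are done with each client.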

(B) Multitenancy

If B, then you could use multitenancy, and make your collections multitenant.
With multitenant collections, each tenant has their own independent data.

For example, you could create an Emails collection with multi-tenancy enabled:

from weaviate.classes.config import Configure

client.collections.create(
    name="Emails",
    # Enable multi-tenancy on the new collection
    multi_tenancy_config=Configure.multi_tenancy(
        enabled=True,
        auto_tenant_creation=True
    )
)

Store user email content in a separate tenant:

emails = client.collections.get("Emails")
emails_userA = emails.with_tenant("UserA")
emails_userA.data.insert({ ... }) # add email content for UserA

And query data for that tenant:

response = emails_userA.query.near_text("license update")
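
In a web app you would probably derive the tenant name from the logged-in user rather than hard-coding "UserA". A minimal sketch, assuming you would rather not use raw user identifiers (such as email addresses) as tenant names. The helper `tenant_name_for` and the `user_` prefix are hypothetical, not part of the Weaviate client:

```python
import hashlib


def tenant_name_for(user_id: str) -> str:
    """Derive a stable tenant name from an application user ID.

    Hashing keeps the name to letters, digits, and an underscore, and the
    same user always maps to the same tenant.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return f"user_{digest[:32]}"
```

You could then call `emails.with_tenant(tenant_name_for(current_user_id))` in each request, where `current_user_id` is whatever identifier your Flask app already has for the user.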