Using Multiple Nodes with Tenancy

dhanshew72 · July 19, 2024, 6:13pm

Description

How do I know my setup is using multiple nodes with tenancy, or how would I verify this state? I’ve been running a 2-node cluster and playing around with the setup by loading data. However, I’m not seeing what I would expect to see to confirm that this setup is working, and I wanted some guidance.

When I hit the endpoint /v1/nodes/<MY_COLLECTION_NAME>, I get:

{"nodes":[{"batchStats":{"queueLength":0,"ratePerSecond":3},"gitHash":"1ea5766","name":"node0","shards":null,"status":"HEALTHY","version":"1.25.7"},{"batchStats":{"queueLength":0,"ratePerSecond":0},"gitHash":"1ea5766","name":"node1","shards":null,"status":"HEALTHY","version":"1.25.7"}]}

I have 2 tenants loaded at the moment, so I expected to see at least a value for shards on node0. The only logs I see that confirm the nodes are connected are debug logs:

"msg":" memberlist: Initiating push/pull sync with: node0

These logs just keep repeating.

However, I’m not clear whether these nodes are being used to store tenants individually or whether I need to change my setup in some way to search across multiple nodes. I’m also nowhere near filling up a single node. The only sign I’ve seen is that, while I was developing this, I would get error messages if node1 wasn’t reachable.

Any advice on this? I just want to prove that multiple nodes will be used when loading data for multiple tenants.

Server Setup Information

Weaviate Server Version: 1.25.7
Deployment Method: Docker on AWS ECS
Multi Node? Number of Running Nodes: 2
Client Language and Version: 4.6.5
Multitenancy?: Yes.

DudaNogueira · July 26, 2024, 2:58pm

hi @dhanshew72 !!

Sorry for the delay here.

Missed this one

Were you able to figure this out?

Weaviate should distribute the tenants across different available nodes.

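Here is a test I did using the latest version, 1.26.1, with 3 nodes:

import weaviate
from weaviate import classes as wvc

# Assumes a locally reachable cluster; adjust the connection for your deployment.
client = weaviate.connect_to_local()

# Drop any previous copy, then create a multi-tenant collection that creates tenants on first insert.
client.collections.delete("MyMTCollection")
collection = client.collections.create(
    "MyMTCollection",
    multi_tenancy_config=wvc.config.Configure.multi_tenancy(enabled=True, auto_tenant_activation=True, auto_tenant_creation=True),
    vectorizer_config=wvc.config.Configure.Vectorizer.none()
)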

Now let’s check that we have 0 shards so far:

for node in client.cluster.nodes(output="verbose"):
    print(node.name, len(node.shards))

outputs:

weaviate-0 0
weaviate-1 0
weaviate-2 0
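If you are checking over REST instead of the Python client, as in the original post, note that the nodes endpoint only lists shards when verbose output is requested, which would explain the "shards":null above. Here is a minimal sketch using only the standard library. The base URL is an assumption, and nodes_url, shard_counts and fetch_shard_counts are hypothetical helper names, not part of any Weaviate client:

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def nodes_url(base_url: str, collection: str) -> str:
    """Build the nodes endpoint URL for one collection with verbose output."""
    return f"{base_url.rstrip('/')}/v1/nodes/{quote(collection)}?output=verbose"


def shard_counts(payload: dict) -> dict:
    """Map node name -> number of shards; a null 'shards' field counts as 0."""
    return {
        node["name"]: len(node.get("shards") or [])
        for node in payload.get("nodes", [])
    }


def fetch_shard_counts(base_url: str, collection: str) -> dict:
    """Query a running cluster (needs network access to the server)."""
    with urlopen(nodes_url(base_url, collection)) as response:
        return shard_counts(json.load(response))
```

Something like fetch_shard_counts("http://localhost:8080", "MyMTCollection") should then give you one count per node, assuming the server is reachable without authentication.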

Now let’s add 100 tenants with some sample data:

for i in range(100):
    tenant_data = f"T{i}"
    collection.with_tenant(tenant_data).data.insert({"text": tenant_data})

This is how it is distributed (it varies for every run) after adding the tenants and content:

weaviate-0 40
weaviate-1 28
weaviate-2 32
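If you want to see which tenants ended up on which node, and not just the counts, something along these lines should work. It is a sketch that assumes each tenant shows up as a shard named after the tenant, and that the node and shard objects expose name, shards and collection attributes like the ones used above. tenants_by_node is a hypothetical helper, and the demo uses stand-in objects so it runs without a cluster:

```python
from types import SimpleNamespace


def tenants_by_node(nodes, collection_name):
    """Map node name -> sorted shard (tenant) names belonging to one collection."""
    placement = {}
    for node in nodes:
        # Register the node even if it holds no shards for this collection.
        names = placement.setdefault(node.name, [])
        for shard in node.shards or []:
            if shard.collection == collection_name:
                names.append(shard.name)
    return {name: sorted(shards) for name, shards in placement.items()}


# Stand-in objects that mimic the shape of a verbose nodes response (made-up data).
demo_nodes = [
    SimpleNamespace(name="weaviate-0", shards=[
        SimpleNamespace(collection="MyMTCollection", name="T0"),
        SimpleNamespace(collection="OtherCollection", name="shard-x"),
    ]),
    SimpleNamespace(name="weaviate-1", shards=[
        SimpleNamespace(collection="MyMTCollection", name="T1"),
    ]),
    SimpleNamespace(name="weaviate-2", shards=None),
]

demo_placement = tenants_by_node(demo_nodes, "MyMTCollection")
print(demo_placement)
```

Against a live cluster you would pass client.cluster.nodes(output="verbose") instead of demo_nodes.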

Let me know if that helps.

Thanks!

dhanshew72 · July 26, 2024, 4:30pm

Apologies, I should have reported back. I got this working now.

jinx · February 25, 2025, 5:14pm

Will the tenants be distributed such that each node has roughly the same amount of data?

dhanshew72 · February 25, 2025, 5:23pm

No, they’re based on the tenant size and stay on the same node. Each tenant is considered a shard, at least on 1.24.

DudaNogueira · February 25, 2025, 6:39pm

That is also true for 1.29.

jinx · February 26, 2025, 2:50am

I am sorry, let me rephrase. I am running a Weaviate cluster on Kubernetes. I have multiple Weaviate pods running across multiple Kubernetes nodes. Let’s assume I have 4 pods running across 4 nodes, and each pod was allocated 8 GB of memory. So if one pod can hold n vectors, I know that my Weaviate cluster can support roughly 4*n vectors. Correct?

Now instead of specifying sharding config for the collection, if I make the collection multi-tenant, can I assume that my cluster will still be able to support 4*n vectors? Does Weaviate distribute tenants evenly across all pods, or do I need to manage this manually?

DudaNogueira · February 27, 2025, 5:33pm

hi @jinx !

That’s correct. This is assuming you have a replication factor of 4 (so each object will be replicated 4 times across the cluster). Each collection/tenant will then have shards spread across all 4 nodes.

If you have 4 nodes and a replication factor of 3, Weaviate will allocate shards on the nodes that are using the fewest resources.

We do not yet have a shard movement feature. It is planned, in experimental mode, for Weaviate 1.30.
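For capacity planning, a rough back-of-envelope sketch may help. It assumes that every replica is a full copy costing as much memory as the original, and that shards end up spread evenly, which is not guaranteed. unique_vector_capacity is a hypothetical helper, not a Weaviate API:

```python
def unique_vector_capacity(node_count: int, vectors_per_node: int, replication_factor: int) -> int:
    """Estimate how many distinct vectors fit when each one is stored replication_factor times."""
    if not 1 <= replication_factor <= node_count:
        raise ValueError("replication_factor must be between 1 and node_count")
    # Total slots in the cluster, divided by the number of copies kept per vector.
    return (node_count * vectors_per_node) // replication_factor


n = 1_000_000  # made-up per-pod capacity, for illustration only
print(unique_vector_capacity(4, n, 1))  # no extra copies: 4 * n
print(unique_vector_capacity(4, n, 4))  # every node holds every vector: n
```

Under these assumptions, 4 pods give you roughly 4*n distinct vectors only with a replication factor of 1. Higher replication factors trade that capacity for redundancy.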

Let me know if that helps!

Thanks!

jinx · March 3, 2025, 3:57am

Okay, understood. Thank you!
