Understanding sharding in multi-tenant collections

Hi,

I’m trying to understand how sharding works in multi-tenant collections.

When I create a regular sharded collection (without multi-tenancy) that contains 300,000 data objects on a 3-node cluster, the data is sharded, and each node holds a shard with 100,000 objects.

My understanding is that this setup doesn’t work the same way with multi-tenancy. Specifically, if I have a multi-tenant collection and one of the tenants has 300,000 records, all 300,000 records would be stored on a single shard on one node, and there’s no way to distribute them evenly across multiple nodes.

Is my understanding correct?

Hello @izharg,

Welcome to our community! It’s great to have you here :slightly_smiling_face:

Yes, your understanding is correct.

In a multi-tenant setup, each tenant is allocated its own dedicated shard, meaning that all data for a specific tenant resides in a single shard on one node. If a tenant has 300,000 records, they would all be stored within that one shard.

Multi-tenant sharding is designed to keep each tenant’s data isolated within a single shard.
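
To make the difference concrete, here is a toy model in plain Python. It is only a sketch, not the database's actual placement code: the hash function, the function names (`shard_for_object`, `place_regular`, `place_multi_tenant`) and the shard labels are all made up for illustration. It contrasts spreading objects over 3 shards by hashing their IDs with giving each tenant exactly one shard.

```python
import hashlib
from collections import Counter


def shard_for_object(object_id: str, shard_count: int) -> int:
    """Toy hash-based placement (an assumption, not the real algorithm)."""
    digest = hashlib.sha256(object_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def place_regular(object_count: int, shard_count: int) -> Counter:
    """Regular collection: each object is hashed to one of several shards."""
    counts = Counter()
    for i in range(object_count):
        counts[shard_for_object(f"obj-{i}", shard_count)] += 1
    return counts


def place_multi_tenant(tenant_sizes: dict[str, int]) -> Counter:
    """Multi-tenant collection: one shard per tenant holds all of its objects."""
    return Counter(
        {f"shard-{tenant}": size for tenant, size in tenant_sizes.items()}
    )


if __name__ == "__main__":
    # Roughly 100,000 objects land on each of the 3 shards.
    print(place_regular(300_000, 3))
    # All 300,000 objects of the single tenant land on one shard.
    print(place_multi_tenant({"tenant-1": 300_000}))
```

In the regular case the split is only approximately even, because it depends on the hash. In the multi-tenant case the tenant name alone decides the shard, so a large tenant cannot be spread out.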

But if I have 3 tenants, each with 100,000 records, in a 3-node cluster, will the shards for these 3 tenants be distributed across the 3 nodes? Tenant 1 shard on node 1, tenant 2 shard on node 2, …

Distributed across our 3-node cluster:

Node 1:

• Tenant 1 (shard): 100,000 records

• Tenant 2 (shard): 100,000 records

• Tenant 3 (shard): 100,000 records

Node 2 (Same as Node 1):

• Tenant 1 (shard): 100,000 records

• Tenant 2 (shard): 100,000 records

• Tenant 3 (shard): 100,000 records

Node 3 (Same as Node 1):

• Tenant 1 (shard): 100,000 records

• Tenant 2 (shard): 100,000 records

• Tenant 3 (shard): 100,000 records


Ohh okay, understood. Thank you!
