How do virtual shards work when upscaling?

Say I have 1 Weaviate instance and I configure my class to have 10 shards. What happens when I start adding more Weaviate instances to my cluster? Are they moved around automatically? Are there API methods I can use to move shards around manually?

Also, are there recommendations for the maximum number of vector objects per shard? I’m noticing my queries getting slower as I add more data, but I’m not sure how valuable it would be to create a new class with a higher shard count.

Hi @andersfylling, great questions!

What happens when I start adding more Weaviate instances to my cluster? Are they moved around automatically? Are there API methods I can use to move shards around manually?

The answer to both is not yet, but it’s on the roadmap. Please leave an upvote here; it’ll help us with prioritization. At the moment, shards are “stuck” on a node once they are assigned to it. Both manual and automatic shard movement are planned, though.

Also, are there recommendations for the maximum number of vector objects per shard?

Typically, you should never need more shards than you have nodes (you called them “instances”, we typically call them “nodes” in our comms). There are a few exceptions, though. If you have a multi-tenancy case, for example, Weaviate will create one shard per tenant in the background. The reasons why are outlined in this blog post.
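To make the rule of thumb concrete, here is a minimal sketch of a class definition whose shard count matches the number of nodes. It assumes the schema field shardingConfig.desiredCount; the helper build_class_definition and the class name "Article" are made up for illustration and are not part of any Weaviate client.

```python
import json


def build_class_definition(class_name: str, node_count: int) -> dict:
    """Build a class definition with one shard per node (hypothetical helper)."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return {
        "class": class_name,
        # Assumption: desiredCount sets the number of shards for this class.
        # One shard per node follows the rule of thumb above.
        "shardingConfig": {"desiredCount": node_count},
    }


if __name__ == "__main__":
    # Example: a 3-node cluster and an illustrative class called "Article".
    print(json.dumps(build_class_definition("Article", 3), indent=2))
```

The resulting dictionary would be what you send when creating the class. Since shards currently stay on the node they were assigned to, the count is worth deciding before you import data.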

Hi @etiennedi, as you mentioned, shards currently can’t be moved across nodes. What should I do, then, if my nodes get full and I try to add a new node?
As of now, the shards of an existing class cannot be moved to new nodes, so how can I achieve horizontal scaling for existing classes?