Questions on sharding

Weaviate Version: 1.22.5
Setup: Single Docker Container
Objects: 2 million+, multi-tenant (hundreds of tenants)

I saw an option for horizontal scaling in the Weaviate docs, but I'm confused about how to actually enable it. I have a single Docker container running with all the data, and I want to move it to an EKS cluster. I saw an option for replicas in the Helm chart and raised it to 2. I then restored the data from the Docker container to the EKS cluster using the /restore endpoint, but only one of the two replicas received all the data. My understanding is that, by default, the data isn't automatically sharded across the two. But all the options I see for enabling sharding seem to imply that I have to import the data myself using a Python script or similar. (Is this a correct assumption, or is there an easier way?)

What if I want to increase the number of replicas in the future as the data grows? (Is increasing the number of replicas in Helm enough, or do I have to do something else?)

Hi @kubre!

That’s right. If you want to increase replication or spread shards across different nodes, you need to create a new cluster, create the collection there with the desired configuration, and then migrate the data into it.
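As a rough illustration of what "a proper collection with the desired configuration" could look like, here is a minimal sketch that only builds the class definition as a plain dict. The class name `Document` and the helper `build_class_definition` are hypothetical. The keys `replicationConfig.factor`, `shardingConfig.desiredCount` and `multiTenancyConfig.enabled` are the schema settings as I understand them, so please check them against the docs for your version. The sketch also assumes that in a multi-tenant collection each tenant lives in its own shard, so it leaves out the shard count in that case.

```python
from typing import Any, Dict, Optional


def build_class_definition(
    name: str,
    replication_factor: int,
    desired_shards: Optional[int] = None,
    multi_tenant: bool = False,
) -> Dict[str, Any]:
    """Build a class definition dict (hypothetical helper, not part of any client)."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")

    definition: Dict[str, Any] = {
        "class": name,
        # Number of nodes that should hold a copy of each shard.
        "replicationConfig": {"factor": replication_factor},
    }

    if multi_tenant:
        # Assumption: with multi-tenancy each tenant gets its own shard,
        # so no explicit shard count is set here.
        definition["multiTenancyConfig"] = {"enabled": True}
    elif desired_shards is not None:
        # Number of shards to spread the collection over.
        definition["shardingConfig"] = {"desiredCount": desired_shards}

    return definition


# "Document" is a placeholder class name for this example.
definition = build_class_definition("Document", replication_factor=2, multi_tenant=True)
print(definition)
```

With the v3 Python client, a dict like this could then be passed to `client.schema.create_class(definition)` on the new cluster before the tenants and objects are re-imported.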

In upcoming releases we will be shipping features that help with this, such as Raft consensus and improved dynamic sharding. These should make this kind of operation easier.

Thanks!