The meaning of 'replicas' in the Helm configuration's values.yaml file

Hello.
I have a question.

I configured the k8s cluster by setting the values.yaml file of Helm as follows.


replicas: 3
updateStrategy:
  type: RollingUpdate
resources: {}

I’ve checked that, when configured this way, 3 pods and 3 Persistent Volumes (PVs) are created.
I would like to know whether the number specified in the replicas setting is for shards or replicas.
If I import data with this configuration, is it sharded into three or replicated into three?

Hi @Rio! Welcome to our community :hugs:

In our Helm chart, the number you specify as replicas is the number of nodes that k8s will spin up for your cluster.

This will, in fact, activate all the replication architecture, as described here:

Let me know if this helps :slight_smile:

Thanks!

Thank you for your response.
I understand that the replicas setting in the Helm values file is not directly related to the number of data replicas.
The number of replicas is set through ‘replication_config’ when creating a collection. (Replication | Weaviate - Vector Database)
If so, with replicas set to 3 in the Helm values file, if I create a collection without specifying a separate ‘replication_config’ and import data, will the data be distributed and stored across the three nodes? Can this be called sharding?

Hi!

When you set replicas to 3 in your Helm chart, Kubernetes will spin up 3 Weaviate servers and join them into a cluster with 3 nodes.

Now, you can have a collection with a replication factor of one. This means each object is stored only once, on one of your nodes.

If you create a collection on a cluster that has 3 nodes and set the replication factor to 3, the collection will have 3 copies of all its data, one on each of those 3 nodes.

By default, the replication factor is one. So if you want your collection with a different replication factor, you need to explicitly define that.
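
To make "explicitly define" a bit more concrete, here is a minimal sketch. It assumes the collection is created through Weaviate's REST schema definition, where the replication factor is carried in `replicationConfig.factor`. The helper `build_collection_definition` and the collection name `Article` are made up for illustration, and the snippet only builds the request body; it does not contact a server.

```python
import json


def build_collection_definition(name, replication_factor=None):
    """Build a collection (class) definition as a plain dict.

    Hypothetical helper for illustration only; nothing is sent to Weaviate.
    """
    definition = {"class": name}
    if replication_factor is not None:
        # The replication factor is a per-collection setting,
        # separate from the `replicas` value in the Helm chart.
        definition["replicationConfig"] = {"factor": replication_factor}
    return definition


# No replicationConfig given: the server-side default factor applies.
default_def = build_collection_definition("Article")

# Explicitly ask for 3 copies of the collection's data.
replicated_def = build_collection_definition("Article", replication_factor=3)

print(json.dumps(replicated_def, indent=2))
```

The point of the sketch is only that the Helm `replicas` value decides how many nodes exist, while the replication factor is chosen per collection.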

This is not sharding. Check here for the difference:

Let me know if this is clear!

Thanks!

Hi, dear @DudaNogueira

Thank you very much for your response.
Before asking the above question, I created a 3-node cluster and set the number of replicas to 3.
Then I imported data into the cluster with the default options.
I checked that the storage usage of all 3 nodes was increasing.
So I wondered whether it was sharding.
Now, from your reply, I understand that this is not sharding.

I also understand how to use the replication feature.
However, I am not sure exactly how to use sharding.

I found a way to use sharding in the document below.

It’s described as follows.
“desiredCount” : defaults to the number of nodes in the cluster.

If so, is the data automatically sharded across the nodes in the cluster when I do not explicitly specify shardingConfig?

Could you please explain it to me?

Thank you again for your help!

Hi! That is right.

By default, Weaviate will set the number of shards to the number of nodes in your cluster. If you want more shards, you can set desiredCount to a higher number.
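
As a rough illustration of why storage grows on every node even with the default settings, here is a toy model of shard placement. It is only a sketch: the round-robin assignment, the function `place_shards`, and the node names `weaviate-0` to `weaviate-2` are assumptions made for this example, and Weaviate's real placement logic may differ.

```python
def place_shards(node_names, desired_count=None, replication_factor=1):
    """Toy round-robin shard placement (not Weaviate's actual algorithm)."""
    if desired_count is None:
        # Default described above: one shard per node in the cluster.
        desired_count = len(node_names)
    if replication_factor > len(node_names):
        raise ValueError("replication factor cannot exceed the number of nodes")

    placement = {node: [] for node in node_names}
    for shard in range(desired_count):
        # Each shard is copied to `replication_factor` distinct nodes.
        for replica in range(replication_factor):
            node = node_names[(shard + replica) % len(node_names)]
            placement[node].append(f"shard-{shard}")
    return placement


nodes = ["weaviate-0", "weaviate-1", "weaviate-2"]  # assumed node names

# Defaults: 3 shards, factor 1. Every node holds a different third of the data.
print(place_shards(nodes))

# Factor 3: every node holds a copy of every shard.
print(place_shards(nodes, replication_factor=3))
```

In this model, the default case spreads one shard to each node, which matches the observation that all three volumes fill up during import, while a replication factor of 3 puts a full copy of the data on every node.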

Thank you again, and have a wonderful day!
