I’m looking for a solution to set a global replication factor for vector indexes. The official documentation only mentions specifying this parameter when creating a vector index. I’m using an open-source software where vector indexes are created dynamically without specifying this parameter, and the database already has many vector indexes. Is there a way to add a global parameter and automatically replicate the existing vector indexes to match the desired replication factor?
The request load isn’t heavy. Currently, there’s only a single node, which isn’t robust for production use. I’m looking for a way to ensure high availability for Weaviate, not focused on high throughput.
hi @Zhenghua_Liu !!
Welcome to our community
Thank you very much! I have just realized that we do indeed have this feature; however, it was undocumented.
the environment variable you want is:
REPLICATION_MINIMUM_FACTOR
Once you set this for your cluster, that value will be used as the replication factor for the collections you create, unless you explicitly specify one when creating a collection.
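To make the precedence concrete, here is a minimal sketch of the rule as described above. It is not Weaviate's actual implementation: the helper name `effective_replication_factor` is hypothetical, and the fallback to a factor of 1 when the variable is unset is an assumption made only for illustration.

```python
import os
from typing import Mapping, Optional

ENV_VAR = "REPLICATION_MINIMUM_FACTOR"


def effective_replication_factor(
    explicit_factor: Optional[int] = None,
    env: Optional[Mapping[str, str]] = None,
) -> int:
    """Hypothetical helper that models the rule described in this thread."""
    env = os.environ if env is None else env
    # A factor given explicitly at collection creation takes precedence.
    if explicit_factor is not None:
        return explicit_factor
    # Otherwise use the cluster-wide environment variable.
    # Assumption: fall back to a single replica when the variable is unset.
    return int(env.get(ENV_VAR, "1"))


if __name__ == "__main__":
    cluster_env = {ENV_VAR: "2"}
    # Collection created without an explicit factor: the global value applies.
    print(effective_replication_factor(env=cluster_env))
    # Collection created with an explicit factor: the explicit value applies.
    print(effective_replication_factor(explicit_factor=3, env=cluster_env))
```

In a real deployment the variable is set in the environment of each Weaviate node, and the explicit value comes from the replication settings passed when the collection is created.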
I will work on adding this to our docs.
Thanks a lot for asking this
I really appreciate the clarification about REPLICATION_MINIMUM_FACTOR. I do have a follow-up question: if I set this value to 2 and then add a new Weaviate node to the cluster, will the existing collections automatically replicate to the new node?
hi @Zhenghua_Liu !!
It will not. The setting applies only to new collections or tenants.
This feature is on our roadmap, and we call it dynamic scaling. With it, you will be able to move shards around, which helps in this kind of scenario or when you want, for example, to drain a node.
Let me know if this helps!
Thanks!