Hi there, I have questions related to two topics, both motivated by needing to significantly increase query throughput.
Description
1. Increase number of shards
(Based on the advice given here)
I did not set the number of shards when creating my collection, so I currently have just 1 shard. The documentation, however, says the default number of shards is 128.
Is 128 shards the recommended number to start with?
Can I update the number of shards on a collection that already exists?
2. Update HNSW vector index parameters
I am using an HNSW index. I would like to decrease dynamicEfMax and increase efConstruction. My understanding is that these changes would make queries faster but imports slower, a trade-off I can make.
How do I update dynamicEfMax and efConstruction for an existing collection?
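For reference, here is a minimal sketch of where these two parameters sit in a collection definition, and of how dynamicEfMax caps the search width per query. It assumes the documented behaviour that, when ef is -1, the effective ef is roughly limit * dynamicEfFactor clamped between dynamicEfMin and dynamicEfMax. The collection name and all numeric values are hypothetical. As far as I know, the dynamicEf* settings can be changed on an existing collection, while efConstruction is fixed at creation time (changing it would mean recreating the collection and re-importing); please confirm this against the docs for your version.

```python
import json

# Hypothetical values for illustration only; tune against your own recall/latency tests.
vector_index_config = {
    "efConstruction": 256,  # build-time search width: higher means slower imports
    "ef": -1,               # -1 asks Weaviate to derive ef per query from dynamicEf*
    "dynamicEfMin": 100,
    "dynamicEfMax": 200,    # lower cap means less work per query at large limits
    "dynamicEfFactor": 8,
}


def effective_ef(limit, cfg):
    """Approximate the per-query ef: limit * factor, clamped to [min, max]."""
    if cfg["ef"] != -1:
        return cfg["ef"]  # a fixed ef overrides the dynamic settings
    scaled = limit * cfg["dynamicEfFactor"]
    return max(cfg["dynamicEfMin"], min(cfg["dynamicEfMax"], scaled))


# Shape of the schema fragment that carries these settings.
collection_definition = {
    "class": "Article",  # hypothetical collection name
    "vectorIndexType": "hnsw",
    "vectorIndexConfig": vector_index_config,
}

print(json.dumps(collection_definition, indent=2))
for limit in (5, 20, 100):
    print(f"limit={limit} -> ef={effective_ef(limit, vector_index_config)}")
```

With these example values, only queries with a large limit are affected by lowering dynamicEfMax; small limits are governed by dynamicEfMin instead.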
Thanks @DudaNogueira . Do you think it’s worth experimenting with desiredCount in the shard config? If so, what values would you try? I need to stay on 1 node for now.
If it helps, here is additional context: I have ~8M vectors, each with 768 dimensions. The node has 24 virtual CPUs and 120 GB RAM.
If you can only use 1 node, I don’t believe having more shards on that same node will do any good for performance.
Quite the opposite, as Weaviate will need to check multiple shards for each query on the same node.
If, by any chance, you are running a multi-tenant environment, then it could help. In fact, each tenant is a “shard” of its own. The upside of using tenants is that you can enable and disable them.
On the topic of multi-tenancy, 1.25 will bring some really nice features, such as tenant TTL: a tenant can be automatically deactivated after some time of inactivity, and activated again when a new event comes its way while it is in a COLD/DEACTIVATED state.
To increase QPS, a multi-node setup is something to consider for your use case.
Other than that, it is about closely monitoring resource consumption, and making sure it has enough room to operate.
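As a rough sanity check on headroom, here is a back-of-the-envelope estimate using the numbers from this thread (~8M vectors, 768 dimensions, 120 GB RAM). It assumes uncompressed float32 vectors (4 bytes per dimension) and uses a 2x multiplier as a rule-of-thumb assumption for index and runtime overhead; actual usage depends on maxConnections, object payloads, and caching, so treat it as a sketch rather than a measurement.

```python
def raw_vector_bytes(num_vectors, dimensions, bytes_per_value=4):
    """Bytes needed to hold the raw vectors, assuming float32 by default."""
    return num_vectors * dimensions * bytes_per_value


NUM_VECTORS = 8_000_000  # "~8m vectors" from the post above
DIMENSIONS = 768
NODE_RAM_GB = 120
OVERHEAD_FACTOR = 2      # assumption: rule-of-thumb multiplier, not a measured value

raw_gb = raw_vector_bytes(NUM_VECTORS, DIMENSIONS) / 1e9
estimated_gb = raw_gb * OVERHEAD_FACTOR
headroom_gb = NODE_RAM_GB - estimated_gb

print(f"raw vectors:      {raw_gb:.1f} GB")
print(f"estimated memory: {estimated_gb:.1f} GB")
print(f"headroom on node: {headroom_gb:.1f} GB")
```

Under these assumptions the raw vectors come to about 24.6 GB, so memory is unlikely to be the first bottleneck on this node; CPU saturation under concurrent queries is the more likely thing to watch.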