Increase number of shards and update HNSW vector index parameters

Hi there, I have questions related to two topics, both motivated by needing to significantly increase query throughput.


1. Increase number of shards

(Based on the advice given here)
I did not set the number of shards when creating my collection, so I currently have just 1 shard. The documentation, though, says the default number of shards is 128.
Is 128 shards the recommended number to start with?
Can I update the number of shards on a collection that already exists?

2. Update HNSW vector index parameters

I am using an HNSW index. I would like to decrease dynamicEfMax and increase efConstruction. My understanding is that these changes would make queries faster but imports slower, a trade-off I can make.

How do I update dynamicEfMax and efConstruction for an existing collection?

Server Setup Information

  • Weaviate Server Version: 1.24
  • Deployment Method: Docker
  • Multi Node? Number of Running Nodes: 1 node
  • Client Language and Version: Python v4

hi @cpwalker !

The 128 value is the default for virtualPerPhysical, i.e. the number of virtual shards per physical shard, not the number of physical shards.

efConstruction is not a mutable configuration, as per the docs on mutability.

So for that one you will need to reindex your collections.

dynamicEfMax, on the other hand, is mutable.
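Before lowering dynamicEfMax, it may help to check whether it is actually the limiting factor for your queries. Here is a minimal sketch of how the dynamic ef value is derived when ef is left at -1, based on my reading of the documented behaviour. The function name is made up for illustration, and the defaults (100, 500, 8) are the documented defaults for dynamicEfMin, dynamicEfMax and dynamicEfFactor as I understand them, so please verify them against your own collection config.

```python
def dynamic_ef(limit, ef_min=100, ef_max=500, ef_factor=8):
    """Approximate the ef used for a query when ef is set to -1 (dynamic).

    Hypothetical helper for illustration only, not part of any client library.
    """
    # Start from the query limit multiplied by dynamicEfFactor.
    ef = limit * ef_factor
    # Cap at dynamicEfMax, or raise to dynamicEfMin.
    if ef > ef_max:
        ef = ef_max
    elif ef < ef_min:
        ef = ef_min
    # ef never drops below the limit, otherwise results would be cut off.
    if limit > ef:
        ef = limit
    return ef


for limit in (10, 50, 100, 1000):
    print(limit, dynamic_ef(limit))
```

Under these assumptions, a query with limit 10 already runs at ef = 100 (the lower bound), so lowering dynamicEfMax would only change anything for queries whose limit times the factor exceeds the current maximum.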

If you want to increase query throughput, your best path is leveraging replication and using a multi-node deployment.

You can read more on that here:

Let me know if this helps.


Thanks @DudaNogueira . Do you think it’s worth experimenting with desiredCount in the shard config? If so, what values would you try? I need to stay on 1 node for now.

If it helps, here is additional context: I have ~8m vectors, each with 768 dimensions. The node has 24 virtual CPUs and 120GB RAM.

If you can only use 1 node, I don’t believe having more shards on that same node will do any good for performance.

Quite the opposite, as Weaviate will need to check multiple shards on the same node for each query. :thinking:
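To illustrate the point about checking multiple shards, here is a toy scatter-gather sketch in plain Python. It is not Weaviate's implementation: it uses brute-force squared L2 distance instead of HNSW, and the function names and data layout are made up. It only shows that every shard has to be searched for its own top k before the partial results are merged.

```python
import heapq


def search_shard(shard, query, k):
    """Return the k closest (distance, id) pairs within one shard."""
    scored = [
        (sum((a - b) ** 2 for a, b in zip(vec, query)), obj_id)
        for obj_id, vec in shard
    ]
    return heapq.nsmallest(k, scored)


def search_collection(shards, query, k):
    """Query every shard, then merge the per-shard results into a global top k."""
    partials = [search_shard(shard, query, k) for shard in shards]
    return heapq.nsmallest(k, (hit for partial in partials for hit in partial))


# Two shards on the same node: both must be searched for a single query.
shards = [
    [("a", [0, 0]), ("b", [3, 4])],
    [("c", [1, 0])],
]
print(search_collection(shards, [0, 0], k=2))
```

On a single node all of those per-shard searches compete for the same CPUs, so extra shards add merge work without adding hardware.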

If, by any chance, you are running a multi-tenant environment, then it could help. In fact, each tenant is a “shard” of its own. The upside of using tenants is that you can enable and disable them individually.

On the topic of multi-tenancy, 1.25 will bring some really nice features, like tenant TTL: a tenant can be automatically deactivated after some time of inactivity, and activated again when a new event comes its way while it is in a COLD/DEACTIVATED state.

To increase QPS, multi-node is something to consider for your use case.

Other than that, it is about closely monitoring resource consumption, and making sure it has enough room to operate.
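As a starting point for that monitoring, here is a rough back-of-the-envelope sketch of the raw vector footprint for the numbers you gave (~8m vectors, 768 dimensions). It assumes uncompressed float32 vectors (4 bytes per dimension) and ignores the HNSW graph, the stored objects and any other overhead, so treat it as a lower bound rather than a sizing formula.

```python
def raw_vector_bytes(n_vectors, dims, bytes_per_dim=4):
    """Bytes needed to hold the raw vectors only (float32 assumed)."""
    return n_vectors * dims * bytes_per_dim


raw = raw_vector_bytes(8_000_000, 768)
print(f"{raw / 1024**3:.1f} GiB")  # prints "22.9 GiB"
```

That leaves a fair amount of headroom on a 120GB node, but actual usage will be higher than this figure, which is why watching real consumption matters.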

Thank you, this is very helpful.
