Description
Hi, I have several questions about how the vector cache works in Weaviate.
We have a Weaviate cluster of about 50 million objects, and startup can take up to 30 minutes when we set HNSW_STARTUP_WAIT_FOR_VECTOR_CACHE=true. However, according to the logs, the shards load in 30 to 60 seconds, so as far as I understand, loading the vector cache accounts for most of those 30 minutes. Why is it so slow? Is there any way to speed up cache loading? Without the cache, requests are slower (at least initially, while the cache is loading), which makes the service barely usable, but after a certain point requests seem to get faster even though the cache is not fully loaded. Is this the expected behavior?
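For a rough sense of scale, the figures above can be turned into an approximate cache fill rate. This is only a back-of-the-envelope sketch: it assumes one vector per object, an even spread of objects across the 3 nodes, no replication (the replication factor is not stated here), and that the full 30 minutes is spent filling the cache. The function name is made up for illustration.

```python
def vectors_per_second(total_vectors: int, load_seconds: float, nodes: int = 1) -> float:
    """Average number of vectors loaded per second, per node."""
    return total_vectors / load_seconds / nodes


# Figures from the post: ~50 million objects, ~30 minutes, 3 nodes.
cluster_rate = vectors_per_second(50_000_000, 30 * 60)
per_node_rate = vectors_per_second(50_000_000, 30 * 60, nodes=3)

print(f"cluster-wide: ~{cluster_rate:,.0f} vectors/s")
print(f"per node:     ~{per_node_rate:,.0f} vectors/s")
```

Under those assumptions this works out to roughly 27,800 vectors per second across the cluster, or about 9,300 per second per node.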
Server Setup Information
- Weaviate Server Version: 1.31.0
- Deployment Method: k8s
- Multi Node?: 3
- Client Language and Version: NA
- Multitenancy?: No
Any additional Information
hi @jonathlela !!
Welcome to our community 
Indeed, HNSW_STARTUP_WAIT_FOR_VECTOR_CACHE makes the loading process synchronous, so the Weaviate instance will probably take longer to become ready.
One env var you can check is DISABLE_LAZY_LOAD_SHARDS. It is false by default, so shards should be lazy loaded at startup.
Our team is already working on ways to reduce this shard load time by leveraging snapshots. I don’t have an ETA, but it should be soon.
Let me know if this helps!
Hi @DudaNogueira
Yes, we already disable lazy loading of shards, because with it enabled our first requests time out.
I’ve also tried using snapshots, but that doesn’t change the loading time. When HNSW_STARTUP_WAIT_FOR_VECTOR_CACHE is false, snapshots do improve the shard loading time (less than a minute), but the vector cache still takes 30 minutes to load. Snapshots don’t seem to help with vector cache loading.
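In case it helps anyone compare these settings, here is a minimal sketch of how the time to readiness could be measured after a restart. It polls Weaviate's readiness endpoint (/v1/.well-known/ready) and reports the elapsed time. The helper names, the localhost:8080 address, and the polling interval are assumptions for illustration, and whether readiness actually waits for the vector cache depends on HNSW_STARTUP_WAIT_FOR_VECTOR_CACHE as discussed above.

```python
import time
import urllib.request
from typing import Callable


def http_ready(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with HTTP 200, False on any error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers URLError, HTTPError, and socket timeouts
        return False


def seconds_until_ready(
    probe: Callable[[], bool],
    interval: float = 1.0,
    max_wait: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Call probe() until it returns True; return the elapsed seconds.

    Raises TimeoutError if probe() is still False after max_wait seconds.
    """
    start = time.monotonic()
    while True:
        if probe():
            return time.monotonic() - start
        if time.monotonic() - start >= max_wait:
            raise TimeoutError(f"not ready after {max_wait} seconds")
        sleep(interval)


# Example usage (address is an assumption, adjust to your deployment):
# url = "http://localhost:8080/v1/.well-known/ready"
# elapsed = seconds_until_ready(lambda: http_ready(url), interval=5.0)
# print(f"ready after {elapsed:.0f} s")
```

Running this once per configuration (with and without snapshots, with HNSW_STARTUP_WAIT_FOR_VECTOR_CACHE set to true or false) gives comparable numbers for each node.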