How does vector cache work?

jonathlela · July 10, 2025, 8:49am

Description

Hi, I have several questions about how the vector cache works in Weaviate.
We have a Weaviate cluster of about 50 million objects, and startup can take up to 30 minutes when we set HNSW_STARTUP_WAIT_FOR_VECTOR_CACHE=true. However, according to the logs, the shards themselves load in 30 to 60 seconds. From what I understand, loading the vector cache accounts for most of those 30 minutes. Why is it so slow? Is there any way to speed up cache loading? Without the cache, requests are slower (at least initially, while the cache is loading), which makes the service barely usable, but after a certain point requests seem to get faster even though the cache is not fully loaded. Is this the expected behavior?
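
For a rough sense of scale, the figures above imply an average cache prefill rate. This is only back-of-the-envelope arithmetic from "about 50 million objects" and "up to 30 minutes"; the even split across the 3 nodes (and no replication) is an assumption, and `prefill_rate` is just a helper name for this sketch:

```python
def prefill_rate(total_objects: int, minutes: float, nodes: int = 1) -> float:
    """Average vectors loaded per second per node.

    Assumes objects are split evenly across `nodes` and that all nodes
    prefill their caches in parallel for the whole duration.
    """
    if minutes <= 0 or nodes <= 0:
        raise ValueError("minutes and nodes must be positive")
    return total_objects / nodes / (minutes * 60)


# Figures from the post: ~50 million objects, ~30 minutes, 3 nodes.
cluster_rate = prefill_rate(50_000_000, 30)
per_node_rate = prefill_rate(50_000_000, 30, nodes=3)

print(f"{cluster_rate:,.0f} vectors/s cluster-wide")  # 27,778 vectors/s cluster-wide
print(f"{per_node_rate:,.0f} vectors/s per node")     # 9,259 vectors/s per node
```

If each node really loads on the order of 10k vectors per second, it may be worth checking disk read throughput on the nodes during startup, since that is one plausible bottleneck.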

Server Setup Information

Weaviate Server Version: 1.31.0
Deployment Method: k8s
Multi Node?: 3
Client Language and Version: NA
Multitenancy?: No

Any additional Information

DudaNogueira · July 10, 2025, 2:48pm

hi @jonathlela !!

Welcome to our community

Indeed, HNSW_STARTUP_WAIT_FOR_VECTOR_CACHE makes the loading process synchronous, so it will probably take more time for the Weaviate instance to be ready.
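
To compare startup with and without this setting, you could time how long the node takes to report ready. Below is a minimal sketch that polls Weaviate's `/v1/.well-known/ready` endpoint. The base URL `http://localhost:8080`, the timeout, and the polling interval are assumptions to adapt to your deployment, and `http_probe` / `seconds_until_ready` are helper names made up for this example:

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Optional


def http_probe(base_url: str) -> bool:
    """Return True if GET {base_url}/v1/.well-known/ready answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/v1/.well-known/ready", timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status: not ready yet.
        return False


def seconds_until_ready(
    probe: Callable[[], bool],
    timeout_s: float = 3600.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[float]:
    """Poll `probe` until it returns True; return elapsed seconds, or None on timeout."""
    start = clock()
    while True:
        if probe():
            return clock() - start
        if clock() - start >= timeout_s:
            return None
        sleep(interval_s)


# Example usage (the URL is an assumption, adjust to your service):
# elapsed = seconds_until_ready(lambda: http_probe("http://localhost:8080"))
# print(f"ready after {elapsed:.0f}s" if elapsed is not None else "timed out")
```

Running this right after a pod restart, once with HNSW_STARTUP_WAIT_FOR_VECTOR_CACHE=true and once with it set to false, should show how much of the startup time is spent on the cache prefill rather than on shard loading.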

One env var that you can check is DISABLE_LAZY_LOAD_SHARDS. It is false by default, so the shards should be lazy loaded at startup.

Our team is already working on some ways to reduce this shard load by leveraging snapshots. I don’t have an ETA, but it should be soon.

Let me know if this helps!

jonathlela · July 10, 2025, 3:57pm

Hi @DudaNogueira

Yes, we already disable lazy loading of shards, because if we enable it our first requests time out.

I’ve also tried using snapshots, but it doesn’t change the total loading time. When HNSW_STARTUP_WAIT_FOR_VECTOR_CACHE is false, snapshots do improve the loading time of the shards (less than a minute), but the vector cache still takes 30 minutes to load. Snapshots don’t seem to help with vector cache loading.

