Description
I’ve noticed that after several hours of inactivity (no queries or imports), the first query against my collection takes a very long time, upwards of 30 seconds. Then the second and third queries are a bit faster. Then, by about the 10th or 15th query, the query time is back to where I expect it to be. What explains this behavior? Is there a cache setting that could be misconfigured?
Information about my collection:
- 8 million objects
- vectors are 768 dimensions
Config for my HNSW index:
vector_index_config=Configure.VectorIndex.hnsw(
    distance_metric=wc.VectorDistances.COSINE,
    ef=-1,
    dynamic_ef_min=25,
    dynamic_ef_max=125,
    ef_construction=512,
)
Note I did not set vectorCacheMaxObjects, so this parameter should have the default value of 1e12.
Server Setup Information
- Weaviate Server Version: 1.24
- Deployment Method: docker
- Multi Node? Number of Running Nodes: 1 node
- Client Language and Version: Python v4
- Amount of memory on my node: 120 GB, which should be plenty
Any additional Information
hi @cpwalker !
That’s interesting.
Do you have any memory, disk, cpu readings or logs from those events?
That could give us some leads.
I just ran a series of test queries while running top on the command line so I could watch memory usage. These were the first queries I ran after several hours of inactivity.
Before running any queries against my collection (baseline): 41.3 GB RAM used
available memory: 78 GB RAM
First query attempt:
used memory rises to 55 GB RAM. ~30 seconds pass, then I get this error:
WeaviateQueryError: Query call with protocol GRPC search failed with message Deadline Exceeded.
Second query attempt:
used memory rises to 66 GB RAM. ~30 seconds pass again, then I get the same error:
WeaviateQueryError: Query call with protocol GRPC search failed with message Deadline Exceeded.
Third query attempt:
used memory rises to 83.8 GB RAM.
The query successfully executes in ~4.25 seconds.
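For anyone who wants to repeat this kind of measurement in a more repeatable way, here is a minimal sketch of a timing helper. It only uses the standard library; `run_query` is a hypothetical callable that you would replace with your own search call, and `fake_query` is a stand-in that fails twice and then succeeds, just to show the shape of the output.

```python
import time
from typing import Callable, List, Tuple


def time_queries(
    run_query: Callable[[], object], attempts: int = 3
) -> List[Tuple[int, float, str]]:
    """Run `run_query` several times and record (attempt, seconds, outcome)."""
    results = []
    for attempt in range(1, attempts + 1):
        start = time.perf_counter()
        try:
            run_query()
            outcome = "ok"
        except Exception as exc:  # e.g. a client-side deadline error
            outcome = f"error: {type(exc).__name__}"
        elapsed = time.perf_counter() - start
        results.append((attempt, elapsed, outcome))
    return results


# Hypothetical stand-in for a real search call: fails twice, then succeeds.
_calls = {"n": 0}


def fake_query() -> None:
    _calls["n"] += 1
    if _calls["n"] <= 2:
        raise TimeoutError("Deadline Exceeded")


report = time_queries(fake_query, attempts=3)
for attempt, seconds, outcome in report:
    print(f"attempt {attempt}: {seconds:.3f}s {outcome}")
```

Logging the memory reading from top next to each attempt gives the same kind of table as above.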
For the benefit of others, here are two things I’ve tried recently; I’ll see whether they work:
- Increased the timeout on queries.
- Disabled swap in Linux.
Hello, did you find a fix for this?
hi @Sulaiman_Mutawalli !!
Welcome to our community 
You’re experiencing the vector cache warm-up behavior in Weaviate’s HNSW index implementation. After hours of inactivity, the vector cache becomes cold, and the first queries must load vectors from disk into memory, causing the 30-second delay reported.
What’s Happening
When Weaviate starts or after inactivity, the HNSW index performs cache prefilling to load frequently-accessed vectors into memory.
So during initial queries:
- Query 1: Cache is empty, vectors load from disk (~30 seconds)
- Queries 2-10: Cache partially filled, some disk reads still needed
- Query 15+: Most frequently-accessed vectors are cached, performance normalizes
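The pattern in the list above can be illustrated with a toy model. This is only a sketch under made-up assumptions (a small "hot" set of vectors that gets 90% of the reads, a fixed number of reads per query); it is not how Weaviate's cache is implemented, and all names and numbers are hypothetical.

```python
import random
from typing import List


def simulate_warmup(
    num_vectors: int = 10_000,
    hot_fraction: float = 0.05,
    reads_per_query: int = 200,
    queries: int = 15,
    seed: int = 0,
) -> List[int]:
    """Return the number of cache misses (simulated disk reads) per query."""
    rng = random.Random(seed)
    hot = int(num_vectors * hot_fraction)  # frequently visited vectors
    cache = set()  # starts cold, as after a long idle period
    misses_per_query = []
    for _ in range(queries):
        misses = 0
        for _ in range(reads_per_query):
            # Assumption: 90% of reads touch the hot set, 10% any vector.
            if rng.random() < 0.9:
                vector_id = rng.randrange(hot)
            else:
                vector_id = rng.randrange(num_vectors)
            if vector_id not in cache:
                misses += 1  # would be a disk read
                cache.add(vector_id)
        misses_per_query.append(misses)
    return misses_per_query


misses = simulate_warmup()
for i, m in enumerate(misses, start=1):
    print(f"query {i}: {m} simulated disk reads")
```

In this model the first query pays for almost every read, and the count drops quickly once the hot set is cached, which matches the shape of the timings reported in this thread.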
Solution
Set vectorCacheMaxObjects to a realistic value based on your available RAM. For example, in OP’s scenario, roughly 8 GB could be dedicated to the vector cache:
vector_index_config=Configure.VectorIndex.hnsw(
    distance_metric=wc.VectorDistances.COSINE,
    ef=-1,
    dynamic_ef_min=25,
    dynamic_ef_max=125,
    ef_construction=512,
    vector_cache_max_objects=2_500_000,  # ~7.7 GB of raw float32 data for 768-dim vectors
)
This limits cache size while still covering ~31% of your dataset. The HNSW algorithm naturally accesses higher-layer nodes more frequently, so caching the most-accessed vectors provides good performance.
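To sanity-check those numbers, here is a quick back-of-the-envelope calculation. It assumes vectors are stored as 4-byte float32 values and ignores any per-object overhead, so treat the result as a lower bound rather than an exact figure; `vector_cache_bytes` is just a helper name for this example.

```python
def vector_cache_bytes(
    num_objects: int, dimensions: int, bytes_per_value: int = 4
) -> int:
    """Raw size of the vectors alone, assuming float32 and no overhead."""
    return num_objects * dimensions * bytes_per_value


GB = 10**9

cached = vector_cache_bytes(2_500_000, 768)  # the suggested cache limit
full = vector_cache_bytes(8_000_000, 768)    # the whole 8M-object collection
coverage = 2_500_000 / 8_000_000

print(f"cache limit: {cached / GB:.2f} GB")
print(f"all vectors: {full / GB:.2f} GB")
print(f"coverage:    {coverage:.1%}")
```

The same function can be used to pick a different limit: divide the RAM you want to spend by `dimensions * 4` to get an object count.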
Let me know if this helps!
Thanks, here is what fixed my issue
I was using v1.29. This morning I upgraded to v1.34, and I am no longer seeing the issue; query times are consistent.