Hi Guys,
This is more a documentation question than a problem.
I want to know how Weaviate balances memory and disk for the vector index. The docs mention several points but not in enough detail to understand exactly what is possible.
Imagine the following setup (sketched in code after the list):
- We have a schema/class with a single (non-vectorized) property - itemId
- We supply the vector for each object ourselves
- No transformer/vectorizer is used
- We are running on a single Weaviate node/shard
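To make this setup concrete, here is a minimal sketch of such a class using the v3-style Python client (the class name Item and the connection URL are just illustrative):

```python
import weaviate

# Connect to a local, single-node Weaviate instance (URL is illustrative)
client = weaviate.Client("http://localhost:8080")

# A class with a single non-vectorized property; we bring our own vectors,
# so no vectorizer module is configured.
item_class = {
    "class": "Item",  # hypothetical class name
    "vectorizer": "none",
    "properties": [
        {"name": "itemId", "dataType": ["text"]},
    ],
}

client.schema.create_class(item_class)
```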
As I understand it:
- When we insert a new object (itemId + vector), as sketched in the code below this list:
  - Weaviate writes to the WAL on DISK
  - Weaviate inserts the vector into the index in RAM
  - Weaviate returns success
- So if an insert is successful, the data is safe no matter what happens next, as long as the DISK survives
- When Weaviate starts, it checks its WAL on DISK and reconstructs the index in RAM before accepting requests
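For reference, such an insert might look roughly like this with the v3-style Python client (the class name, itemId value, and the toy vector are illustrative):

```python
import weaviate

client = weaviate.Client("http://localhost:8080")  # URL is illustrative

# Insert a single object together with its pre-computed vector.
# A real vector would of course match your embedding dimensionality.
client.data_object.create(
    data_object={"itemId": "item-0001"},
    class_name="Item",
    vector=[0.12, 0.34, 0.56, 0.78],
)
```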
Questions:
- In order to run e.g. Nearest-Neighbour queries, must the entire index reside in RAM or is some form of DISK paging possible?
- If DISK paging is possible
- How efficient is it compared with a RAM-only index? That is, how much does performance suffer?
- Does it still work when running Nearest-Neighbour searches on multiple schemas/classes simultaneously?
- If DISK paging is not possible and RAM is restricted, are there strategies for managing an index that is too large for RAM?
Thanks for your help,
Greg
Asking a question tends to make answers bubble to the surface…
I found this:
HNSW is a very fast, memory-efficient approach to similarity search. The memory cache only stores the highest layer instead of storing all of the data objects in the lowest layer. When the search moves from a higher layer to a lower one, HNSW only adds the data objects that are closest to the search query. This means HNSW uses a relatively small amount of memory compared to other search algorithms.
So as I understand it, DISK paging is not only possible, it is normal.
But as a followup question: imagine an enormous index and small RAM (or alternatively thousands of concurrently stored schemas), in this situation we could have 99.999… % of data on DISK and a tiny proportion in RAM. Would this be considered normal, or is there a recommended minimum percentage to keep in RAM for practical performance?
Please bear in mind that I’m not just asking for myself. If I’m confused, there are probably hundreds like me out there who would benefit from a clarification, a little hand-holding for us non-data-scientists, if you will.
Thanks,
Greg
More answer bubbling:
During import set vectorCacheMaxObjects high enough that all vectors can be held in memory. Each import requires multiple searches. Import performance drops drastically when there isn’t enough memory to hold all of the vectors in the cache.
So does this mean using DISK for lower layers is strongly discouraged because of the performance drop?
We have multi-terabyte data sets - there’s no way we can justify the cost of that much RAM.
What am I missing here?
Thanks,
Greg
Hi Greg!
Thanks for those questions and for sharing your journey!
So yes, by default Weaviate keeps the data in memory. This is because vectorCacheMaxObjects is set to trillions by default.
If you reduce this setting to a number smaller than your total number of objects, some objects will live on DISK and some in RAM.
For this amount of data, Product Quantization and tuning vectorCacheMaxObjects will help, along, of course, with having enough disk and memory and finding the right balance between speed and memory.
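As a rough sketch of where these knobs live (using the v3-style Python client; the class name and the numbers are purely illustrative, not recommendations, and PQ requires a Weaviate version that supports it):

```python
import weaviate

client = weaviate.Client("http://localhost:8080")  # URL is illustrative

# Cap how many vectors Weaviate keeps in the in-memory vector cache;
# vectors beyond this count are read from disk when needed.
client.schema.update_config("Item", {
    "vectorIndexConfig": {
        "vectorCacheMaxObjects": 1_000_000,  # example value only
    },
})

# Enable Product Quantization (compressed vectors) once enough objects
# have been imported to train the quantizer; values are illustrative.
client.schema.update_config("Item", {
    "vectorIndexConfig": {
        "pq": {
            "enabled": True,
            "trainingLimit": 100_000,
        },
    },
})
```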
Let me know if this helps. We would love to better understand your use case.
Feel free to ping us in our Slack channels.
Thanks!