We want to tune our Weaviate cluster for the best performance. One of the Weaviate environment variables is PERSISTENCE_HNSW_MAX_LOG_SIZE,
and we want to know how to set it for our vector dimension (512).
The official documentation gives this explanation:
Database parameters for HNSW
Note that some database-level parameters are available to configure HNSW indexing behavior.
PERSISTENCE_HNSW_MAX_LOG_SIZE is a database-level parameter that sets the maximum size of the HNSW write-ahead-log. The default value is 500MiB.
Increase this value to improve efficiency of the compaction process, but be aware that this will increase the memory usage of the database. Conversely, decreasing this value will reduce memory usage but may slow down the compaction process.
Preferably, the PERSISTENCE_HNSW_MAX_LOG_SIZE should be set to a value close to the size of the HNSW graph.
So here are some questions:
- What does "size of the HNSW graph" actually mean?
- How can we estimate it for about 47 million vectors of dimension 512?
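
For reference, here is the rough back-of-the-envelope estimate we have tried so far. It is only a sketch under assumptions we have not confirmed: that the graph stores neighbor IDs rather than the vectors themselves (so the dimension 512 would not enter the formula), that layer 0 allows up to 2 * maxConnections links per node and the upper layers are small enough to ignore, and that each link costs about 8 bytes. The function name, `max_connections=32`, `bytes_per_link=8`, and `fill_factor` are our own placeholders, not values taken from the documentation.

```python
def estimate_hnsw_graph_bytes(num_vectors, max_connections=32,
                              bytes_per_link=8, fill_factor=1.0):
    """Rough upper-bound estimate of the HNSW graph size in bytes.

    Assumptions (not confirmed by the docs quoted above):
    - only neighbor links are counted, not the vectors themselves;
    - layer 0 holds up to 2 * max_connections links per node;
    - upper layers contain few nodes and are ignored;
    - each link costs bytes_per_link bytes.
    fill_factor (0..1) scales the result if nodes are not fully connected.
    """
    links_per_node = 2 * max_connections * fill_factor
    return int(num_vectors * links_per_node * bytes_per_link)


size_bytes = estimate_hnsw_graph_bytes(47_000_000)
print(f"{size_bytes / 2**30:.1f} GiB")
```

With these placeholder values the estimate for 47 million vectors comes out in the tens of GiB, far above the 500MiB default, which is why we would like to confirm whether this is the right way to read "size of the HNSW graph".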