Because Weaviate's search performance drops when product quantization (PQ) is enabled, I need to find HNSW index parameters that keep the accuracy loss minimal while giving higher performance than uncompressed HNSW with the current (unmodified) parameters.
- Weaviate Server Version: 1.26.0
- Deployment Method: Docker-compose
- Standalone
Current HNSW parameters for two named vectors:
Vector A:
ef = 320, efConstruction = 320, maxConnections = 100
Vector length = 768
Vector B:
ef = 480, efConstruction = 480, maxConnections = 120
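For reference, here is a minimal sketch of how the settings above might be written as a named-vector index config. The vector names `vector_a` and `vector_b` and the helper `hnsw_config` are hypothetical; the keys (`ef`, `efConstruction`, `maxConnections`, and `pq` with `enabled`, `segments`, `trainingLimit`) are the ones I understand the Weaviate schema to use, so please check them against the docs for 1.26.

```python
import json


def hnsw_config(ef, ef_construction, max_connections,
                pq_segments=None, training_limit=100_000):
    """Build a vectorIndexConfig dict for one named vector (hypothetical helper)."""
    config = {
        "ef": ef,
        "efConstruction": ef_construction,
        "maxConnections": max_connections,
    }
    # Only add the PQ section when compression is requested.
    if pq_segments is not None:
        config["pq"] = {
            "enabled": True,
            "segments": pq_segments,
            "trainingLimit": training_limit,
        }
    return config


# Vector names are placeholders; the values are the ones from this post.
vector_config = {
    "vector_a": {
        "vectorIndexType": "hnsw",
        "vectorIndexConfig": hnsw_config(320, 320, 100, pq_segments=6),
    },
    "vector_b": {
        "vectorIndexType": "hnsw",
        "vectorIndexConfig": hnsw_config(480, 480, 120, pq_segments=6),
    },
}

print(json.dumps(vector_config, indent=2))
```

Dropping the `pq_segments` argument gives the uncompressed baseline config used for comparison.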
Average search QPS without PQ: 145 obj/s
Average search QPS with PQ (segments = 6): 75 obj/s. Other segments values do not give a higher QPS, and often give a lower one. The test dataset contained 5 million objects, and the PQ training size was 100,000 to 150,000. The collection is expected to store 20+ million vectors.
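To put segments = 6 in perspective, here is a rough back-of-the-envelope calculation for Vector A. It assumes float32 storage for the raw vectors and one byte per PQ segment (that is, 256 centroids per segment, which I believe is the default), and it ignores the memory used by the HNSW graph links themselves.

```python
DIMENSIONS = 768          # Vector A length, from the post
BYTES_PER_FLOAT32 = 4     # assumption: raw vectors are stored as float32
PQ_SEGMENTS = 6           # value used in the test
BYTES_PER_SEGMENT = 1     # assumption: 256 centroids, so one byte per segment code


def raw_vector_bytes(num_vectors):
    """Memory for uncompressed vectors only (no graph links)."""
    return num_vectors * DIMENSIONS * BYTES_PER_FLOAT32


def pq_vector_bytes(num_vectors):
    """Memory for PQ codes only (no codebook, no graph links)."""
    return num_vectors * PQ_SEGMENTS * BYTES_PER_SEGMENT


dims_per_segment = DIMENSIONS // PQ_SEGMENTS
compression_ratio = raw_vector_bytes(1) / pq_vector_bytes(1)

print(f"dimensions per segment: {dims_per_segment}")
print(f"compression ratio: {compression_ratio:.0f}x")
for n in (5_000_000, 20_000_000):
    print(f"{n:>11,} vectors: raw {raw_vector_bytes(n) / 1e9:.2f} GB, "
          f"PQ {pq_vector_bytes(n) / 1e9:.2f} GB")
```

Under these assumptions each segment has to summarize 128 dimensions in a single byte, which is a very coarse code. That may be relevant to the accuracy side of the question, though I have not confirmed it is related to the QPS drop.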
Why does vector compression degrade performance so much? I suspect the HNSW parameters need to be adjusted, but it is not yet clear how to balance them or how much they should change when PQ is enabled…