No change in vector size after turning Product Quantization on

Hello! I was curious to try out how Product Quantization works. To embed data I use the gtr-t5-large model, which creates 768-dimensional vectors. My database stores around 2k vectors.

My Python code to turn PQ on is the following:

client.schema.update_config(
    "Document",
    {
        "vectorIndexConfig": {
            "pq": {
                "enabled": True, 
                "trainingLimit": 100000, 
                "segments": 96
            }
        }
    },
)

where Document is my class name.
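
To confirm the change took effect, I also read the schema back (a quick sanity check; I'm assuming the v3 client's schema.get here):

# Read the class definition back and check that PQ shows as enabled
config = client.schema.get("Document")
print(config["vectorIndexConfig"]["pq"])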

I got the following confirmation in the Docker logs:

semantic_search_analysis-weaviate-1  | {"action":"lsm_compaction","class":"Document","index":"document","level":"warning","msg":"compaction halted due to shard READONLY status","path":"data/document/BPZRlHodv3mc/lsm","shard":"BPZRlHodv3mc","time":"2024-01-31T16:27:24Z"}
semantic_search_analysis-weaviate-1  | {"action":"compress","level":"info","msg":"switching to compressed vectors","time":"2024-01-31T16:27:24Z"}
semantic_search_analysis-weaviate-1  | {"action":"compress","level":"info","msg":"vector compression complete","time":"2024-01-31T16:27:26Z"}

After turning PQ on I can see a slight difference in the returned distances, but that’s all. Docker memory usage didn’t change, and the vector size is the same as it was before compression (768).

To find out the vector size, I use this code:

top_responses = (
    client.query.get("Document", properties=["content", "source"])
    .with_near_text({"concepts": ["How to use gpu?"]})
    .with_additional(["distance", "vector"])
    .with_limit(5)
    .do()
)

vector = top_responses["data"]["Get"]["Document"][0]["_additional"]["vector"]
print(len(vector))

My questions are:

  1. Shouldn’t I see a difference in the vectors’ shape? I assumed the original vectors would be replaced with their compressed versions, but I might be missing something.
  2. What changes am I supposed to observe with Product Quantization, and where should I look for them?

I would appreciate your support!

Hi @klem! That’s an interesting question.

I will need to check internally what exactly we can expect to happen.

For now, I know that you can monitor the Prometheus metrics for segment info.
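Something like this should surface them (assuming you started Weaviate with PROMETHEUS_MONITORING_ENABLED=true, which as far as I recall exposes the metrics on port 2112):

import requests

# Fetch the raw Prometheus metrics and grep for segment-related entries
metrics = requests.get("http://localhost:2112/metrics").text
for line in metrics.splitlines():
    if "segment" in line and not line.startswith("#"):
        print(line)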

I will ask internally and get back to you.

Thanks!

Hi @klem, it looks like you enabled product quantization correctly.

With Product Quantization enabled, Weaviate still stores the original vectors on disk. The quantized vectors are kept in memory and used for the distance calculations in the HNSW graph. Once the final candidates (in your case 5) are selected, Weaviate returns the full object data (fields such as content and the original vector) from disk. Because of this, things will appear to work as before; the only real user-facing change is that recall and distances shift slightly, since the vectors used during search are now smaller / quantized.
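
If it helps to see the mechanics, here is a toy sketch of the idea (my own illustration with numpy/scikit-learn, not Weaviate’s internals): each 768-dimensional vector is split into 96 segments of 8 dimensions, each segment is quantized to the nearest of 256 learned centroids, and distances computed on the quantized form drift slightly from the exact ones.

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
dims, segments, k = 768, 96, 256
seg = dims // segments                      # 8 dimensions per segment
data = rng.standard_normal((2000, dims)).astype(np.float32)

# One codebook per segment; each vector is then stored as 96 one-byte codes
codebooks = [
    KMeans(n_clusters=k, n_init=1, random_state=0).fit(data[:, i * seg:(i + 1) * seg])
    for i in range(segments)
]

def encode(v):
    # nearest centroid id per segment: 96 bytes instead of 768 floats
    return np.array(
        [cb.predict(v[i * seg:(i + 1) * seg].reshape(1, -1))[0]
         for i, cb in enumerate(codebooks)],
        dtype=np.uint8,
    )

def decode(codes):
    # swap each code back for its centroid to get an approximate vector
    return np.concatenate(
        [codebooks[i].cluster_centers_[c] for i, c in enumerate(codes)]
    )

q, v = data[0], data[1]
print(np.linalg.norm(q - v))                  # exact distance
print(np.linalg.norm(q - decode(encode(v))))  # approximate, slightly off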

As for why you do not see much memory reduction: with 2k vectors, the originals take up about 2000 * 768 * 4 bytes ≈ 6.1 MB in total. A drop that small is hard to spot with so few vectors, due to noise from garbage collection timing. With a large vector dataset you will be able to see the memory drop, particularly on the go_memstats_heap_inuse_bytes metric (exposed in Prometheus).
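
To put rough numbers on it (a back-of-the-envelope sketch, assuming float32 originals and the default of 256 centroids per segment, so each code fits in one byte):

n, dims, segments, centroids = 2000, 768, 96, 256
raw = n * dims * 4                                         # float32 originals: ~6.1 MB
codes = n * segments * 1                                   # one-byte PQ codes: ~0.19 MB
codebooks = segments * centroids * (dims // segments) * 4  # centroid tables:  ~0.79 MB
print(f"{raw / 1e6:.2f} MB raw -> {(codes + codebooks) / 1e6:.2f} MB quantized")

At 2k vectors the per-segment codebooks are also a sizeable share of the quantized footprint, which makes the relative savings even harder to spot at this scale.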


Cool, your response is very clear. Thank you!