Configuring PQ compression in a collection

Hi, I am using the Python client v4 to interact with Weaviate. I have configured PQ compression using the quantizer setting:

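import weaviate.classes as wvc

# Enable PQ on an existing collection, using the default quantizer settings
collection.config.update(
    vector_index_config=wvc.config.Reconfigure.VectorIndex.hnsw(
        quantizer=wvc.config.Reconfigure.VectorIndex.Quantizer.pq()
    )
)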

In the collection's configuration, I can see that the quantizer is set to PQ:

vector_index_config=_VectorIndexConfigHNSW(quantizer=_PQConfig(bit_compression=False, segments=0, centroids=256, training_limit=100000, encoder=_PQEncoderConfig(type_=<PQEncoderType.KMEANS: 'kmeans'>, distribution=<PQEncoderDistribution.LOG_NORMAL: 'log-normal'>))

Is PQ configured correctly, and what does the attribute ‘bit_compression’ correspond to?

Hi @Jegadeesh !

I also couldn’t find anything in the docs about bit_compression, so I asked internally.

Also, I’m not sure why segments is 0, but I believe that, according to here, after 1.23 Weaviate sets this parameter based on the dimensions of your vectors. So after you populate the collection, Weaviate should set it to suit your case.

I’ll get back to you when I get more information.

Thanks!

Hi @DudaNogueira !

I am trying to configure PQ compression in WCS (Sandbox version).
Could you tell me whether PQ compression is possible in the Sandbox version of Weaviate?

Hi @Jegadeesh !

You can use PQ in a sandbox. A sandbox is in fact the very same version you have in a paid cluster and the very same you get from Docker or the binary. It only has limited resources (memory, CPU, storage), so don’t expect too much from it :grimacing:

Regarding bit_compression, it’s no longer used and will be removed in future versions.

Let me know if this helps!

Thanks!

Is there any way we could configure the parameter ‘segments’ manually using the Python v4 client?

Hi @Jegadeesh!

Sure thing. Here is how:

# Set segments explicitly instead of letting Weaviate choose
collection.config.update(
    vector_index_config=wvc.config.Reconfigure.VectorIndex.hnsw(
        quantizer=wvc.config.Reconfigure.VectorIndex.Quantizer.pq(segments=1234)
    )
)

Note: your segments value needs to be an integer divisor of the total dimensions of your data.
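As a quick sanity check before picking a value, the divisor rule can be sketched in plain Python. The helper name `valid_segments` and the 1536-dimension example are assumptions made up for illustration; neither is part of the Weaviate client.

```python
def valid_segments(dimensions: int) -> list[int]:
    """Return every positive integer that divides `dimensions` evenly."""
    if dimensions <= 0:
        raise ValueError("dimensions must be a positive integer")
    return [s for s in range(1, dimensions + 1) if dimensions % s == 0]


# Hypothetical example: vectors with 1536 dimensions.
candidates = valid_segments(1536)
print(candidates)
```

Under that assumption, a value such as 256 or 768 would pass the check, while 1234 would not, so treat the 1234 in the snippet as a placeholder to swap for a divisor of your own vector dimensions.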

If we enable PQ manually, does Weaviate automatically define and update the ‘segments’ parameter based on our vector dimensions when it’s not set during PQ configuration?

Or do we need to set it manually while configuring PQ in WCS? Also, what happens if segments is 0, and is it essential to set ‘segments’ for PQ compression to work?

Thanks in advance!

If you specify a segment value, Weaviate will use it.

If you do not specify segments, AutoPQ will kick in and come up with the best segment configuration based on your data.

Check here for more info on how to manually use PQ: