8-bit RQ quantization is not enabled by default for 1.33.9

Charlie_Chen · December 10, 2025, 12:50am

Description

I’ve upgraded to Weaviate v1.33.9. According to the documentation:

Starting with v1.33, Weaviate enables 8-bit RQ quantization by default when creating new collections to ensure efficient resource utilization and faster performance. This behavior can be changed through the DEFAULT_QUANTIZATION environment variable. Note that once enabled, quantization can’t be disabled for a collection. Default quantization only applies for the HNSW vector index type.

However, after creating a new collection using the following code:

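from weaviate.classes.config import Configure

# No quantizer is passed here, so the server-side default decides.
await client.collections.create(
    name=index_name,
    properties=props,
    vector_index_config=Configure.VectorIndex.hnsw(
        ef_construction=64,
        ef=-1,
        vector_cache_max_objects=app_settings.WEAVIATE_VECTOR_CACHE_MAX_OBJECTS,
    ),
)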

I see that RQ is not enabled.

{
  "class": "PtAi_rag_3a010d31_750d_5dad_5f3c_8665f3ec8ac0_knowledge_6938bd5bd6837da2e99904a9",
  "invertedIndexConfig": {
    "bm25": {
      "b": 0.75,
      "k1": 1.2
    },
    "cleanupIntervalSeconds": 60,
    "stopwords": {
      "additions": null,
      "preset": "en",
      "removals": null
    },
    "usingBlockMaxWAND": true
  },
  "multiTenancyConfig": {
    "autoTenantActivation": false,
    "autoTenantCreation": false,
    "enabled": false
  },
  "properties": [... ],
  "replicationConfig": {
    "asyncEnabled": false,
    "deletionStrategy": "NoAutomatedResolution",
    "factor": 1
  },
  "shardingConfig": {
    "actualCount": 1,
    "actualVirtualCount": 128,
    "desiredCount": 1,
    "desiredVirtualCount": 128,
    "function": "murmur3",
    "key": "_id",
    "strategy": "hash",
    "virtualPerPhysical": 128
  },
  "vectorIndexConfig": {
    "bq": {
      "enabled": false
    },
    "cleanupIntervalSeconds": 300,
    "distance": "cosine",
    "dynamicEfFactor": 8,
    "dynamicEfMax": 500,
    "dynamicEfMin": 100,
    "ef": -1,
    "efConstruction": 64,
    "filterStrategy": "sweeping",
    "flatSearchCutoff": 40000,
    "maxConnections": 32,
    "multivector": {
      "aggregation": "maxSim",
      "enabled": false,
      "muvera": {
        "dprojections": 16,
        "enabled": false,
        "ksim": 4,
        "repetitions": 10
      }
    },
    "pq": {
      "bitCompression": false,
      "centroids": 256,
      "enabled": false,
      "encoder": {
        "distribution": "log-normal",
        "type": "kmeans"
      },
      "segments": 0,
      "trainingLimit": 100000
    },
    "rq": {
      "bits": 8,
      "enabled": false,
      "rescoreLimit": 20
    },
    "skip": false,
    "skipDefaultQuantization": false,
    "sq": {
      "enabled": false,
      "rescoreLimit": 20,
      "trainingLimit": 100000
    },
    "trackDefaultQuantization": false,
    "vectorCacheMaxObjects": 20000
  },
  "vectorIndexType": "hnsw",
  "vectorizer": "none"
}
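
A quick way to read a schema dump like the one above is to look at the `enabled` flag of each quantizer block under `vectorIndexConfig`. The snippet below is a minimal sketch of that check; the helper name `enabled_quantizers` is hypothetical, and the JSON is a trimmed copy of the dump that keeps only the fields the check reads.

```python
import json

# Quantizer keys as they appear under "vectorIndexConfig" in the dump above.
QUANTIZER_KEYS = ("bq", "pq", "rq", "sq")


def enabled_quantizers(schema: dict) -> list[str]:
    """Return the quantizer keys whose "enabled" flag is true."""
    index_config = schema.get("vectorIndexConfig", {})
    return [
        key
        for key in QUANTIZER_KEYS
        if index_config.get(key, {}).get("enabled", False)
    ]


# Trimmed copy of the dump above: only the fields the check reads.
schema = json.loads("""
{
  "vectorIndexConfig": {
    "bq": {"enabled": false},
    "pq": {"enabled": false},
    "rq": {"bits": 8, "enabled": false, "rescoreLimit": 20},
    "sq": {"enabled": false}
  },
  "vectorIndexType": "hnsw"
}
""")

print(enabled_quantizers(schema))  # prints [] for this dump
```

An empty list means no quantizer is active on the collection, which matches what the dump shows: the `rq` block carries default values (`bits: 8`) but `enabled` is false.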

Is this expected behavior, or could it be a bug?
Do I need to explicitly enable RQ, even in v1.33.9? The docs suggest it should be on by default for new collections with 1536-dimensional vectors.

Server Setup Information

Weaviate Server Version: 1.33.9
Deployment Method: k8s
Multi Node? Number of Running Nodes: Single
Client Language and Version: python 3.11.2, weaviate-client 4.16.7
Multitenancy?: false

Any additional Information

maryannc · December 10, 2025, 4:06am

Hi @Charlie_Chen !

Good Day!

Welcome to Weaviate Community!

Could you check whether DEFAULT_QUANTIZATION is unset or disabled in your environment? If this environment variable is disabled, RQ will not be enabled. See here

Charlie_Chen · December 10, 2025, 4:39am

I have checked my environment variables. DEFAULT_QUANTIZATION is not set.

DudaNogueira · December 10, 2025, 4:56pm

hi @Charlie_Chen !!

Thanks for pointing it out!!

That info box is not well worded. In fact, a default quantization is only applied when DEFAULT_QUANTIZATION is set to one of the values mentioned. When it is not set, it falls back to none.

I have created a PR so we can improve that section.

Once again, thanks!
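
For anyone landing here who wants RQ without relying on the server-side default, the quantizer can also be requested explicitly when the collection is created. The following is only a sketch, not something confirmed in this thread: it assumes a weaviate-client release that exposes `Configure.VectorIndex.Quantizer.rq`, the helper name `hnsw_with_rq` is hypothetical, and the accepted bit widths (8 and 1) are an assumption as well.

```python
def hnsw_with_rq(bits: int = 8):
    """Build an HNSW index config with RQ requested explicitly.

    Hypothetical helper; the bit widths checked below are an assumption.
    """
    if bits not in (1, 8):
        raise ValueError(f"unsupported RQ bit width: {bits}")

    # Imported here so the helper can be defined even where
    # weaviate-client is not installed.
    from weaviate.classes.config import Configure

    return Configure.VectorIndex.hnsw(
        ef_construction=64,
        ef=-1,
        # Explicit quantizer: does not depend on DEFAULT_QUANTIZATION.
        quantizer=Configure.VectorIndex.Quantizer.rq(bits=bits),
    )


# Usage, mirroring the create call from the original post:
#
# await client.collections.create(
#     name=index_name,
#     properties=props,
#     vector_index_config=hnsw_with_rq(bits=8),
# )
```

Since quantization cannot be disabled once it is enabled for a collection, it is worth trying this on a test collection first and confirming in the schema that `rq.enabled` is true.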
