Compress command cannot be executed before inserting some data

Evo · January 10, 2024, 12:15pm

I am trying to enable Product Quantization (PQ) on an index, but I am getting an error.

Output of the API call /v1/nodes?output=verbose:

{
  "nodes": [
    {
      "batchStats": {
        "queueLength": 0,
        "ratePerSecond": 72
      },
      "gitHash": "f381d44",
      "name": "weaviate-0",
      "shards": [
        {
          "class": "Contents",
          "name": "somename",
          "objectCount": 940370,
          "vectorIndexingStatus": "READY",
          "vectorQueueLength": 0
        }
      ],
      "status": "HEALTHY",
      "version": "1.22.5"
    }
  ]
}

Output of the API call /v1/schema:

{
    "classes": [
        {
            "class": "Contents",
            "moduleConfig": {
                "text2vec-transformers": {
                    "poolingStrategy": "masked_mean",
                    "skip": false,
                    "vectorizeClassName": false,
                    "vectorizePropertyName": false
                }
            },
            "properties": [
                {
                    "dataType": [
                        "text"
                    ],
                    "indexFilterable": false,
                    "indexSearchable": true,
                    "moduleConfig": {
                        "text2vec-transformers": {
                            "skip": false,
                            "vectorizePropertyName": false
                        }
                    },
                    "name": "chunk",
                    "tokenization": "word"
                }
            ],
            "vectorIndexConfig": {
                "skip": false,
                "cleanupIntervalSeconds": 300,
                "maxConnections": 64,
                "efConstruction": 128,
                "ef": -1,
                "dynamicEfMin": 100,
                "dynamicEfMax": 500,
                "dynamicEfFactor": 8,
                "vectorCacheMaxObjects": 1000000000000,
                "flatSearchCutoff": 40000,
                "distance": "cosine",
                "pq": {
                    "enabled": false,
                    "bitCompression": false,
                    "segments": 96,
                    "centroids": 256,
                    "trainingLimit": 150000,
                    "encoder": {
                        "type": "kmeans",
                        "distribution": "log-normal"
                    }
                }
            },
            "vectorIndexType": "hnsw",
            "vectorizer": "text2vec-transformers"
        }
    ]
}

Output of /v1/objects/Contents/some-uuid?include=vector:

{
  "class": "Contents",
  "creationTimeUnix": 1704722038368,
  "id": "some-uuid",
  "lastUpdateTimeUnix": 1704722038368,
  "properties": {
    "chunk": "some text"
  },
  "vector": [
    -0.015812416,
    ...
    -0.06696027
  ],
  "vectorWeights": null
}

So it seems that the index contains data. But when I change the schema and set vectorIndexConfig.pq.enabled to true, I get this error message:
'Compress command cannot be executed before inserting some data. Please, insert your data first.'
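
For readers who want to reproduce the schema change, here is a minimal sketch of how such an update could be built. It assumes the class is updated with a PUT to /v1/schema/{className} carrying the full class definition. The helper names (with_pq_enabled, build_update_request) and the base URL are hypothetical, and the request is only constructed here, not sent.

```python
import copy
import json
import urllib.request


def with_pq_enabled(class_def: dict) -> dict:
    """Return a copy of a class definition with vectorIndexConfig.pq.enabled set to True."""
    updated = copy.deepcopy(class_def)
    pq = updated.setdefault("vectorIndexConfig", {}).setdefault("pq", {})
    pq["enabled"] = True
    return updated


def build_update_request(base_url: str, class_def: dict) -> urllib.request.Request:
    """Build (but do not send) the PUT request that would update the class.

    Assumption: the class definition comes from the /v1/schema output shown above.
    """
    body = json.dumps(with_pq_enabled(class_def)).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/schema/{class_def['class']}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
```

Passing the resulting request to urllib.request.urlopen would send it; on the setup described here, that is the step that returns the error.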

This error seems to come from here: weaviate/adapters/repos/db/vector/hnsw/compress.go (weaviate/weaviate on GitHub, at commit f51d56a2e8df6391a23af413628c29920835f60a)

This happens on a (pre-)production environment. On my development environment, PQ was enabled without problems.
The only difference is that in development the vectors were created by text2vec-transformers.
In production the vectors were uploaded via gRPC, using the vector parameter.

I can’t upgrade to a newer Weaviate version because this is running on Kubernetes and the Weaviate Helm chart currently pins the version to 1.22.5.

Any idea what I’m doing wrong or what may be causing this?

DudaNogueira · January 15, 2024, 6:01pm

Hi!

We introduced Auto PQ in Weaviate 1.23, which should help with this process.

To get PQ working properly without Auto PQ (which takes care of this for you), it is advised to import an initial set of objects (100k is recommended), enable PQ, wait for the training step to finish, and then import the remaining objects.
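
As a rough illustration of the "import first, then enable PQ" order, the sketch below checks the /v1/nodes?output=verbose response before PQ is switched on. The function name shards_ready_for_pq and the per-shard threshold are assumptions for illustration. This is only a client-side check of what the endpoint reports, not the check Weaviate performs internally, so it can pass even when the server still returns the error from the first post.

```python
def shards_ready_for_pq(nodes_response: dict, class_name: str,
                        min_objects: int = 100_000) -> bool:
    """Client-side pre-check based on the verbose nodes output.

    Assumption: every shard of the class should hold at least `min_objects`
    objects, report vectorIndexingStatus READY, and have an empty vector queue.
    """
    shards = [
        shard
        for node in nodes_response.get("nodes", [])
        for shard in node.get("shards") or []
        if shard.get("class") == class_name
    ]
    if not shards:
        # No shard found for this class, so nothing has been imported yet.
        return False
    return all(
        s.get("objectCount", 0) >= min_objects
        and s.get("vectorIndexingStatus") == "READY"
        and s.get("vectorQueueLength", 0) == 0
        for s in shards
    )
```

With the nodes output posted above (940370 objects, READY, queue length 0) this returns True, which suggests the object count alone does not explain the error.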

Can you describe the import steps you are doing?

Thanks!

Evo · January 26, 2024, 2:04pm

Thanks for your reply.

It looks like there have been a lot of bug fixes in Weaviate versions since 1.22.5.
I will try enabling PQ again as soon as the Helm chart is updated to a newer release.

DudaNogueira · January 27, 2024, 1:31pm

Hi! No need to wait.

You can change the image tag in the values.yaml of the Helm chart and try a newer version right away.

Let me know if you need help on that.
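
After changing the image tag, one way to confirm which version each node is actually running is to read the version field from the /v1/nodes output shown earlier in the thread. The sketch below is illustrative only: node_versions and at_least are hypothetical helper names, and the (1, 23) default reflects the Auto PQ release mentioned above.

```python
def node_versions(nodes_response: dict) -> dict:
    """Map each node name to the version string it reports."""
    return {n["name"]: n["version"] for n in nodes_response.get("nodes", [])}


def at_least(version: str, minimum: tuple = (1, 23)) -> bool:
    """Compare the major.minor part of a version string against a minimum."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) >= minimum
```

For the node in this thread, at_least("1.22.5") is False, so Auto PQ would not be available until the image tag is raised.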
