Vector dimension mismatch

In a WCS cluster with two collections, after indexing 50000 objects without any error, the collections end up in a corrupted state and every request fails with “vector lengths don’t match: 512 vs 1536”.

Server Setup Information

  • Weaviate Server Version:
  • Deployment Method: WCS
  • Multi Node? N/A
  • Client Language and Version: Java 11

The collections are initialized as follows:
class_obj = {
    "class": "",
    "description": "",
    "vectorizer": "text2vec-openai",
    "moduleConfig": {
        "text2vec-openai": {
            "model": "text-embedding-3-small",
            "dimensions": 512,
            "type": "text",
            "vectorizeClassName": False
        },
        "generative-openai": {}
    },
    "invertedIndexConfig": {
        "indexTimestamps": True
    },
    "properties": [
        {
            "name": "nref",
            "dataType": ["text"],
            "moduleConfig": {
                "text2vec-openai": {
                    "skip": True,
                    "tokenization": "field"
                }
            }
        },
        {
            "name": "texto",
            "dataType": ["text"],
            "moduleConfig": {
                "text2vec-openai": {
                    "vectorizePropertyName": False,
                    "tokenization": "lowercase"
                }
            }
        },
        {
            "name": "actualizacion",
            "dataType": ["text"],
            "moduleConfig": {
                "text2vec-openai": {
                    "skip": True,  # Don't vectorize this property
                    "tokenization": "field"
                }
            }
        }
    ]
}
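
To narrow down which objects are affected, one option is to compare the length of each stored vector against the dimensions value configured above. This is only a minimal sketch: the function name find_dimension_mismatches and the (object id, vector) input shape are hypothetical, and exporting the vectors from the cluster is left out.

```python
from typing import Dict, Iterable, List, Sequence, Tuple


def find_dimension_mismatches(
    class_obj: dict,
    vectors: Iterable[Tuple[str, Sequence[float]]],
    module: str = "text2vec-openai",
) -> Dict[int, List[str]]:
    """Group object ids by vector length, keeping only lengths that differ
    from the dimensions value in the class definition."""
    expected = class_obj["moduleConfig"][module]["dimensions"]
    mismatches: Dict[int, List[str]] = {}
    for object_id, vector in vectors:
        if len(vector) != expected:
            mismatches.setdefault(len(vector), []).append(object_id)
    return mismatches


# Toy data (not real embeddings): two 512-dimensional vectors and one with 1536.
sample_class = {"moduleConfig": {"text2vec-openai": {"dimensions": 512}}}
sample_vectors = [
    ("a", [0.0] * 512),
    ("b", [0.0] * 1536),
    ("c", [0.0] * 512),
]
print(find_dimension_mismatches(sample_class, sample_vectors))
```

With the toy data, only object "b" is reported, under the key 1536.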

Help, please!

Thanks.

Hi @jamonge !

Welcome to our community :hugs:

We are investigating this issue. Thanks for reporting.

As soon as we get updates, we’ll let you know here.

Thanks!

Hello. It happened again. Any updates? How can the index be fixed? We have tried reindexing, without success.
Thanks.

Hi!

We have identified and fixed one issue related to this and will publish a patch for it soon.

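The issue was that when a non-default dimensions value was specified and the server was later restarted, that value was not read back after the reboot, so the vectorizer fell back to the default dimensions value (1536 for text-embedding-3-small, hence the 512 vs 1536 mismatch).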
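
To illustrate the mechanism, here is a toy model of that kind of bug. It is not Weaviate's actual code: the function names and the JSON persistence are assumptions made up for the example. The only real value is 1536, the default output size of text-embedding-3-small.

```python
import json

DEFAULT_DIMENSIONS = 1536  # default output size of text-embedding-3-small


def save_config(config: dict) -> str:
    """Persist the vectorizer settings (hypothetical storage format)."""
    return json.dumps(config)


def load_config_buggy(raw: str) -> dict:
    """Restores the model name but never reads 'dimensions' back."""
    stored = json.loads(raw)
    return {"model": stored["model"]}


def load_config_fixed(raw: str) -> dict:
    """Restores the model name and, when present, the 'dimensions' value."""
    stored = json.loads(raw)
    restored = {"model": stored["model"]}
    if "dimensions" in stored:
        restored["dimensions"] = stored["dimensions"]
    return restored


def effective_dimensions(config: dict) -> int:
    """Dimensions used for new vectors; falls back to the model default."""
    return config.get("dimensions", DEFAULT_DIMENSIONS)


raw = save_config({"model": "text-embedding-3-small", "dimensions": 512})
before_restart = effective_dimensions(json.loads(raw))
after_restart_buggy = effective_dimensions(load_config_buggy(raw))
after_restart_fixed = effective_dimensions(load_config_fixed(raw))
print(before_restart, after_restart_buggy, after_restart_fixed)  # 512 1536 512
```

In this model, objects indexed before the restart get 512-dimensional vectors and anything vectorized afterwards gets 1536, which is the kind of mix that produces a vector length error.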

Thanks!