Help Needed: Resolving WeaviateQueryError with Nil or Zero-Length Vector at docID 715

Description

I am experiencing a WeaviateQueryError related to a nil or zero-length vector at docID 715 during a vector search query in Weaviate. The issue is preventing any successful queries against my menuitemembeddings index. The error message points to the vector search at index menuitemembeddings and reports a “nil or zero-length vector at docID 715”.

Full Log Output

_InactiveRpcError Traceback (most recent call last)
File ~/Pgammin/Qopla/qMenuAnalysis/.venv/lib/python3.9/site-packages/weaviate/collections/grpc/query.py:609, in _QueryGRPC.__call(self, request)
608 res: search_get_pb2.SearchReply # According to PEP-0526
→ 609 res, _ = self._connection.grpc_stub.Search.with_call(
610 request,
611 metadata=self._connection.grpc_headers(),
612 timeout=self._connection.timeout_config.query,
613 )
615 return res

File ~/Pgammin/Qopla/qMenuAnalysis/.venv/lib/python3.9/site-packages/grpc/_channel.py:1193, in _UnaryUnaryMultiCallable.with_call(self, request, timeout, metadata, credentials, wait_for_ready, compression)
1187 (
1188 state,
1189 call,
1190 ) = self._blocking(
1191 request, timeout, metadata, credentials, wait_for_ready, compression
1192 )
→ 1193 return _end_unary_response_blocking(state, call, True, None)

File ~/Pgammin/Qopla/qMenuAnalysis/.venv/lib/python3.9/site-packages/grpc/_channel.py:1005, in _end_unary_response_blocking(state, call, with_call, deadline)
1004 else:
→ 1005 raise _InactiveRpcError(state)

_InactiveRpcError: <_InactiveRpcError of RPC that terminated with:

615 return res
617 except grpc.RpcError as e:
→ 618 raise WeaviateQueryError(e.details(), "GRPC search")

WeaviateQueryError: Query call with protocol GRPC search failed with message explorer: get class: vector search: object vector search at index menuitemembeddings: shard menuitemembeddings_fkq12e2IPcaZ: vector search: knn search: distance between entrypoint and query node: got a nil or zero-length vector at docID 715.

I’m seeking guidance on how to diagnose and resolve this issue, particularly how to investigate the problematic docID 715 and strategies for cleansing or recovering the database to avoid similar errors in the future.

Server Setup Information

Any additional Information

I’m working in a local Docker setup with a “bring your own vectors” configuration using multiple named vectors. The configuration involves the named vectors “menu_item_embedding”, “name_embedding”, and “description_embedding”. Here is a snippet of my configuration:

vectorizer_config=[
    wvcc.Configure.NamedVectors.none(name="menu_item_embedding"),
    wvcc.Configure.NamedVectors.none(name="name_embedding"),
    wvcc.Configure.NamedVectors.none(name="description_embedding"),
],
  • It also seems like this isn’t the only docID that suffers from this issue.
  • The dimensionality of these particular entries (and the surrounding entries) for all three vectors seems to be 1024.
  1. What does the error actually mean? Is it related to how the vector is stored or how the distance is calculated?
  2. How can I investigate further and cleanse my index of these problematic vectors?
  3. Hints on how to detect this prior to insertion would also be helpful for the future.

:pray::pray:Thank you if you read this far :pray::pray:

Hi! Do you see any outstanding logs in the server side?

Also, how big is this dataset?

Did you have any issues in ingestion? This could be a corrupted index.

Have you tried reindexing?

Hi @DudaNogueira
Thank you for taking the time to respond! Sorry if my answers are inaccurate or a bit long-winded. I’m still quite new to this tech.

Hi! Do you see any outstanding logs in the server side?

It’s hard for me to say for certain, but there was nothing at the time of querying.
Occasionally I have some WSL2 issues and need to kill Docker etc… Perhaps the logs below are a clue to some indexing error?

{"action":"lsm_recover_from_active_wal","class":"MenuItemEmbeddings","index":"menuitemembeddings","level":"warning","msg":"active write-ahead-log found. Did weaviate crash prior to this? Trying to recover...","path":"/var/lib/weaviate/menuitemembeddings/WSp0ViJvJhT2/lsm/property_menu_category_id_searchable/segment-1712350772127760700","shard":"WSp0ViJvJhT2","time":"2024-04-05T22:06:09Z"}
{"action":"lsm_recover_from_active_wal","class":"MenuItemEmbeddings","index":"menuitemembeddings","level":"warning","msg":"active write-ahead-log found. Did weaviate crash prior to this? Trying to recover...","path":"/var/lib/weaviate/menuitemembeddings/WSp0ViJvJhT2/lsm/property_menu_item_id/segment-1712350783095419100","shard":"WSp0ViJvJhT2","time":"2024-04-05T22:06:09Z"}
{"action":"lsm_recover_from_active_wal","class":"MenuItemEmbeddings","index":"menuitemembeddings","level":"warning","msg":"active write-ahead-log found. Did weaviate crash prior to this? Trying to recover...","path":"/var/lib/weaviate/menuitemembeddings/WSp0ViJvJhT2/lsm/property_ref_product_id_searchable/segment-1712350783062882400","shard":"WSp0ViJvJhT2","time":"2024-04-05T22:06:09Z"}
{"action":"lsm_recover_from_active_wal","class":"MenuItemEmbeddings","index":"menuitemembeddings","level":"warning","msg":"active write-ahead-log found. Did weaviate crash prior to this? Trying to recover...","path":"/var/lib/weaviate/menuitemembeddings/WSp0ViJvJhT2/lsm/property_data/segment-1712350793589253900","shard":"WSp0ViJvJhT2","time":"2024-04-05T22:06:09Z"}
{"action":"lsm_recover_from_active_wal","class":"MenuItemEmbeddings","index":"menuitemembeddings","level":"warning","msg":"active write-ahead-log found. Did weaviate crash prior to this? Trying to recover...","path":"/var/lib/weaviate/menuitemembeddings/WSp0ViJvJhT2/lsm/property_menu_product_id_searchable/segment-1712350783805691800","shard":"WSp0ViJvJhT2","time":"2024-04-05T22:06:09Z"}
{"action":"lsm_recover_from_active_wal","class":"MenuItemEmbeddings","index":"menuitemembeddings","level":"warning","msg":"active write-ahead-log found. Did weaviate crash prior to this? Trying to recover...","path":"/var/lib/weaviate/menuitemembeddings/WSp0ViJvJhT2/lsm/property_menu_item_id_searchable/segment-1712350783658625000","shard":"WSp0ViJvJhT2","time":"2024-04-05T22:06:09Z"}
{"action":"lsm_recover_from_active_wal","class":"MenuItemEmbeddings","index":"menuitemembeddings","level":"warning","msg":"active write-ahead-log found. Did weaviate crash prior to this? Trying to recover...","path":"/var/lib/weaviate/menuitemembeddings/WSp0ViJvJhT2/lsm/property_data_searchable/segment-1712350774067835900","shard":"WSp0ViJvJhT2","time":"2024-04-05T22:06:09Z"}

Also, how big is this dataset?

~320k entries
~383mb on disk

Did you have any issues in ingestion? This could be a corrupted index.

I’d like to know more about this. How could I potentially identify a corrupt index more specifically?

Have you tried reindexing?

Both times I’ve tried to index, the process has needed to be restarted. Here is my rough code for skipping forward to the latest index item.

What strikes me as odd is the particularly low docID 715. Instinctively, I would assume this to be some sort of internal sequence number, in which case a failure around the 715th document in the indexing process would make sense to me.

# Perform insertion in batches
skip_count = 0
for i in tqdm(range(0, len(import_data), BATCH_SIZE)):
    batch = import_data[i:i+BATCH_SIZE]
    menu_item_embeddings = []

    for data in batch:

        # Check if the menu item already exists in the collection
        exists = menu_item_embeddings_collection.query.fetch_objects(
            filters=wvc.query.Filter.by_property("menu_item_id").equal(data["id"])
        )

        if len(exists.objects) > 0:
            skip_count += 1
            continue

        # Transform the existing object into the Weaviate format
        weaviate_menu_item = wvc.data.DataObject(
            properties={
                "menu_item_id": data["id"],
                "menu_product_id": data["menu_product_id"],
                "menu_id": data["menu_id"],
                "menu_category_id": data["menu_category_id"],
                "ref_product_id": data["ref_product_id"],
                "data": data["data"],
            },
            vector={
                "name_embedding": from_bytes_to_list(data["name_embedding"]),
                "description_embedding": from_bytes_to_list(data["description_embedding"]),
                "menu_item_embedding": from_bytes_to_list(data["menu_item_embedding"])
            }
        )

        # Ensure the vectors are non-empty
        # TODO: Could become a sanity-check function in the future (see the sketch below)
        if len(weaviate_menu_item.vector["name_embedding"]) == 0:
            continue
        if len(weaviate_menu_item.vector["description_embedding"]) == 0:
            continue
        if len(weaviate_menu_item.vector["menu_item_embedding"]) == 0:
            continue

        # If everything is okay, stage this for insertion
        menu_item_embeddings.append(weaviate_menu_item)

    if len(menu_item_embeddings) == 0:
        continue

    print(f'Skipped {skip_count} existing items in this batch')
    skip_count = 0

    try:
        menu_item_embeddings_collection.data.insert_many(menu_item_embeddings)    # This uses batching under the hood
    except Exception as e:
        print(f'Error: {e}')
        print([x.vector['name_embedding'] for x in menu_item_embeddings] )
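
For the TODO above, here is a rough sketch of the sanity-check helper I have in mind (untested; the expected dimensionality of 1024 is just what my embeddings use):

EXPECTED_DIM = 1024  # dimensionality of my embeddings

def vectors_are_valid(vectors: dict) -> bool:
    """Return True only if every named vector is present, non-empty and has the expected dimensionality."""
    for name, vec in vectors.items():
        if vec is None or len(vec) != EXPECTED_DIM:
            print(f"Invalid vector '{name}': length {0 if vec is None else len(vec)}")
            return False
        if any(v != v for v in vec):  # NaN check (NaN != NaN)
            print(f"Invalid vector '{name}': contains NaN")
            return False
    return True

With that, the three separate length checks above would collapse into a single if not vectors_are_valid(weaviate_menu_item.vector): continue.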

Additional notes

I was attempting to verify the dimensionality of the vectors in the sequential range around the “corrupt” vector, but I couldn’t find any clues:

import json
from utils.menu_utils import MenuUtils

results = menu_item_embeddings_collection.query.fetch_objects(limit=722, include_vector=True)

# Inspect the objects around index 715
for i in range(712, 722):
    print(f'Index: {i}')

    if i == 716:
        # Note: this branch reuses json_obj/name_embedding from the previous iteration
        print(f'  Name: {MenuUtils.get_name(json_obj)}')
        print(f'  Vector: {len(name_embedding)}')
        continue

    json_obj = json.loads(results.objects[i].properties['data'])
    name_embedding = results.objects[i].vector['menu_item_embedding']
    print(f'  Name: {MenuUtils.get_name(json_obj)}')
    print(f'  Vector: {len(name_embedding)}')
This is the only code I could find that might cause this error, and I couldn’t find any zero-length vectors in my collection :person_shrugging:
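
For a more exhaustive check, I’m also considering scanning the entire collection with the cursor API and flagging any missing or empty named vectors. A rough sketch, assuming the v4 Python client (this only checks what the object store returns, so it may not reveal what the HNSW index itself holds for docID 715):

# Iterate over every object via the cursor API and report empty named vectors
bad_vectors = []
for obj in menu_item_embeddings_collection.iterator(include_vector=True):
    for name, vec in (obj.vector or {}).items():
        if vec is None or len(vec) == 0:
            bad_vectors.append((obj.uuid, name))
            print(f"Empty vector '{name}' on object {obj.uuid}")

print(f"Found {len(bad_vectors)} empty vectors in the collection")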

I got this error also, on the same version of Weaviate.
It happens every time I restart the docker container.
Maybe something about the shutdown corrupts the index in the docker volume.

Thanks for your addition @nicholasamiller :pray:

Update

I initially thought there was something wrong with how I was building the vectors before inserting them into a collection with a none vectorizer config (wvc.config.Configure.Vectorizer.none()).

However, my wvc.config.Configure.NamedVectors.text2vec_openai collection has now broken as well with a similar error.

Collection Setup

Here is the setup for that collection.

client.collections.create(
    name=COLLECTION_NAME,
    description="Collection of menu items with embeddings",
    properties=[
        wvc.config.Property(name="menu_item_id", data_type=wvc.config.DataType.TEXT),
        wvc.config.Property(name="menu_product_id", data_type=wvc.config.DataType.TEXT),
        wvc.config.Property(name="menu_id", data_type=wvc.config.DataType.TEXT),
        wvc.config.Property(name="menu_category_id", data_type=wvc.config.DataType.TEXT),
        wvc.config.Property(name="ref_product_id", data_type=wvc.config.DataType.TEXT),
        wvc.config.Property(name="data", data_type=wvc.config.DataType.TEXT),
    ],
    vectorizer_config=[
        wvc.config.Configure.NamedVectors.text2vec_openai(
            name="menu_item_embedding", source_properties=["menu_item_text"]
        ),
        wvc.config.Configure.NamedVectors.text2vec_openai(
            name="name_embedding", source_properties=["name"]
        ),
        wvc.config.Configure.NamedVectors.text2vec_openai(
            name="description_embedding", source_properties=["description"]
        ),
    ],
)

Error

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/van/Pgammin/Qopla/qMenuAnalysis/.venv/lib/python3.9/site-packages/flask/app.py", line 1488, in __call__
    return self.wsgi_app(environ, start_response)
  File "/home/van/Pgammin/Qopla/qMenuAnalysis/.venv/lib/python3.9/site-packages/flask/app.py", line 1466, in wsgi_app
    response = self.handle_exception(e)
  File "/home/van/Pgammin/Qopla/qMenuAnalysis/.venv/lib/python3.9/site-packages/flask_cors/extension.py", line 176, in wrapped_function
    return cors_after_request(app.make_response(f(*args, **kwargs)))
  File "/home/van/Pgammin/Qopla/qMenuAnalysis/.venv/lib/python3.9/site-packages/flask/app.py", line 1463, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/van/Pgammin/Qopla/qMenuAnalysis/.venv/lib/python3.9/site-packages/flask/app.py", line 872, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/van/Pgammin/Qopla/qMenuAnalysis/.venv/lib/python3.9/site-packages/flask_cors/extension.py", line 176, in wrapped_function
    return cors_after_request(app.make_response(f(*args, **kwargs)))
  File "/home/van/Pgammin/Qopla/qMenuAnalysis/.venv/lib/python3.9/site-packages/flask/app.py", line 870, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/van/Pgammin/Qopla/qMenuAnalysis/.venv/lib/python3.9/site-packages/flask/app.py", line 855, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)  # type: ignore[no-any-return]
  File "/home/van/Pgammin/Qopla/qMenuAnalysis/server.py", line 76, in search_menu_items
    response = menu_item_embeddings_collection.query.near_text(
  File "/home/van/Pgammin/Qopla/qMenuAnalysis/.venv/lib/python3.9/site-packages/weaviate/collections/queries/near_text/query.py", line 90, in near_text
    res = self._query.near_text(
  File "/home/van/Pgammin/Qopla/qMenuAnalysis/.venv/lib/python3.9/site-packages/weaviate/collections/grpc/query.py", line 418, in near_text
    return self.__call(request)
  File "/home/van/Pgammin/Qopla/qMenuAnalysis/.venv/lib/python3.9/site-packages/weaviate/collections/grpc/query.py", line 618, in __call
    raise WeaviateQueryError(e.details(), "GRPC search")  # pyright: ignore
weaviate.exceptions.WeaviateQueryError: Query call with protocol GRPC search failed with message explorer: get class: vector search: object vector search at index openai_menuitemembeddings: shard openai_menuitemembeddings_v7FoIMDvjzTp: vector search: knn search: distance between entrypoint and query node: got a nil or zero-length vector at docID 605.

Additional Notes

  1. The docID is consistent across docker container restarts
  2. No docker container logs are emitted at the time of triggering the query

Questions

  1. Is there any tooling for me to check the health of my index?
  2. Is it possible to directly inspect a vector based on the docId?
  3. Do you have any recommendations on the filesystem side for creating a restore point? (My DB doesn’t change very often; see the sketch after this list.)
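
Regarding question 3, what I currently have in mind is Weaviate’s built-in backup API rather than copying the volume by hand. A rough sketch, assuming the backup-filesystem module is enabled server-side (ENABLE_MODULES=backup-filesystem and BACKUP_FILESYSTEM_PATH set in docker-compose):

# Create a filesystem backup of the collection while the data is still healthy
result = client.backup.create(
    backup_id="menuitemembeddings-restore-point",  # hypothetical backup id
    backend="filesystem",
    include_collections=["MenuItemEmbeddings"],
    wait_for_completion=True,
)
print(result)

# Restoring into a fresh instance later would look like:
# client.backup.restore(
#     backup_id="menuitemembeddings-restore-point",
#     backend="filesystem",
#     wait_for_completion=True,
# )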

This has now happened with a 3rd collection (separate docker instance)

Traceback (most recent call last):
  File "/home/van/Pgammin/ReGPT/.venv/lib/python3.9/site-packages/flask/app.py", line 1488, in __call__
    return self.wsgi_app(environ, start_response)
  File "/home/van/Pgammin/ReGPT/.venv/lib/python3.9/site-packages/flask/app.py", line 1466, in wsgi_app
    response = self.handle_exception(e)
  File "/home/van/Pgammin/ReGPT/.venv/lib/python3.9/site-packages/flask_cors/extension.py", line 176, in wrapped_function
    return cors_after_request(app.make_response(f(*args, **kwargs)))
  File "/home/van/Pgammin/ReGPT/.venv/lib/python3.9/site-packages/flask/app.py", line 1463, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/van/Pgammin/ReGPT/.venv/lib/python3.9/site-packages/flask/app.py", line 872, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/van/Pgammin/ReGPT/.venv/lib/python3.9/site-packages/flask_cors/extension.py", line 176, in wrapped_function
    return cors_after_request(app.make_response(f(*args, **kwargs)))
  File "/home/van/Pgammin/ReGPT/.venv/lib/python3.9/site-packages/flask/app.py", line 870, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/van/Pgammin/ReGPT/.venv/lib/python3.9/site-packages/flask/app.py", line 855, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)  # type: ignore[no-any-return]
  File "/home/van/Pgammin/ReGPT/backend/server.py", line 137, in search_conversations
    response = conversation_collection.query.near_text(
  File "/home/van/Pgammin/ReGPT/.venv/lib/python3.9/site-packages/weaviate/collections/queries/near_text/query.py", line 90, in near_text
    res = self._query.near_text(
  File "/home/van/Pgammin/ReGPT/.venv/lib/python3.9/site-packages/weaviate/collections/grpc/query.py", line 418, in near_text
    return self.__call(request)
  File "/home/van/Pgammin/ReGPT/.venv/lib/python3.9/site-packages/weaviate/collections/grpc/query.py", line 618, in __call
    raise WeaviateQueryError(e.details(), "GRPC search")  # pyright: ignore
weaviate.exceptions.WeaviateQueryError: Query call with protocol GRPC search failed with message explorer: get class: vector search: object vector search at index conversations: shard conversations_gMerlVoh5Zup: vector search: knn search: distance between entrypoint and query node: got a nil or zero-length vector at docID 252.

This is after restarting my PC without explicitly shutting down WSL2 or Docker.

Questions

  1. Can I diagnose/heal my index somehow without re-inserting all the data?
  2. Do I need to make periodic backups?

Right now, I’m wary of putting more data into my Weaviate instances. I’m also especially hesitant to modify data that would be lost if I need to “reindex” (recalculating the embeddings also costs money).

I tried to reproduce this using the jeopardy sample dataset, adding named vectors. Could not reproduce. I will try to reproduce the problem on a Linux machine: issue could be Docker desktop on Windows.

Hi @puj !

I was able to create the class you provided. I notice that you may want to use deterministic IDs, since you added some logic to avoid inserting the same object twice.
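
For example, something like this inside your import loop (a rough sketch, untested; generate_uuid5 comes from the client’s util module):

from weaviate.util import generate_uuid5

# Deterministic UUID derived from the menu item id, so re-running the
# import maps to the same object instead of creating a duplicate
obj_uuid = generate_uuid5(data["id"])

# Cheaper duplicate check than a filtered fetch_objects call
if menu_item_embeddings_collection.data.exists(obj_uuid):
    skip_count += 1
    continue

weaviate_menu_item = wvc.data.DataObject(
    uuid=obj_uuid,
    properties={
        "menu_item_id": data["id"],
        # ... remaining properties as in your original loop
    },
    vector={
        "menu_item_embedding": from_bytes_to_list(data["menu_item_embedding"]),
        # ... remaining named vectors as in your original loop
    },
)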

Also, I noticed you are leaving some properties for AUTO SCHEMA to create during import. My suggestion is to create all properties beforehand, especially the ones you are using for named vectors.

But I was not able to reproduce this.

Can you create a minimum reproducible example in a python notebook?

I have the same issue.

Here is the error I get:
weaviate.exceptions.WeaviateQueryError: Query call with protocol GRPC search failed with message explorer: get class: vector search: object vector search at index subclassconfiguration: shard subclassconfiguration_uOWO1ZhUXD9g: vector search: knn search: distance between entrypoint and query node: got a nil or zero-length vector at docID 1830.

This happens if docker-compose is restarted via docker-compose down > docker-compose up.

Here is my workflow from start to finish:

First I create the DB schema.

Then I import data into the database.

After the schema is created and before I restart, both methods work. I can search using both
collection.generate.near_text
and
collection.query.near_text

Both work and return results. In the case of vector search it returns records. In the case of generative search it returns a message from GPT, and additionally I can fetch references related to the found records.

Then I do docker-compose down => docker-compose up -d
and when I try to search using a vector I get an error like: distance between entrypoint and query node: got a nil or zero-length vector at docID 1830

But I can still search normally using collection.query.fetch_objects,
and there I can include the vector and it will be included in the response. So it does exist.

I have also set up a script to check for null or zero-length vectors; it found no invalid vectors.

But if I try to search I get the same error discussed above.

Also, I could not find any way in the documentation to reindex existing records in the DB.
Is there no straightforward way to manually trigger reindexing of existing records? It seems that the only way to use Weaviate is to never stop docker-compose, and if it does stop for some reason, the only way to get it working again is to fully recreate everything from the initial point: create the collections, then insert data into them, and then it is possible to search using vectors again. That is, until the server crashes for one reason or another, and the whole process has to be done over again. Is it supposed to be this way? What is the suggested workflow for shutting down docker-compose and then turning it back on? For instance, I cannot create collections locally and then transfer that data to the server; in that case my vectors get corrupted and I get the error above. So the only way would be to run docker-compose on the remote server and generate the vectors there so it works. Until the server restarts.

If anything else needs to be provided, please let me know.

The problem was “solved” when I changed my schema to index on one vector only: no multiple named vectors.
So without really having any idea, I would speculate, wildly, that there is a bug to do with some state, maybe IDs, that is created when adding data objects with multiple named vectors. This state may not be properly serialized to disk. So when the container restarts, it is not loaded into memory, and all the queries fail.
Multiple named vectors are a recent feature.
Anyway, I can adjust to make do with a single vector for the moment.
I can’t reproduce without sharing all my data.
My data has 14380 objects at the moment. About 400Mb.
No cross references.
Provided my own vectors.
Problem occurred on both Ubuntu and Windows, anytime the weaviate container restarted.

I will try to reproduce the problem on a Linux machine: issue could be Docker desktop on Windows.

I am facing this when using an embedded instance on MacOS as well.

Edit: version 1.24.11 fixes this bug!

Hi there dear friends!!!

Welcome to our community @smwitkowski !! :hugs:

Our team is already aware of this and was able to reproduce it. Here is the GH issue:

Thank you all for reporting and being such an amazing community! :people_hugging:

Awesome, thanks for the update. Eagerly awaiting this fix :partying_face:

Thanks everyone for contributing to this report :pray:

I’m also getting the same issue. I have a docker-compose-launched instance with a mounted volume.

  1. I copied the volume to a remote server and launched another instance there with the volume mounted. Then I got the error.
  2. I wanted to do some comparisons, so I went back to my local instance and restarted the container. Then I got the same error locally as well.

So I also suspect it is related to restarting.

Anyway, glad to see it’s tracked on GH. Is there any workaround before this gets fixed?

Linking to the workaround for now

@puj I was able to resolve this issue (PR). Today we will release v1.24.11, which will contain this fix. I will let you know once the release is ready.

@puj I was able to resolve this issue (PR). Today we will release v1.24.11, which will contain this fix. I will let you know once the release is ready.

How lovely! Well done :muscle:
Thanks for following up on this!

Was interesting to browse your solution :male_detective:

Cool :slight_smile: and thank you for your kind words!

I just wanted to let you know that v1.24.11 is out :slight_smile:

A LIFE SAVER … I was in the same hell. Multiple named vectors and no default modules and nothing worked. Going from 24.6 to 24.11 fixed this.

Thanks to all of you gals and guys !!!
