Getting "no object found for doc id" when creating new objects suddenly

We are getting the error below when creating a new object:

WeaviateError(statusCode=500, messages=[WeaviateErrorMessage(message=put object: import into index profile: put local object: shard="gV04uD1HB1QD": Validate vector index for [105 35 162 129 86 191 78 64 176 10 54 87 7 235 243 174]: no object found for doc id 2241: no object for doc id, it could have been deleted, throwable=null)])

Note that we have been using Weaviate since last year, and this is the first time we have encountered this error. We could not find anything related to it :frowning_face:

Any help would be great. Below are the details of our setup:

Below is the /nodes API check:

{
    "nodes": [
        {
            "batchStats": {
                "queueLength": 0,
                "ratePerSecond": 0
            },
            "gitHash": "621586d",
            "name": "OBSCURED_FOR_SECURITY",
            "shards": [
                {
                    "class": "Profile",
                    "name": "gV04uD1HB1QD",
                    "objectCount": 2375
                }
            ],
            "stats": {
                "objectCount": 2375,
                "shardCount": 1
            },
            "status": "HEALTHY",
            "version": "1.21.9"
        }
    ]
}

Our DevOps team provided the repeating Weaviate logs below, from the start of the incident:

{"action":"attach_tombstone_to_deleted_node","level":"info","msg":"found a deleted node (1485) without a tombstone, tombstone was added","node_id":1485,"time":"2024-02-06T15:11:42Z"}
{"action":"attach_tombstone_to_deleted_node","level":"info","msg":"found a deleted node (1352) without a tombstone, tombstone was added","node_id":1352,"time":"2024-02-06T15:11:42Z"}
{"action":"attach_tombstone_to_deleted_node","level":"info","msg":"found a deleted node (2245) without a tombstone, tombstone was added","node_id":2245,"time":"2024-02-06T15:11:42Z"}
{"action":"attach_tombstone_to_deleted_node","level":"info","msg":"found a deleted node (1041) without a tombstone, tombstone was added","node_id":1041,"time":"2024-02-06T15:11:42Z"}
{"action":"attach_tombstone_to_deleted_node","level":"info","msg":"found a deleted node (2806) without a tombstone, tombstone was added","node_id":2806,"time":"2024-02-06T15:11:42Z"}
{"action":"attach_tombstone_to_deleted_node","level":"info","msg":"found a deleted node (3518) without a tombstone, tombstone was added","node_id":3518,"time":"2024-02-06T15:11:42Z"}
{"action":"attach_tombstone_to_deleted_node","level":"info","msg":"found a deleted node (3276) without a tombstone, tombstone was added","node_id":3276,"time":"2024-02-06T15:11:42Z"}
{"action":"attach_tombstone_to_deleted_node","level":"info","msg":"found a deleted node (2232) without a tombstone, tombstone was added","node_id":2232,"time":"2024-02-06T15:11:42Z"}
{"action":"attach_tombstone_to_deleted_node","level":"info","msg":"found a deleted node (3479) without a tombstone, tombstone was added","node_id":3479,"time":"2024-02-06T15:11:42Z"}

Thank you :bowing_man:


Just an update: it got fixed after upgrading to v1.23.7 (latest), but we would really appreciate knowing why it broke all of a sudden without any changes on our side :frowning:

Also, after the upgrade the /nodes result is now:

{
    "nodes": [
        {
            "batchStats": {
                "queueLength": 0,
                "ratePerSecond": 0
            },
            "gitHash": "d5c2694",
            "name": "OBSCURED_FOR_SECURITY",
            "shards": null,
            "stats": {
                "objectCount": 2380,
                "shardCount": 1
            },
            "status": "HEALTHY",
            "version": "1.23.7"
        }
    ]
}
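
Note that "shards" is an array in the 1.21.9 response but null in the 1.23.7 one, so anything that reads this output has to handle both shapes. Below is a minimal Python sketch of a helper that does so. The function name `summarize_nodes` and the summary keys are made up for illustration; only the JSON field names come from the responses above. It may be that newer versions return per-shard details only on request (possibly via an `output=verbose` query parameter, but that is an assumption to verify against the docs).

```python
import json
from typing import Any, Dict, List


def summarize_nodes(payload: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Summarize a /nodes response, tolerating "shards": null.

    Hypothetical helper: it only reads fields that appear in the
    responses pasted above.
    """
    summaries = []
    for node in payload.get("nodes") or []:
        stats = node.get("stats") or {}
        # "shards" is a list in the 1.21.9 output and null in the 1.23.7 output
        shards = node.get("shards") or []
        summaries.append({
            "name": node.get("name"),
            "version": node.get("version"),
            "status": node.get("status"),
            "object_count": stats.get("objectCount"),
            "shard_count": stats.get("shardCount"),
            "shard_names": [shard.get("name") for shard in shards],
        })
    return summaries


if __name__ == "__main__":
    # Trimmed copy of the post-upgrade response shown above
    after_upgrade = json.loads("""
    {"nodes": [{"name": "OBSCURED_FOR_SECURITY", "shards": null,
                "stats": {"objectCount": 2380, "shardCount": 1},
                "status": "HEALTHY", "version": "1.23.7"}]}
    """)
    for summary in summarize_nodes(after_upgrade):
        print(summary)
```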

Hi @junbetterway !

Glad the new version fixed it. I will ask internally for someone with more visibility into this kind of issue.

Thanks!

Thank you. Management is a bit worried that this could recur, so we need a root cause analysis on our end. We would appreciate any findings.

Lastly, just let me know what needs to be checked (e.g., any logs you need). Thanks

Hi @junbetterway,

I reviewed the source code at version 1.21.9 and found that, in that version, object insertion validated that the new vector's dimension matched that of the already inserted vectors. However, that comparison was made against a cache, which could lead to this “no object found” error.

That validation has since been changed: the comparison is now made against a dimension property, without querying the cache any more.
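
To illustrate the difference, here is a minimal Python sketch. It is not Weaviate's actual code (Weaviate is written in Go), and every class and method name in it is hypothetical. It only shows why a check that has to look up a previously inserted vector can fail when that vector is not available, while a check against a stored dimension cannot fail that way.

```python
from typing import Dict, List, Optional


class DocNotFoundError(Exception):
    """Raised when the reference doc id has no vector in the cache."""


class CacheBasedIndex:
    """Hypothetical: validates dimensions by reading a cached reference vector."""

    def __init__(self) -> None:
        self.cache: Dict[int, List[float]] = {}
        self.reference_id: Optional[int] = None  # doc id used for comparison

    def add(self, doc_id: int, vector: List[float]) -> None:
        if self.reference_id is not None:
            reference = self.cache.get(self.reference_id)
            if reference is None:
                # The reference vector is gone, so validation itself fails
                raise DocNotFoundError(
                    f"no object found for doc id {self.reference_id}"
                )
            if len(reference) != len(vector):
                raise ValueError("vector dimension mismatch")
        else:
            self.reference_id = doc_id
        self.cache[doc_id] = vector

    def evict(self, doc_id: int) -> None:
        """Simulate the cache no longer holding a vector."""
        self.cache.pop(doc_id, None)


class PropertyBasedIndex:
    """Hypothetical: validates dimensions against a stored integer property."""

    def __init__(self) -> None:
        self.vectors: Dict[int, List[float]] = {}
        self.dims: Optional[int] = None  # set once by the first insert

    def add(self, doc_id: int, vector: List[float]) -> None:
        if self.dims is None:
            self.dims = len(vector)
        elif self.dims != len(vector):
            raise ValueError("vector dimension mismatch")
        self.vectors[doc_id] = vector

    def evict(self, doc_id: int) -> None:
        self.vectors.pop(doc_id, None)


if __name__ == "__main__":
    old_style = CacheBasedIndex()
    old_style.add(1, [0.1, 0.2, 0.3])
    old_style.evict(1)
    try:
        old_style.add(2, [0.4, 0.5, 0.6])
    except DocNotFoundError as err:
        print("cache-based check failed:", err)

    new_style = PropertyBasedIndex()
    new_style.add(1, [0.1, 0.2, 0.3])
    new_style.evict(1)
    new_style.add(2, [0.4, 0.5, 0.6])
    print("property-based check passed, dims =", new_style.dims)
```

In the first case a valid insert is rejected only because the reference entry is missing, which has the same shape as the error in the original post.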

Ref lines:

Hello @jeronimo_irazabal - thank you for looking into this. This makes sense, since according to our DevOps engineer they had been applying server patches, which probably required some restarts.