Hybrid search error "sparse search: object search at index article: local shard object"

After updating to 1.23.7 I get an internal error when searching (an improvement on the panic/crash with 1.23.6!). This appears to be related to the pull request "Handle internal errors in BM25/WAND logic more gracefully" by etiennedi (Pull Request #4100, weaviate/weaviate on GitHub).

Deployment:

  • GCP GKE via Helm
  • 5 nodes; collections have 3 replicas

Our query is quite simple:

{
  Get {
    Article(
      hybrid: {
        query: "mosaic tiles"
        alpha: 0.5
      }
      limit: 10
    ) {
      documentId
      name
      site
      url
      tags
      _additional {
        id
      }
    }
  }
}
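For anyone who wants to reproduce this outside the console, here is a minimal Python sketch that sends the same query to Weaviate's `/v1/graphql` endpoint using only the standard library. The base URL, the lack of authentication, and the helper names `build_request` and `run_query` are assumptions for illustration, not details of our deployment.

```python
import json
import urllib.request

# The hybrid query from this post, condensed but otherwise unchanged.
HYBRID_QUERY = """
{
  Get {
    Article(
      hybrid: { query: "mosaic tiles", alpha: 0.5 }
      limit: 10
    ) {
      documentId
      name
      site
      url
      tags
      _additional { id }
    }
  }
}
"""


def build_request(base_url: str, query: str) -> urllib.request.Request:
    """Build a POST request for the GraphQL endpoint (hypothetical helper)."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/graphql",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(base_url: str, query: str, timeout: float = 10.0) -> dict:
    """Send the query and return the decoded JSON response."""
    request = build_request(base_url, query)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Usage would look like `run_query("http://localhost:8080", HYBRID_QUERY)`, where the URL is a placeholder; a secured cluster would also need an auth header.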

The error we get when using hybrid search is:

{
  "data": {
    "Get": {
      "Article": null
    }
  },
  "errors": [
    {
      "locations": [
        {
          "column": 5,
          "line": 3
        }
      ],
      "message": "sparse search: object search at index article: local shard object search article_7B4lDPm6kOGu: wand: an internal error occurred during BM25 search",
      "path": [
        "Get",
        "Article"
      ]
    }
  ]
}

Is there anything we can do to repair the BM25 index or shards?

NB: Pure vector searches are fine, as are our searches with WHERE filters.


Seems to be OK now. I'm still interested in any remedial action we can take when we see this sort of error in production going forward, as it may indicate a problem that needs addressing.
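This does not repair anything, but one client-side step is to detect this specific failure in the GraphQL response so it can be alerted on. Below is a rough sketch, assuming the response has already been decoded into a Python dict shaped like the error above. The function names and the regular expression are my own, and the message format may change between Weaviate versions.

```python
import re

# Substring taken from the error message shown earlier in this post.
BM25_ERROR_MARKER = "an internal error occurred during BM25 search"

# Assumed message shape: "... local shard object search <shard>: wand: ..."
SHARD_PATTERN = re.compile(r"local shard object search ([^:\s]+):")


def bm25_internal_errors(response: dict) -> list[str]:
    """Return the messages of all errors that look like the BM25 internal error."""
    messages = []
    for err in response.get("errors") or []:
        message = err.get("message", "")
        if BM25_ERROR_MARKER in message:
            messages.append(message)
    return messages


def affected_shards(response: dict) -> list[str]:
    """Extract the shard names mentioned in BM25 internal error messages."""
    shards = []
    for message in bm25_internal_errors(response):
        match = SHARD_PATTERN.search(message)
        if match:
            shards.append(match.group(1))
    return shards
```

A caller could log the result of `affected_shards` and, since pure vector searches still work for us, decide whether to retry the request as a vector-only search.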

Spoke too soon!

I’m getting an index out of range now

Error:

{
    shard: "7B4lDPm6kOGu"
    prop_names: [
        0: "documentId"
        1: "name"
        2: "type"
        // trimmed
    ]
    msg: "panic: runtime error: index out of range [118] with length 117"
    has_filter: false
    query_term: "tiles"
    action: "bm25_search"
    level: "error"
    class: "Article"
}

Stack trace, spread over N log entries:

{action: bm25_search, class: Article, has_filter: false, level: error, msg: panic: runtime error: index out of range [118] with length 117, prop_names: […], query_term: tiles, shard: 7B4lDPm6kOGu}
2024-01-31 10:57:50.289 GMT goroutine 18924891 [running]:
2024-01-31 10:57:50.289 GMT runtime/debug.Stack()
2024-01-31 10:57:50.289 GMT 	/usr/local/go/src/runtime/debug/stack.go:24 +0x5e
2024-01-31 10:57:50.289 GMT runtime/debug.PrintStack()
2024-01-31 10:57:50.289 GMT 	/usr/local/go/src/runtime/debug/stack.go:16 +0x13
2024-01-31 10:57:50.289 GMT github.com/weaviate/weaviate/adapters/repos/db/inverted.(*BM25Searcher).wand.func1.1()
2024-01-31 10:57:50.289 GMT 	/go/src/github.com/weaviate/weaviate/adapters/repos/db/inverted/bm25_searcher.go:210 +0x33e
2024-01-31 10:57:50.289 GMT panic({0x1899720?, 0xc003972db0?})
2024-01-31 10:57:50.289 GMT 	/usr/local/go/src/runtime/panic.go:914 +0x21f
2024-01-31 10:57:50.289 GMT github.com/weaviate/weaviate/adapters/repos/db/inverted.(*BM25Searcher).createTerm(0xc01a309620, 0x4123e96e00000000, {0x0, 0x0}, {0xc01ddc75a7, 0x5}, {0xc01e5ca000, 0x18, 0x20?}, 0xc09d4331d0?, ...)
2024-01-31 10:57:50.289 GMT 	/go/src/github.com/weaviate/weaviate/adapters/repos/db/inverted/bm25_searcher.go:446 +0x126e
2024-01-31 10:57:50.289 GMT github.com/weaviate/weaviate/adapters/repos/db/inverted.(*BM25Searcher).wand.func1()
2024-01-31 10:57:50.289 GMT 	/go/src/github.com/weaviate/weaviate/adapters/repos/db/inverted/bm25_searcher.go:215 +0x1d9
2024-01-31 10:57:50.289 GMT golang.org/x/sync/errgroup.(*Group).Go.func1()
2024-01-31 10:57:50.289 GMT 	/go/pkg/mod/golang.org/x/sync@v0.3.0/errgroup/errgroup.go:75 +0x56
2024-01-31 10:57:50.289 GMT created by golang.org/x/sync/errgroup.(*Group).Go in goroutine 18453453
2024-01-31 10:57:50.289 GMT 	/go/pkg/mod/golang.org/x/sync@v0.3.0/errgroup/errgroup.go:72 +0x96
2024-01-31 10:57:51.000 GMT {action: bm25_search, class: Article, has_filter: false, level: error, msg: panic: runtime error: index out of range [118] with length 117, prop_names: […], query_term: tiles, shard: 7B4lDPm6kOGu}
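To see whether the panic is confined to one shard or one query term, the error entries can be tallied. Here is a small sketch, assuming the logs have been exported as one JSON object per line with the field names shown in the entry above (`action`, `level`, `class`, `shard`, `query_term`). The export format and the function name are assumptions; stack trace lines that are not JSON are skipped.

```python
import json
from collections import Counter


def count_bm25_panics(lines):
    """Count bm25_search error entries per (class, shard, query_term).

    Lines that are not JSON objects (for example stack trace lines)
    are ignored.
    """
    counts = Counter()
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(entry, dict):
            continue
        if entry.get("action") != "bm25_search" or entry.get("level") != "error":
            continue
        key = (entry.get("class"), entry.get("shard"), entry.get("query_term"))
        counts[key] += 1
    return counts
```

If every count lands on a single shard, that would at least suggest the problem is local to that shard's data rather than cluster-wide.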

Hi @jbendotnet !

That's right. This is an ongoing issue and our team is investigating.

For now, that PR will avoid the whole cluster crashing, while providing valuable information for the investigation.

We should have a fix for that soon :crossed_fingers:

Thanks for pointing it out!


This issue covers this, I believe: BM25 Index Corruption / BM25 not crash-resistant · Issue #4125 · weaviate/weaviate · GitHub
