We are testing a 3-node cluster (replication factor 3). All data was synchronized when one node suddenly failed physically. A new, empty pod was deployed on a new machine and started the “read repair” procedure (using batch reads), but it works too slowly, at about 50 objects/sec. How can we speed it up? The problem: when a search request (text or vector) is routed to the “empty” node (with either QUORUM or ALL consistency level), no valid result is available. It seems that the “empty” node explicitly looks for the data in itself first, does not find it, and the QUORUM/ALL conditions are not satisfied. As a result:
for a “property equal” search (getting objects by a list of IDs at once with “ContainsAny”), no results are returned;
for a vector search, neither the object itself nor its closest vectors are returned.
If the request is routed to any node that has the data, the search results are correct.
How can we exclude the “empty” (not yet replicated) node from operations or from the QUORUM/ALL conditions?
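To make the QUORUM/ALL arithmetic concrete, here is a minimal sketch of how many replicas have to answer at each consistency level. It assumes the usual majority definition of QUORUM (floor(n/2) + 1); the names `required_acks` and `can_satisfy` are made up for illustration and are not part of Weaviate or its Python client. Note that it only counts replicas that respond. It says nothing about whether a responding replica actually holds the data, which is the situation with the “empty” node here.

```python
from enum import Enum


class ConsistencyLevel(Enum):
    ONE = "ONE"
    QUORUM = "QUORUM"
    ALL = "ALL"


def required_acks(level: ConsistencyLevel, replication_factor: int) -> int:
    """Number of replicas that must answer (assumed majority rule for QUORUM)."""
    if level is ConsistencyLevel.ONE:
        return 1
    if level is ConsistencyLevel.QUORUM:
        return replication_factor // 2 + 1
    return replication_factor  # ALL: every replica must answer


def can_satisfy(level: ConsistencyLevel, replication_factor: int,
                responding_replicas: int) -> bool:
    """True if enough replicas respond to meet the consistency level."""
    return responding_replicas >= required_acks(level, replication_factor)


if __name__ == "__main__":
    for level in ConsistencyLevel:
        print(level.value, required_acks(level, replication_factor=3))
```

With a replication factor of 3, QUORUM needs two responding replicas and ALL needs all three, so under this assumption an empty but reachable node still counts toward both.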
Server Setup Information
Weaviate Server Version: 1.25.12
Deployment Method: k8s
Multi Node? Number of Running Nodes: Yes, 3
Client Language and Version: Python3, Python Client v3
Pods successfully started with the 1.26.3 image version;
Manually updated the schema with “asyncEnabled”:true;
The logs then showed async replication starting and crashing:
{"action":"async_replication","build_git_commit":"9a4ea6d","build_go_version":"go1.21.13","build_image_tag":"1.26.3","build_wv_version":"1.26.3","class_name":"MyClass","level":"info","msg":"hashtree initialization is progress…","object_count":14842232,"shard_name":"QZOXu2zY6rKM","time":"2024-09-05T09:49:23Z"}
{"action":"async_replication","build_git_commit":"9a4ea6d","build_go_version":"go1.21.13","build_image_tag":"1.26.3","build_wv_version":"1.26.3","class_name":"MyClass","level":"info","msg":"hashbeater stopped","shard_name":"b4ulnsitcsN2","time":"2024-09-05T09:49:24Z"}
{"build_git_commit":"9a4ea6d","build_go_version":"go1.21.13","build_image_tag":"1.26.3","build_wv_version":"1.26.3","level":"error","msg":"Recovered from panic: runtime error: index out of range [1] with length 1","time":"2024-09-05T09:49:24Z"}
goroutine 692758 [running]:
runtime/debug.Stack()
    /go/src/github.com/weaviate/weaviate/entities/errors/go_wrapper.go:26 +0x79
{"action":"async_replication","build_git_commit":"9a4ea6d","build_go_version":"go1.21.13","build_image_tag":"1.26.3","build_wv_version":"1.26.3","class_name":"MyClass","level":"info","msg":"hashbeater stopped","shard_name":"PfYchyFAyseL","time":"2024-09-05T09:49:24Z"}
{"build_git_commit":"9a4ea6d","build_go_version":"go1.21.13","build_image_tag":"1.26.3","build_wv_version":"1.26.3","level":"error","msg":"Recovered from panic: runtime error: index out of range [1] with length 1","time":"2024-09-05T09:49:24Z"}
goroutine 662155 [running]:
runtime/debug.Stack()
    /go/src/github.com/weaviate/weaviate/entities/errors/go_wrapper.go:26 +0x79
{"action":"async_replication","build_git_commit":"9a4ea6d","build_go_version":"go1.21.13","build_image_tag":"1.26.3","build_wv_version":"1.26.3","class_name":"MyClass","level":"info","msg":"hashtree initialization is progress…","object_count":15064448,"shard_name":"QZOXu2zY6rKM","time":"2024-09-05T09:49:24Z"}
{"action":"async_replication","build_git_commit":"9a4ea6d","build_go_version":"go1.21.13","build_image_tag":"1.26.3","build_wv_version":"1.26.3","class_name":"MyClass","level":"info","msg":"hashtree successfully initialized","shard_name":"QZOXu2zY6rKM","time":"2024-09-05T09:49:24Z"}
{"action":"async_replication","build_git_commit":"9a4ea6d","build_go_version":"go1.21.13","build_image_tag":"1.26.3","build_wv_version":"1.26.3","class_name":"MyClass","level":"info","msg":"hashbeater started…","shard_name":"QZOXu2zY6rKM","time":"2024-09-05T09:49:24Z"}
{"action":"async_replication","build_git_commit":"9a4ea6d","build_go_version":"go1.21.13","build_image_tag":"1.26.3","build_wv_version":"1.26.3","class_name":"MyClass","level":"info","msg":"hashbeater stopped","shard_name":"QZOXu2zY6rKM","time":"2024-09-05T09:49:28Z"}
{"build_git_commit":"9a4ea6d","build_go_version":"go1.21.13","build_image_tag":"1.26.3","build_wv_version":"1.26.3","level":"error","msg":"Recovered from panic: runtime error: index out of range [1] with length 1","time":"2024-09-05T09:49:28Z"}
goroutine 695187 [running]:
runtime/debug.Stack()
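Since the log lines run together when pasted, a small script may help pull out the JSON entries and count how often each message appears. This is only a sketch: it assumes the entries are JSON objects, possibly with typographic quotes introduced by copy/paste, and it skips stack-trace text in between. The helper names `iter_log_entries` and `summarize` are hypothetical, not part of any Weaviate tooling.

```python
import json
from collections import Counter

# Map typographic quotes (from copy/paste) back to plain double quotes.
QUOTE_MAP = str.maketrans({"\u201c": '"', "\u201d": '"'})


def iter_log_entries(raw_text):
    """Yield every JSON object found in raw_text, ignoring non-JSON text."""
    text = raw_text.translate(QUOTE_MAP)
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        start = text.find("{", pos)
        if start == -1:
            return
        try:
            entry, end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            pos = start + 1  # not valid JSON here, keep scanning
            continue
        if isinstance(entry, dict):
            yield entry
        pos = end


def summarize(raw_text):
    """Count entries by (level, msg)."""
    return Counter((e.get("level"), e.get("msg")) for e in iter_log_entries(raw_text))


if __name__ == "__main__":
    sample = (
        '{“level”:“info”,“msg”:“hashbeater stopped”,“shard_name”:“b4ulnsitcsN2”} '
        '{“level”:“error”,“msg”:“Recovered from panic: runtime error: '
        'index out of range [1] with length 1”} '
        'goroutine 692758 [running]: runtime/debug.Stack()'
    )
    for (level, msg), count in summarize(sample).items():
        print(count, level, msg)
```

Running it over the full pod log would show whether every “hashbeater stopped” entry is followed by the same panic message.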