Description
Hello!
I migrated some collections and used replication (with async replication enabled) for the first time. I waited a while and manually checked the shard object counts to verify that all my objects were migrated/replicated/synced, and after that I applied some simple environment variable changes to the Helm chart.
When the update rolled out, though, each pod, one at a time, became stuck in the Terminating state for several minutes. I checked the log stream and saw that the pod was still performing async replication checks while it was supposed to be terminating. The whole operation took 20+ minutes, and I wound up intervening to forcefully kill the pods.
Later on I did a k8s version update (for separate reasons), and the same issue occurred as k8s attempted to relocate the pods to the updated nodes.
Server Setup Information
- Weaviate Server Version: 1.26.5
- Deployment Method: k8s
- Number of Running Nodes: 5