In short, if anyone finds this issue occurring:
Weaviate images are built on Alpine Linux, which uses the musl DNS resolver rather than the more common glibc-based one. musl can behave unexpectedly when DNS in the K8s environment isn't configured with it in mind.
The solution is for Weaviate to use only FQDNs for communication between nodes, which in principle means changing all the CLUSTER_JOIN environment variables to weaviate-headless.{{release.namespace}}.svc.cluster.local. (including the final dot). The extra . on the end is the fix: it marks the name as absolute, so the resolver queries it as written instead of appending search domains first.
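For anyone wondering why a single trailing dot matters, here is a rough Python sketch of how a stub resolver picks the names to query, following the general resolv.conf conventions (search list plus ndots). It is a simplified model, not musl's actual code, and musl differs from glibc in some details. The function name candidate_queries, the namespace "weaviate", and the search domains are assumptions for illustration; ndots=5 is the usual Kubernetes default.

```python
def candidate_queries(name, search_domains, ndots=5):
    """Return the names a stub resolver would try, in order (simplified model)."""
    if name.endswith("."):
        # Absolute name: the search list is never consulted.
        return [name]
    searched = [f"{name}.{domain}." for domain in search_domains]
    absolute = name + "."
    if name.count(".") >= ndots:
        # Enough dots: try the name as given first, then the search list.
        return [absolute] + searched
    # Too few dots: walk the search list first, then try the name as given.
    return searched + [absolute]


# Hypothetical search list for a pod in a namespace called "weaviate".
SEARCH = ["weaviate.svc.cluster.local", "svc.cluster.local", "cluster.local"]

without_dot = candidate_queries("weaviate-headless.weaviate.svc.cluster.local", SEARCH)
with_dot = candidate_queries("weaviate-headless.weaviate.svc.cluster.local.", SEARCH)

for label, names in (("without trailing dot", without_dot), ("with trailing dot", with_dot)):
    print(label)
    for n in names:
        print("  ", n)
```

Without the dot, the name has only four dots, so under this model the resolver tries the search-domain variants before the real name. With the dot there is exactly one query, which leaves far less room for the resolver to get it wrong.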