there are no special steps required. What I’ve discovered is that when using persistent storage (either a Docker volume or a bound local directory), the database only works correctly with the original container it was initialized with.
It appears that some crucial data is stored outside the designated persistent data directory. As a result, simply reusing the same volume with a new container — even if it’s based on the same image version — leads to issues. The code I posted in my initial message will then return False.
To reproduce the issue, just delete the original container, spin up a new one with the same image version, and mount the previously used volume. The problem should reoccur.
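For anyone who wants to script the reproduction, here is a minimal sketch that only builds the Docker commands and does not run them. The image tag is taken from the logs in this thread; the container names, the volume name `weaviate_data`, the published ports, and the data path `/var/lib/weaviate` are assumptions, so adjust them to your own setup.

```python
import shlex

IMAGE = "cr.weaviate.io/semitechnologies/weaviate:1.29.2"  # version from the logs
VOLUME = "weaviate_data"         # hypothetical volume name
DATA_PATH = "/var/lib/weaviate"  # assumed data path inside the container


def run_cmd(name: str) -> list[str]:
    """Build a `docker run` command for a container that mounts VOLUME."""
    return [
        "docker", "run", "-d", "--name", name,
        "-p", "8080:8080", "-p", "50051:50051",
        "-v", f"{VOLUME}:{DATA_PATH}",
        "-e", f"PERSISTENCE_DATA_PATH={DATA_PATH}",
        IMAGE,
    ]


def reproduce_steps() -> list[list[str]]:
    """Commands for: first container, delete it, second container on the same volume."""
    return [
        run_cmd("weaviate_first"),
        ["docker", "rm", "-f", "weaviate_first"],
        run_cmd("weaviate_second"),
    ]


if __name__ == "__main__":
    for step in reproduce_steps():
        print(shlex.join(step))
```

Import some data after the first command and before the second one, otherwise there is nothing on the volume to check against the new container.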
I haven’t seen any error messages or hints indicating what exactly went wrong. But to be fair, I’m not an expert when it comes to digging into Docker containers.
I compared both containers and eventually noticed a difference. The second container — the one I created later and attached to the data directory of the first — shows recurring error messages in its Docker log.
{"build_git_commit":"927897e","build_go_version":"go1.22.12","build_image_tag":"v1.29.2","build_wv_version":"1.29.2","level":"info","msg":"attempting to join","remoteNodes":{"70cf1899cd62":"172.17.0.4:8300"},"time":"2025-03-31T18:28:23Z"}
{"build_git_commit":"927897e","build_go_version":"go1.22.12","build_image_tag":"v1.29.2","build_wv_version":"1.29.2","level":"info","msg":"attempted to join and failed","remoteNode":"172.17.0.4:8300","status":8,"time":"2025-03-31T18:28:23Z"}
{"build_git_commit":"927897e","build_go_version":"go1.22.12","build_image_tag":"v1.29.2","build_wv_version":"1.29.2","level":"info","msg":"attempting to join","remoteNodes":{"70cf1899cd62":"172.17.0.4:8300"},"time":"2025-03-31T18:28:24Z"}
{"build_git_commit":"927897e","build_go_version":"go1.22.12","build_image_tag":"v1.29.2","build_wv_version":"1.29.2","level":"info","msg":"attempted to join and failed","remoteNode":"172.17.0.4:8300","status":8,"time":"2025-03-31T18:28:24Z"}
…
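To make the repeating entries easier to read, here is a small sketch that summarizes them. It assumes the container log is one JSON object per line, as in the excerpt above; the function names are my own. The node name in `remoteNodes` (`70cf1899cd62`) looks like a Docker-generated container hostname, which may mean the stored cluster state still points at the original container, but that is only an inference from the log.

```python
import json
from collections import Counter


def _entries(log_text: str):
    """Yield parsed JSON log entries, skipping lines that are not JSON."""
    for line in log_text.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue
        # Forum software may turn straight quotes into curly ones; undo that.
        line = line.replace("\u201c", '"').replace("\u201d", '"')
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            continue


def failed_joins(log_text: str) -> Counter:
    """Count 'attempted to join and failed' entries per remote address."""
    counts: Counter = Counter()
    for entry in _entries(log_text):
        if entry.get("msg") == "attempted to join and failed":
            counts[entry.get("remoteNode", "unknown")] += 1
    return counts


def join_targets(log_text: str) -> dict:
    """Collect node name -> address pairs the instance is trying to join."""
    targets: dict = {}
    for entry in _entries(log_text):
        if entry.get("msg") == "attempting to join":
            targets.update(entry.get("remoteNodes", {}))
    return targets
```

Feed it the output of `docker logs <container>` saved to a file or captured as a string. If `join_targets` reports a node name that does not match the hostname of the running container, that is worth mentioning when reporting the problem.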
But ultimately I don’t want to run two instances at the same time. What if I set up a new instance for an update or some other reason? I don’t want to vectorize all my data again; I definitely want to reuse my old data directory.
If you want to migrate your data from a single-node cluster to a multi-node cluster,
you will need to spin up the new cluster first, create the collection accordingly (making sure to set the replication factor), and then migrate your data over.
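The copy step could look roughly like the sketch below. It assumes the Weaviate Python client v4, where `source` and `target` are collection handles (for example from `client.collections.get(...)`), and it assumes each object has a single unnamed vector stored under the key `"default"`. The function name `migrate` is hypothetical, and the target collection must already exist with the replication factor you want.

```python
def migrate(source, target, batch_size: int = 100) -> int:
    """Copy every object (properties, vector, UUID) from source to target.

    Both arguments are assumed to be v4 client collection handles.
    Returns the number of objects handed to the batch.
    """
    copied = 0
    with target.batch.fixed_size(batch_size=batch_size) as batch:
        # include_vector=True returns the stored vectors, so nothing
        # has to be vectorized again on the new cluster.
        for obj in source.iterator(include_vector=True):
            batch.add_object(
                properties=obj.properties,
                vector=obj.vector["default"],  # assumes one unnamed vector
                uuid=obj.uuid,  # keep the same IDs on the new cluster
            )
            copied += 1
    return copied
```

Because the existing vectors are copied along with the properties, the data does not need to be re-embedded. Collections with named vectors or multi-tenancy would need extra handling that this sketch leaves out.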