Is Docker Compose the only way to run a multi-node deployment?

I want to use Docker to deploy Weaviate on two machines. On the node 1 machine I run this:

docker run --name weaviate-node1 \
-p 6080:8080  \
-p 6081:6081  \
-p 6082:6082  \
-e "AUTHENTICATION_APIKEY_ENABLED=true" \
-e "AUTHENTICATION_APIKEY_ALLOWED_KEYS=A1iB0t_d4m0" \
-e "AUTHENTICATION_APIKEY_USERS=admin" \
-e "PERSISTENCE_DATA_PATH=/root/weaviate/data" \
-e "AUTHORIZATION_ADMINLIST_USERS=admin" \
-e "CLUSTER_HOSTNAME=weaviate-node1" \
-e "CLUSTER_GOSSIP_BIND_PORT=6081" \
-e "CLUSTER_DATA_BIND_PORT=6082" \
-d semitechnologies/weaviate:1.20.5

From the logs, node 1 appears to have started successfully.

But on node 2 I run this:

docker run --name weaviate-node2 \
-p 7080:8080  \
-p 7081:7081  \
-p 7082:7082  \
-e "AUTHENTICATION_APIKEY_ENABLED=true" \
-e "AUTHENTICATION_APIKEY_ALLOWED_KEYS=A1iB0t_d4m0" \
-e "AUTHENTICATION_APIKEY_USERS=admin" \
-e "PERSISTENCE_DATA_PATH=/root/weaviate/data" \
-e "AUTHORIZATION_ADMINLIST_USERS=admin" \
-e "CLUSTER_HOSTNAME=weaviate-node2" \
-e "CLUSTER_GOSSIP_BIND_PORT=7081" \
-e "CLUSTER_DATA_BIND_PORT=7082" \
-e "CLUSTER_JOIN=weaviate-node1-IP:6081" \
-d semitechnologies/weaviate:1.20.5

and I get this error:

{"action":"startup","error":"could not load or initialize schema: sync schema with other nodes in the cluster: read schema: open transaction: broadcast open transaction: host \"192.168.42.4:6082\": send http request: Post \"http://192.168.42.4:6082/schema/transactions/\": dial tcp 192.168.42.4:6082: connect: connection refused","level":"fatal","msg":"could not initialize schema manager","time":"2023-08-10T07:01:33Z"}

192.168.42.4 is node 2's internal IP.

On node 1 I see the following logs:

{"level":"info","msg":" memberlist: Suspect weaviate-node2 has failed, no acks received","time":"2023-08-10T07:01:42Z"}
{"level":"info","msg":" memberlist: Marking weaviate-node2 as failed, suspect timeout reached (0 peer confirmations)","time":"2023-08-10T07:01:46Z"}
{"level":"info","msg":" memberlist: Suspect weaviate-node2 has failed, no acks received","time":"2023-08-10T07:01:51Z"}

What is wrong with these startup commands? Could you help me take a look? Thank you very much.

Hi @codehelen ! Welcome to our community :slight_smile:

This is most likely a networking issue along the way, as your docker commands are correct (see below).

A starting point is making sure that weaviate-node2-IP can communicate with weaviate-node1-IP on all specified ports.
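As a rough way to run that check, here is a minimal sketch that probes TCP ports from one machine to the other using bash's /dev/tcp redirection. The function names `port_open` and `check_ports` are made up for this example, and the host and ports you pass in are whatever you published with -p. It only tests TCP; the memberlist gossip port normally also uses UDP, which this sketch does not cover.

```shell
#!/usr/bin/env bash
# Sketch: probe TCP ports on a remote host (function names are hypothetical).
# Usage: ./check_ports.sh <node1-ip> 6080 6081 6082

# Succeeds if a TCP connection to host:port can be opened within 3 seconds.
port_open() {
  local host=$1 port=$2
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Prints one line per port: "<host>:<port> open" or "<host>:<port> closed".
check_ports() {
  local host=$1 port
  shift
  for port in "$@"; do
    if port_open "$host" "$port"; then
      echo "${host}:${port} open"
    else
      echo "${host}:${port} closed"
    fi
  done
}

# Only run when a host and at least one port are given on the command line.
if [ "$#" -gt 1 ]; then
  check_ports "$@"
fi
```

Run it from node 2 against node 1's address, and again from node 1 against node 2's address. A port reported as closed points at a firewall, a missing -p mapping, or an address that is not reachable from the other machine.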

Here is an example of how to run with ad-hoc containers (on the same Docker host):

First, create an attachable Docker network:

docker network create --attachable weaviate

Note that weaviate, above, can be whatever network name you want. Make sure to change it accordingly in the --network option, on both nodes, below.

Run node 1

docker run --name weaviate-node1 \
-p 6080:8080  \
-p 6081:6081  \
-p 6082:6082  \
--network=weaviate \
-e "AUTHENTICATION_APIKEY_ENABLED=true" \
-e "AUTHENTICATION_APIKEY_ALLOWED_KEYS=A1iB0t_d4m0" \
-e "AUTHENTICATION_APIKEY_USERS=admin" \
-e "PERSISTENCE_DATA_PATH=/root/weaviate/data" \
-e "AUTHORIZATION_ADMINLIST_USERS=admin" \
-e "CLUSTER_HOSTNAME=weaviate-node1" \
-e "CLUSTER_GOSSIP_BIND_PORT=6081" \
-e "CLUSTER_DATA_BIND_PORT=6082" \
-d semitechnologies/weaviate:1.20.5

Run node 2

docker run --name weaviate-node2 \
-p 7080:8080  \
-p 7081:7081  \
-p 7082:7082  \
--network=weaviate \
-e "AUTHENTICATION_APIKEY_ENABLED=true" \
-e "AUTHENTICATION_APIKEY_ALLOWED_KEYS=A1iB0t_d4m0" \
-e "AUTHENTICATION_APIKEY_USERS=admin" \
-e "PERSISTENCE_DATA_PATH=/root/weaviate/data" \
-e "AUTHORIZATION_ADMINLIST_USERS=admin" \
-e "CLUSTER_HOSTNAME=weaviate-node2" \
-e "CLUSTER_GOSSIP_BIND_PORT=7081" \
-e "CLUSTER_DATA_BIND_PORT=7082" \
-e "CLUSTER_JOIN=weaviate-node1:6081" \
-d semitechnologies/weaviate:1.20.5

Now you can check your two nodes up and running:

curl http://localhost:6080/v1/nodes -H "Authorization: Bearer A1iB0t_d4m0" | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   290  100   290    0     0   184k      0 --:--:-- --:--:-- --:--:--  283k
{
  "nodes": [
    {
      "gitHash": "f7c148e",
      "name": "weaviate-node1",
      "shards": null,
      "stats": {
        "objectCount": 0,
        "shardCount": 0
      },
      "status": "HEALTHY",
      "version": "1.20.5"
    },
    {
      "gitHash": "f7c148e",
      "name": "weaviate-node2",
      "shards": null,
      "stats": {
        "objectCount": 0,
        "shardCount": 0
      },
      "status": "HEALTHY",
      "version": "1.20.5"
    }
  ]
}

Let me know if this helps!

Thanks!

Hi @DudaNogueira, thank you very much for your reply and suggestions. Eventually I switched environments and completed the cluster deployment on two separate machines.
I also found the reason why the node was not joining:
my cluster's intranet IP is 11.164.61.50, and this IP is filtered out by the IfByRFC method.
Could the IP filtering rules be made configurable through a parameter? An intranet IP like this one may not follow the standard ranges.
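To illustrate why an address like 11.164.61.50 can be rejected, here is a small sketch that checks whether an IPv4 address falls in the RFC 1918 private ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16). This is an assumption about what the filter is looking for; the exact set of ranges that IfByRFC accepts may be wider. The function name `is_rfc1918` is made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: test whether a dotted-quad IPv4 address is in an RFC 1918 range.
# Assumption: the cluster's IP filter only accepts "private" addresses.

is_rfc1918() {
  local ip=$1 first rest second
  first=${ip%%.*}      # first octet
  rest=${ip#*.}        # everything after the first dot
  second=${rest%%.*}   # second octet
  case $first in
    10)  return 0 ;;                                      # 10.0.0.0/8
    172) [ "$second" -ge 16 ] && [ "$second" -le 31 ] ;;  # 172.16.0.0/12
    192) [ "$second" -eq 168 ] ;;                         # 192.168.0.0/16
    *)   return 1 ;;
  esac
}

# Addresses that appear in this thread.
for ip in 192.168.42.4 172.22.0.3 11.164.61.50; do
  if is_rfc1918 "$ip"; then
    echo "$ip private"
  else
    echo "$ip not private"
  fi
done
# prints:
# 192.168.42.4 private
# 172.22.0.3 private
# 11.164.61.50 not private
```

11.0.0.0/8 is ordinary public address space, so even when it is used as an intranet range, a filter that only accepts private addresses will skip it.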
I have a few more questions and I look forward to hearing from you

  1. Weaviate version 1.21 has been released. When will the Java client support it? We really want to use the ContainsAny syntax.
  2. Can the Weaviate Java client connect to multiple nodes simultaneously? Currently, a client can only connect to one node. If that node goes offline, the application cannot automatically connect to the other nodes.

Can you tell me in detail how you solved it?

Hi! Sorry, @codehelen I missed this second round of questions :frowning:

Glad you solved this.

For your questions:

1 - You should be able to use ContainsAny/All with the Java client already!
2 - I believe this will need a load balancer so you can control it, as it will depend on how you will expose your cluster to your clients.
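Until a load balancer is in place, a simple client-side fallback can be sketched like this: probe each node's readiness endpoint (/v1/.well-known/ready) and use the first node that answers. The function names `probe_ready` and `first_ready_node` and the `PROBE_CMD` override are made up for this example, and the node URLs are placeholders. Treat it as an illustration of the idea, not a replacement for a proper load balancer.

```shell
#!/usr/bin/env bash
# Sketch: pick the first Weaviate node that reports ready.
# All names here are hypothetical; adjust URLs to your own nodes.

# Probe one node's readiness endpoint; succeeds on an HTTP 2xx response.
probe_ready() {
  curl -fsS --max-time 3 -o /dev/null "$1/v1/.well-known/ready"
}

# Print the first base URL whose probe succeeds; fail if none respond.
# PROBE_CMD lets you swap in a different check.
first_ready_node() {
  local base_url
  for base_url in "$@"; do
    if "${PROBE_CMD:-probe_ready}" "$base_url" 2>/dev/null; then
      echo "$base_url"
      return 0
    fi
  done
  return 1
}

# Example (placeholder addresses and key):
#   node=$(first_ready_node http://<node1-ip>:6080 http://<node2-ip>:7080) \
#     && curl "$node/v1/nodes" -H "Authorization: Bearer <api-key>"
```

The application would call `first_ready_node` before opening a client connection and again when a request fails, so a node going offline only costs one retry.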

Hi, I’m in the same trouble as you.
I’m planning to deploy two Weaviate nodes on two physical machines. The master starts successfully, but the slave node fails with an error:
dial tcp 172.22.0.3:7101: connect: connection refused","level":"fatal","msg":"could not initialize schema manager

It seems the slave cannot connect to the master’s IP on port 7101, and I’m sure these machines can reach each other.
The reply given above does not work for me. I really want to know how to fix the error. Thank you very much.


Have you figured out how to solve the error?