Hi @DudaNogueira,
I am still facing issues. Could you please check whether I did anything wrong in the Docker configuration?
Description:
I am currently configuring a Weaviate cluster with one master node and two worker nodes, but I am encountering issues with node communication. Below are the details of my setup and the errors I am seeing.
Master Node Configuration:
version: '3.7'
services:
  weaviate-node-1:
    command:
      - --host
      - 0.0.0.0
      - --port
      - '8080'
      - --scheme
      - http
    image: cr.weaviate.io/semitechnologies/weaviate:1.26.1
    ports:
      - 8080:8080
      - 6060:6060
      - 50051:50051
      - 7100:7100
      - 7101:7101
      - 8300:8300
    restart: on-failure:0
    volumes:
      - ./data-node-1:/var/lib/weaviate
    environment:
      LOG_LEVEL: 'debug'
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      ENABLE_MODULES: 'text2vec-openai,text2vec-cohere,text2vec-huggingface,text2vec-ollama,generative-ollama'
      DEFAULT_VECTORIZER_MODULE: 'none'
      CLUSTER_HOSTNAME: 'node1'
      CLUSTER_GOSSIP_BIND_PORT: '7100'
      CLUSTER_DATA_BIND_PORT: '7101'
      RAFT_JOIN: '192.168.1.52:8300,192.168.1.23:8300,192.168.1.24:8300'
      RAFT_BOOTSTRAP_EXPECT: 3
Master Node Error Log:
{"action":"raft-net","error":"unknown rpc type 255","level":"error","msg":"raft-net failed to decode incoming command","time":"2024-08-10T06:37:12Z"}
{"action":"restapi_request","level":"debug","method":"GET","msg":"received HTTP request","time":"2024-08-10T06:37:21Z","url":{"Scheme":"","Opaque":"","User":null,"Host":"","Path":"/metrics","RawPath":"","OmitHost":false,"ForceQuery":false,"RawQuery":"","Fragment":"","RawFragment":""}}
{"action":"restapi_request","level":"debug","method":"GET","msg":"received HTTP request","time":"2024-08-10T06:37:31Z","url":{"Scheme":"","Opaque":"","User":null,"Host":"","Path":"/metrics","RawPath":"","OmitHost":false,"ForceQuery":false,"RawQuery":"","Fragment":"","RawFragment":""}}
{"level":"debug","msg":" memberlist: Stream connection from=192.168.1.23:36206","time":"2024-08-10T06:37:40Z"}
{"level":"debug","msg":" memberlist: Failed UDP ping: node2 (timeout reached)","time":"2024-08-10T06:37:41Z"}
{"action":"restapi_request","level":"debug","method":"GET","msg":"received HTTP request","time":"2024-08-10T06:37:41Z","url":{"Scheme":"","Opaque":"","User":null,"Host":"","Path":"/metrics","RawPath":"","OmitHost":false,"ForceQuery":false,"RawQuery":"","Fragment":"","RawFragment":""}}
{"level":"info","msg":" memberlist: Suspect node2 has failed, no acks received","time":"2024-08-10T06:37:41Z"}
{"level":"debug","msg":" memberlist: Failed UDP ping: node2 (timeout reached)","time":"2024-08-10T06:37:43Z"}
{"level":"info","msg":" memberlist: Suspect node2 has failed, no acks received","time":"2024-08-10T06:37:44Z"}
{"level":"info","msg":" memberlist: Marking node2 as failed, suspect timeout reached (0 peer confirmations)","time":"2024-08-10T06:37:45Z"}
{"level":"debug","msg":" memberlist: Failed UDP ping: node2 (timeout reached)","time":"2024-08-10T06:37:46Z"}
{"level":"info","msg":" memberlist: Suspect node2 has failed, no acks received","time":"2024-08-10T06:37:48Z"}
{"action":"restapi_request","level":"debug","method":"GET","msg":"received HTTP request","time":"2024-08-10T06:37:51Z","url":{"Scheme":"","Opaque":"","User":null,"Host":"","Path":"/metrics","RawPath":"","OmitHost":false,"ForceQuery":false,"RawQuery":"","Fragment":"","RawFragment":""}}
{"action":"restapi_request","level":"debug","method":"GET","msg":"received HTTP request","time":"2024-08-10T06:38:01Z","url":{"Scheme":"","Opaque":"","User":null,"Host":"","Path":"/metrics","RawPath":"","OmitHost":false,"ForceQuery":false,"RawQuery":"","Fragment":"","RawFragment":""}}
{"action":"restapi_request","level":"debug","method":"GET","msg":"received HTTP request","time":"2024-08-10T06:38:11Z","url":{"Scheme":"","Opaque":"","User":null,"Host":"","Path":"/metrics","RawPath":"","OmitHost":false,"ForceQuery":false,"RawQuery":"","Fragment":"","RawFragment":""}}
Worker Node Configuration:
version: '3.7'
services:
  weaviate-node-2:
    init: true
    command:
      - --host
      - 0.0.0.0
      - --port
      - '8080'
      - --scheme
      - http
    image: cr.weaviate.io/semitechnologies/weaviate:1.26.1
    ports:
      - 8081:8080
      - 6061:6060
      - 50052:50051
      - 7102:7102
      - 7103:7103
      - 8300:8300
    restart: on-failure:0
    volumes:
      - ./data-node-2:/var/lib/weaviate
    environment:
      LOG_LEVEL: 'debug'
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      ENABLE_MODULES: 'text2vec-openai,text2vec-cohere,text2vec-huggingface,text2vec-ollama,generative-ollama'
      DEFAULT_VECTORIZER_MODULE: 'none'
      CLUSTER_HOSTNAME: 'node2'
      CLUSTER_GOSSIP_BIND_PORT: '7102'
      CLUSTER_DATA_BIND_PORT: '7103'
      CLUSTER_JOIN: '192.168.1.52:7100'
      RAFT_JOIN: '192.168.1.52:8300,192.168.1.23:8300,192.168.1.24:8300'
      RAFT_BOOTSTRAP_EXPECT: 3
Worker Node Error Log:
{"action":"inverted filter2search migration","level":"debug","msg":"starting switching fallback mode","time":"2024-08-10T06:37:42Z"}
{"action":"inverted filter2search migration","level":"debug","msg":"no missing filterable indexes, fallback mode skipped","time":"2024-08-10T06:37:42Z"}
{"docker_image_tag":"1.26.1","level":"info","msg":"configured versions","server_version":"1.26.1","time":"2024-08-10T06:37:42Z"}
{"action":"grpc_startup","level":"info","msg":"grpc server listening at [::]:50051","time":"2024-08-10T06:37:42Z"}
{"address":"172.20.0.2:8300","level":"info","msg":"current Leader","time":"2024-08-10T06:37:42Z"}
{"level":"info","msg":"starting migration from old schema","time":"2024-08-10T06:37:42Z"}
{"level":"info","msg":"legacy schema is empty, nothing to migrate","time":"2024-08-10T06:37:42Z"}
{"level":"info","msg":"migration from the old schema has been successfully completed","time":"2024-08-10T06:37:42Z"}
{"action":"restapi_management","docker_image_tag":"1.26.1","level":"info","msg":"Serving weaviate at http://[::]:8080","time":"2024-08-10T06:37:42Z"}
{"action":"telemetry_push","level":"info","msg":"telemetry started","payload":"\u0026{MachineID:b6f038ed-5bac-4f0d-8b9e-be97ac935689 Type:INIT Version:1.26.1 NumObjects:0 OS:linux Arch:amd64 UsedModules:}","time":"2024-08-10T06:37:42Z"}
{"level":"debug","msg":" memberlist: Failed UDP ping: node1 (timeout reached)","time":"2024-08-10T06:37:43Z"}
{"level":"info","msg":" memberlist: Suspect node1 has failed, no acks received","time":"2024-08-10T06:37:45Z"}
{"level":"info","msg":" memberlist: Marking node1 as failed, suspect timeout reached (0 peer confirmations)","time":"2024-08-10T06:37:46Z"}
{"level":"debug","msg":" memberlist: Failed UDP ping: node1 (timeout reached)","time":"2024-08-10T06:37:46Z"}
{"level":"info","msg":" memberlist: Suspect node1 has failed, no acks received","time":"2024-08-10T06:37:49Z"}
Connectivity is available. This telnet check was run from node 2 to node 1:
telnet 192.168.1.52 8300
Trying 192.168.1.52...
Connected to 192.168.1.52.
Escape character is '^]'.
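The telnet check above only covers TCP on port 8300. As a rough sketch, the same kind of check could be repeated for the other node 1 ports published in the compose file. The helper name `check_tcp` and the `TARGETS` list are illustrative, not part of Weaviate; the host and ports are taken from the master node configuration above. Note that this only tests TCP, so it says nothing about the UDP pings that the memberlist log lines report as failing.

```python
import socket

# Illustrative target list: node 1's address with the gossip, data and Raft
# ports from the master node compose file above.
TARGETS = [
    ("192.168.1.52", 7100),  # CLUSTER_GOSSIP_BIND_PORT
    ("192.168.1.52", 7101),  # CLUSTER_DATA_BIND_PORT
    ("192.168.1.52", 8300),  # Raft port used in RAFT_JOIN
]


def check_tcp(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    for host, port in TARGETS:
        status = "open" if check_tcp(host, port) else "unreachable"
        print(f"{host}:{port} (tcp) {status}")
```

Running this from node 2 prints one line per port, which makes it easier to see whether only some of the published ports are reachable.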