Thank you for the reply. The documentation you referenced states that the CLUSTER_JOIN variable should only be set for nodes other than the founding node. However, my issue is that the founding node does not reconnect to the cluster after a restart. The rest of the nodes are functioning as expected.
Below is the configuration of environment variables we use for all nodes (we are deploying using Pulumi):
    environment: [
      { name: 'QUERY_DEFAULTS_LIMIT', value: '25' },
      { name: 'AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED', value: 'true' },
      { name: 'PERSISTENCE_DATA_PATH', value: '/var/lib/weaviate' },
      { name: 'DEFAULT_VECTORIZER_MODULE', value: 'none' },
      { name: 'ENABLE_API_BASED_MODULES', value: 'true' },
      { name: 'LOG_LEVEL', value: 'debug' },
      { name: 'CLUSTER_HOSTNAME', value: nodeName },
      { name: 'CLUSTER_GOSSIP_BIND_PORT', value: gossipBindPort },
      { name: 'CLUSTER_DATA_BIND_PORT', value: dataBindPort },
      { name: 'RAFT_JOIN', value: nodeJoinList }, // "node1,node2,node3"
      {
        name: 'RAFT_BOOTSTRAP_EXPECT',
        value: WeaviateService.NODE_NAMES.length.toString(),
      },
      {
        name: 'REPLICATION_MINIMUM_FACTOR',
        value: WeaviateService.NODE_NAMES.length.toString(),
      },
      { name: 'DEFAULT_SHARD_COUNT', value: '1' },
      // CLUSTER_JOIN is set on every node except the founding node
      ...(isFirstNode
        ? []
        : [
            {
              name: 'CLUSTER_JOIN',
              value: `node1.weaviate.local:${WeaviateService.BASE_GOSSIP_PORT}`,
            },
          ]),
    ],
We are using AWS Cloud Map to ensure proper IP resolution between nodes. This is why node1.weaviate.local:${WeaviateService.BASE_GOSSIP_PORT} is used for the CLUSTER_JOIN variable.
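
To make the per-node difference explicit, here is a minimal, self-contained sketch of how the cluster-related variables resolve for each node. It is not our actual Pulumi code: `NODE_NAMES`, `BASE_GOSSIP_PORT` (the value 7100 is only a placeholder) and the helper `buildClusterEnv` are hypothetical stand-ins, and it assumes `isFirstNode` is true only for node1.

```typescript
// Hypothetical stand-ins for WeaviateService.NODE_NAMES / BASE_GOSSIP_PORT.
const NODE_NAMES: string[] = ['node1', 'node2', 'node3'];
const BASE_GOSSIP_PORT = 7100; // placeholder value, not taken from our setup

interface EnvVar {
  name: string;
  value: string;
}

// Builds only the cluster-related variables for one node (hypothetical helper).
function buildClusterEnv(nodeName: string): EnvVar[] {
  // Assumption: the founding node is the first entry in NODE_NAMES.
  const isFirstNode = nodeName === NODE_NAMES[0];

  const env: EnvVar[] = [
    { name: 'CLUSTER_HOSTNAME', value: nodeName },
    { name: 'RAFT_JOIN', value: NODE_NAMES.join(',') },
    { name: 'RAFT_BOOTSTRAP_EXPECT', value: NODE_NAMES.length.toString() },
  ];

  // CLUSTER_JOIN is added for every node except the founding node.
  if (!isFirstNode) {
    env.push({
      name: 'CLUSTER_JOIN',
      value: `node1.weaviate.local:${BASE_GOSSIP_PORT}`,
    });
  }
  return env;
}

// Print which nodes end up with a CLUSTER_JOIN entry.
for (const name of NODE_NAMES) {
  const join = buildClusterEnv(name).find((v) => v.name === 'CLUSTER_JOIN');
  console.log(name, join ? join.value : '(no CLUSTER_JOIN)');
}
```

Under these assumptions node1 is the only node started without a CLUSTER_JOIN entry, and it is also the node that fails to reconnect after a restart.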