Data replication issue

Description

I’ve successfully set up Weaviate nodes across three different servers, and all nodes show a healthy connection. Below are the results of my v1/nodes endpoint, as well as the schema configuration.

However, when I attempt to create an index on the original node, the other two nodes do not create the index. I also tried manually creating the index individually and storing data on the main node, but the data is not being shared across the nodes. Could you help me identify what might be missing in my configuration?

Server Setup Information

  • Weaviate Version: v1.26.1
  • Deployment Method: docker
  • Multi Node? Number of Running Nodes: 3
  • Client Language and Version: Python client v4
  • Multitenancy?:

nodes

{
    "nodes": [
        {
            "batchStats": {
                "queueLength": 0,
                "ratePerSecond": 12
            },
            "gitHash": "6fd2432",
            "name": "node1",
            "shards": null,
            "status": "HEALTHY",
            "version": "1.26.1"
        },
        {
            "batchStats": {
                "queueLength": 0,
                "ratePerSecond": 12
            },
            "gitHash": "6fd2432",
            "name": "node2",
            "shards": null,
            "status": "HEALTHY",
            "version": "1.26.1"
        },
        {
            "batchStats": {
                "queueLength": 0,
                "ratePerSecond": 0
            },
            "gitHash": "6fd2432",
            "name": "node3",
            "shards": null,
            "status": "HEALTHY",
            "version": "1.26.1"
        }
    ]
}
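
In the default (minimal) output of /v1/nodes, "shards" is null, so the response above does not show where shards actually live. A minimal sketch, assuming the node is reachable at http://localhost:8080 (an assumption, adjust to your host) and using the `output=verbose` query parameter, that counts the shards each node reports. The helper names `shard_counts` and `fetch_nodes` are made up for this example:

```python
import json
from urllib.request import urlopen


def shard_counts(payload):
    """Map each node name to the number of shards it reports.

    A null or missing "shards" field is counted as 0.
    """
    return {
        node["name"]: len(node.get("shards") or [])
        for node in payload.get("nodes", [])
    }


def fetch_nodes(base_url="http://localhost:8080"):
    """Fetch the verbose node status from one Weaviate node (URL is an assumption)."""
    with urlopen(f"{base_url}/v1/nodes?output=verbose") as resp:
        return json.load(resp)


# Example usage against a running node:
# print(shard_counts(fetch_nodes("http://localhost:8080")))
```

If a collection with replication factor 3 exists, each of the three nodes would be expected to report at least one shard for it.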

schema configuration

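import weaviate.classes as wvc

collection = client.collections.create(
    collection_name,
    vectorizer_config=wvc.config.Configure.Vectorizer.text2vec_ollama(
        api_endpoint=model_endpoint,
        model=vectorizer_model,
    ),
    # Keep one replica of every shard on each of the three nodes
    replication_config=wvc.config.Configure.replication(
        factor=3,
    ),
    properties=[my_properties],  # placeholder for my actual properties
)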

The replication and sharding settings in my index schema:

"replicationConfig": {
    "asyncEnabled": false,
    "factor": 3
},
"shardingConfig": {
    "actualCount": 3,
    "actualVirtualCount": 384,
    "desiredCount": 3,
    "desiredVirtualCount": 384,
    "function": "murmur3",
    "key": "_id",
    "strategy": "hash",
    "virtualPerPhysical": 128
},

Hi @Mariam!

Do you see anything out of the ordinary in the logs?

Your configuration seems fine, so it should be replicating to all nodes.

Here are my logs:

{"level":"debug","msg":" memberlist: Initiating push/pull sync with: node3 192.168.1.24:7100","time":"2024-08-26T15:06:10Z"}
{"action":"restapi_request","level":"debug","method":"GET","msg":"received HTTP request","time":"2024-08-26T15:06:11Z","url":{"Scheme":"","Opaque":"","User":null,"Host":"","Path":"/metrics","RawPath":"","OmitHost":false,"ForceQuery":false,"RawQuery":"","Fragment":"","RawFragment":""}}
{"action":"restapi_request","level":"debug","method":"GET","msg":"received HTTP request","time":"2024-08-26T15:06:21Z","url":{"Scheme":"","Opaque":"","User":null,"Host":"","Path":"/metrics","RawPath":"","OmitHost":false,"ForceQuery":false,"RawQuery":"","Fragment":"","RawFragment":""}}
{"action":"restapi_request","level":"debug","method":"GET","msg":"received HTTP request","time":"2024-08-26T15:06:31Z","url":{"Scheme":"","Opaque":"","User":null,"Host":"","Path":"/metrics","RawPath":"","OmitHost":false,"ForceQuery":false,"RawQuery":"","Fragment":"","RawFragment":""}}
{"level":"debug","msg":" memberlist: Initiating push/pull sync with: node2 192.168.1.23:7100","time":"2024-08-26T15:06:40Z"}
{"action":"restapi_request","level":"debug","method":"GET","msg":"received HTTP request","time":"2024-08-26T15:06:41Z","url":{"Scheme":"","Opaque":"","User":null,"Host":"","Path":"/metrics","RawPath":"","OmitHost":false,"ForceQuery":false,"RawQuery":"","Fragment":"","RawFragment":""}}
{"action":"restapi_request","level":"debug","method":"GET","msg":"received HTTP request","time":"2024-08-26T15:06:51Z","url":{"Scheme":"","Opaque":"","User":null,"Host":"","Path":"/metrics","RawPath":"","OmitHost":false,"ForceQuery":false,"RawQuery":"","Fragment":"","RawFragment":""}}
{"action":"restapi_request","level":"debug","method":"GET","msg":"received HTTP request","time":"2024-08-26T15:07:01Z","url":{"Scheme":"","Opaque":"","User":null,"Host":"","Path":"/metrics","RawPath":"","OmitHost":false,"ForceQuery":false,"RawQuery":"","Fragment":"","RawFragment":""}}
{"level":"debug","msg":" memberlist: Initiating push/pull sync with: node3 192.168.1.24:7100","time":"2024-08-26T15:07:10Z"}
{"action":"restapi_request","level":"debug","method":"GET","msg":"received HTTP request","time":"2024-08-26T15:07:11Z","url":{"Scheme":"","Opaque":"","User":null,"Host":"","Path":"/metrics","RawPath":"","OmitHost":false,"ForceQuery":false,"RawQuery":"","Fragment":"","RawFragment":""}}
{"level":"debug","msg":" memberlist: Stream connection from=192.168.1.24:58996","time":"2024-08-26T15:07:14Z"}
{"level":"debug","msg":" memberlist: Stream connection from=192.168.1.23:58336","time":"2024-08-26T15:07:18Z"}
{"action":"restapi_request","level":"debug","method":"GET","msg":"received HTTP request","time":"2024-08-26T15:07:21Z","url":{"Scheme":"","Opaque":"","User":null,"Host":"","Path":"/metrics","RawPath":"","OmitHost":false,"ForceQuery":false,"RawQuery":"","Fragment":"","RawFragment":""}}
{"action":"restapi_request","level":"debug","method":"GET","msg":"received HTTP request","time":"2024-08-26T15:07:31Z","url":{"Scheme":"","Opaque":"","User":null,"Host":"","Path":"/metrics","RawPath":"","OmitHost":false,"ForceQuery":false,"RawQuery":"","Fragment":"","RawFragment":""}}
{"level":"debug","msg":" memberlist: Initiating push/pull sync with: node2 192.168.1.23:7100","time":"2024-08-26T15:07:40Z"}
{"action":"restapi_request","level":"debug","method":"GET","msg":"received HTTP request","time":"2024-08-26T15:07:41Z","url":{"Scheme":"","Opaque":"","User":null,"Host":"","Path":"/metrics","RawPath":"","OmitHost":false,"ForceQuery":false,"RawQuery":"","Fragment":"","RawFragment":""}}
{"level":"debug","msg":" memberlist: Stream connection from=192.168.1.24:36854","time":"2024-08-26T15:07:44Z"}
{"action":"restapi_request","level":"debug","method":"GET","msg":"received HTTP request","time":"2024-08-26T15:07:51Z","url":{"Scheme":"","Opaque":"","User":null,"Host":"","Path":"/metrics","RawPath":"","OmitHost":false,"ForceQuery":false,"RawQuery":"","Fragment":"","RawFragment":""}}
{"action":"restapi_request","level":"debug","method":"GET","msg":"received HTTP request","time":"2024-08-26T15:08:01Z","url":{"Scheme":"","Opaque":"","User":null,"Host":"","Path":"/metrics","RawPath":"","OmitHost":false,"ForceQuery":false,"RawQuery":"","Fragment":"","RawFragment":""}}
{"level":"debug","msg":" memberlist: Initiating push/pull sync with: node3 192.168.1.24:7100","time":"2024-08-26T15:08:10Z"}
{"action":"restapi_request","level":"debug","method":"GET","msg":"received HTTP request","time":"2024-08-26T15:08:11Z","url":{"Scheme":"","Opaque":"","User":null,"Host":"","Path":"/metrics","RawPath":"","OmitHost":false,"ForceQuery":false,"RawQuery":"","Fragment":"","RawFragment":""}}
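
Most of the lines above are /metrics scrapes, which makes the cluster-related messages hard to spot. A small sketch, using only the Python standard library, that drops the /metrics requests from JSON log lines and keeps the rest. The function name `interesting_lines` is made up for this example, and the curly-quote normalization is only there because logs pasted into a forum often lose their straight quotes:

```python
import json


def interesting_lines(lines):
    """Return compact 'time level msg' strings for every log line
    that is not a GET /metrics request. Non-JSON lines are kept as-is."""
    kept = []
    for raw in lines:
        # Normalize curly quotes introduced by copy/paste
        text = raw.strip().replace("\u201c", '"').replace("\u201d", '"')
        if not text:
            continue
        try:
            entry = json.loads(text)
        except json.JSONDecodeError:
            kept.append(text)
            continue
        url = entry.get("url") or {}
        if isinstance(url, dict) and url.get("Path") == "/metrics":
            continue  # Prometheus scrape, not relevant to replication
        msg = str(entry.get("msg", "")).strip()
        kept.append(f'{entry.get("time")} {entry.get("level")} {msg}')
    return kept


# Example usage with a saved log file (the file name is an assumption):
# with open("weaviate-node1.log") as f:
#     print("\n".join(interesting_lines(f)))
```

Applied to the excerpt above, this would leave only the memberlist messages, which show gossip traffic between the nodes but no schema or replication activity.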

Can you share your Docker Compose file?

Hi @DudaNogueira,

Here is the Docker configuration:

master node
version: '3.7'
services:
  weaviate-node-1:
    init: true
    network_mode: "host"
    command:
      - --host
      - 0.0.0.0
      - --port
      - '8080'
      - --scheme
      - http
    image: cr.weaviate.io/semitechnologies/weaviate:1.26.1
    ports:
      - 8080:8080
      - 6060:6060
      - 50051:50051/tcp
      - 50051:50051/udp
      - 7100:7100/tcp
      - 7100:7100/udp
      - 7101:7101/tcp
      - 7101:7101/udp
      - 8300:8300/tcp
      - 8300:8300/udp
      - 8301:8301/tcp
    restart: on-failure:0
    volumes:
      - ./data-node-1:/var/lib/weaviate
    environment:
      LOG_LEVEL: 'debug'
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      ENABLE_MODULES: 'text2vec-ollama,generative-ollama'
      DEFAULT_VECTORIZER_MODULE: 'none'
      CLUSTER_HOSTNAME: 'node1'
      CLUSTER_GOSSIP_BIND_PORT: '7100'
      CLUSTER_DATA_BIND_PORT: '7101'
      RAFT_JOIN: '192.168.1.52:8300,192.168.1.23:8300,192.168.1.24:8300'
      RAFT_BOOTSTRAP_EXPECT: 3
      REPLICATION_FACTOR: '2'
      REPLICATION_CONSISTENCY: 'QUORUM'
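
The compose file above asks for QUORUM consistency. As a rough illustration of what that implies, here is a sketch that computes how many replicas must acknowledge a request for a given replication factor, assuming the usual definitions of Weaviate's tunable consistency levels (ONE = 1 replica, QUORUM = n/2 + 1 rounded down, ALL = every replica). The helper name `required_acks` is made up for this example:

```python
def required_acks(replication_factor, consistency="QUORUM"):
    """Number of replica acknowledgements needed for one request.

    Assumes ONE = 1, QUORUM = floor(n / 2) + 1, ALL = n.
    """
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    level = consistency.upper()
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unknown consistency level: {consistency}")


# With factor=3 (as in the collection definition) QUORUM needs 2 replicas;
# with factor=2 (as in the compose file) QUORUM needs both replicas.
print(required_acks(3), required_acks(2))
```

Under these assumptions, a replication factor of 2 with QUORUM tolerates no unavailable replica, while a factor of 3 tolerates one.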

Hello, could you please share the Docker commands for setting up a cluster across multiple machines? Any help would be greatly appreciated.

Hi @jasper2077,

On each of your two or three servers, create a docker-compose.yml file like the example above, with the configuration adjusted for that server. Update the IPs, set REPLICATION_FACTOR to 2 or 3 (depending on how many replicas you want), and list your preferred or custom modules in ENABLE_MODULES. Then start Weaviate on every server with:

docker-compose up

Once Weaviate is up, define your index in the code as follows:

collection = client.collections.create(
    collection_name,
    vectorizer_config=wvc.config.Configure.Vectorizer.text2vec_ollama(
        api_endpoint=model_endpoint,
        model=vectorizer_model
    ),
    replication_config=wvc.config.Configure.replication(
        factor=3,  # Adjust based on your setup
    ),
    properties=[my_properties]  # Replace with your actual properties
)
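
Once the collection is created, one way to check that every node actually sees it is to read /v1/schema from each node and compare. A minimal sketch using only the standard library; the node URLs and the collection name in the usage comment are assumptions (the IPs are taken from the RAFT_JOIN setting earlier in this thread, port 8080 from the compose file), and the helper names are made up for this example:

```python
import json
from urllib.request import urlopen


def collection_names(schema):
    """Set of collection (class) names in a /v1/schema response."""
    return {c["class"] for c in schema.get("classes") or []}


def nodes_missing(schemas_by_node, name):
    """Sorted list of nodes whose schema does not contain the collection."""
    return sorted(
        node
        for node, schema in schemas_by_node.items()
        if name not in collection_names(schema)
    )


def fetch_schema(base_url):
    """Fetch the schema from one Weaviate node over REST."""
    with urlopen(f"{base_url}/v1/schema") as resp:
        return json.load(resp)


# Example usage (URLs and collection name are assumptions):
# urls = {
#     "node1": "http://192.168.1.52:8080",
#     "node2": "http://192.168.1.23:8080",
#     "node3": "http://192.168.1.24:8080",
# }
# schemas = {node: fetch_schema(url) for node, url in urls.items()}
# print(nodes_missing(schemas, "MyCollection"))
```

An empty list means all nodes agree on the schema; any node listed has not received the collection definition.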