Description
Hi,
I’m new to Weaviate and am trying to deploy a multi-node setup with docker-compose for testing:
services:
  weaviate-node-11:
    init: true
    command:
    - --host
    - 0.0.0.0
    - --port
    - '8080'
    - --scheme
    - http
    image: cr.weaviate.io/semitechnologies/weaviate:1.25.4
    ports:
    - 8080:8080
    - 50051:50051
    - 6060:6060
    restart: on-failure:0
    environment:
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      DEFAULT_VECTORIZER_MODULE: 'none'
      ENABLE_MODULES: 'text2vec-openai,text2vec-cohere,text2vec-huggingface'
      CLUSTER_HOSTNAME: 'node1'
      CLUSTER_GOSSIP_BIND_PORT: '7100'
      CLUSTER_DATA_BIND_PORT: '7101'
      HTTP_PROXY: ''
      http_proxy: ''
      LOG_LEVEL: 'debug'
  weaviate-node-12:
    init: true
    command:
    - --host
    - 0.0.0.0
    - --port
    - '8080'
    - --scheme
    - http
    image: cr.weaviate.io/semitechnologies/weaviate:1.25.4
    ports:
    - 8081:8080
    - 50052:50051
    - 6061:6060
    restart: on-failure:0
    environment:
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      DEFAULT_VECTORIZER_MODULE: 'none'
      ENABLE_MODULES: 'text2vec-openai,text2vec-cohere,text2vec-huggingface'
      CLUSTER_HOSTNAME: 'node2'
      CLUSTER_GOSSIP_BIND_PORT: '7102'
      CLUSTER_DATA_BIND_PORT: '7103'
      CLUSTER_JOIN: 'weaviate-node-11:7100'
      HTTP_PROXY: ''
      http_proxy: ''
      LOG_LEVEL: 'debug'
After starting the stack, the /v1/nodes endpoint returns results that look normal:
{
  "nodes": [{
    "batchStats": {
      "queueLength": 0,
      "ratePerSecond": 0
    },
    "gitHash": "a61909a",
    "name": "node1",
    "shards": null,
    "status": "HEALTHY",
    "version": "1.25.4"
  }, {
    "batchStats": {
      "queueLength": 0,
      "ratePerSecond": 0
    },
    "gitHash": "a61909a",
    "name": "node2",
    "shards": null,
    "status": "HEALTHY",
    "version": "1.25.4"
  }]
}
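For what it's worth, this is the quick sanity check I did over that response (plain Python over an abridged copy of the JSON above):

```python
# Abridged copy of the /v1/nodes response above: both nodes should report HEALTHY.
nodes_payload = {
    "nodes": [
        {"name": "node1", "status": "HEALTHY", "version": "1.25.4"},
        {"name": "node2", "status": "HEALTHY", "version": "1.25.4"},
    ]
}

# Collect the names of all nodes that report a HEALTHY status.
healthy = [n["name"] for n in nodes_payload["nodes"] if n["status"] == "HEALTHY"]
print(healthy)
```

So at the gossip level, both nodes see each other and report healthy.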
Next, I used the Python client to create a collection.
import weaviate
import weaviate.classes as wvc
import os

client = weaviate.connect_to_custom(
    http_host="localhost",
    http_port=8080,
    http_secure=False,
    grpc_host="localhost",
    grpc_port=50051,
    grpc_secure=False,
    # headers={
    #     "X-OpenAI-Api-Key": os.environ["OPENAI_APIKEY"]  # Replace with your inference API key
    # }
)

try:
    questions = client.collections.create(
        name="Question",
        sharding_config=wvc.config.Configure.sharding(
            desired_count=3
        ),
        replication_config=wvc.config.Configure.replication(
            factor=2
        ),
        vectorizer_config=wvc.config.Configure.Vectorizer.text2vec_openai(),  # Set the vectorizer to "text2vec-openai" to use the OpenAI API for vector-related operations
        generative_config=wvc.config.Configure.Generative.cohere(),  # Set the generative module to "generative-cohere" to use the Cohere API for RAG
        properties=[
            wvc.config.Property(
                name="question",
                data_type=wvc.config.DataType.TEXT,
            ),
            wvc.config.Property(
                name="answer",
                data_type=wvc.config.DataType.TEXT,
            ),
        ],
        # Configure the vector index
        vector_index_config=wvc.config.Configure.VectorIndex.hnsw(  # Or `flat` or `dynamic`
            distance_metric=wvc.config.VectorDistances.COSINE,
            quantizer=wvc.config.Configure.VectorIndex.Quantizer.bq(),
        ),
        # Configure the inverted index
        inverted_index_config=wvc.config.Configure.inverted_index(
            index_null_state=True,
            index_property_length=True,
            index_timestamps=True,
        ),
    )
finally:
    client.close()
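For context on what I expected: my understanding is that with desired_count=3 and a replication factor of 2, there should be 3 logical shards with 2 physical copies each, i.e. 6 copies spread across the 2 nodes (roughly 3 per node). A back-of-the-envelope check (this is my assumption about the semantics, not something I verified in the docs):

```python
desired_count = 3   # logical shards requested for the collection
factor = 2          # replication factor (copies of each shard)
nodes = 2           # cluster size

# Expected physical shard copies, and how many per node if evenly spread.
total_copies = desired_count * factor
copies_per_node = total_copies / nodes
print(total_copies, copies_per_node)
```

So I expected each node to end up holding about three shard copies.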
After creation, I found that all the shards were concentrated on the node I connected to, and the other node does not have the collection at all.
http://localhost:8080/v1/schema/Question/shards
shows:
[{
  "name": "pklJTouifT37",
  "status": "READY",
  "vectorQueueSize": 0
}, {
  "name": "1gTDzdO9guT0",
  "status": "READY",
  "vectorQueueSize": 0
}, {
  "name": "IOgYO9o0RmDG",
  "status": "READY",
  "vectorQueueSize": 0
}]
while the other node http://localhost:8081/v1/schema/Question/shards
returns:
{
  "error": [{
    "message": "cannot get shards status for a non-existing index for Question"
  }]
}
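To make the asymmetry explicit, here is how I tallied the two responses above (plain Python, with the shard names copied from the output; node2's list is empty because it returned the error):

```python
# Shard listings as returned by each node's /v1/schema/Question/shards endpoint.
node1_shards = ["pklJTouifT37", "1gTDzdO9guT0", "IOgYO9o0RmDG"]  # from :8080
node2_shards = []  # :8081 returned "non-existing index for Question"

# Shards that exist on node1 but not on node2.
only_on_node1 = set(node1_shards) - set(node2_shards)
print(sorted(only_on_node1))
```

All three shards exist only on node1, despite the replication factor of 2.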
Then, I tried to import data:
import weaviate
import json

client = weaviate.Client(
    url="http://localhost:8080/",  # Replace with your Weaviate endpoint
    additional_headers={
        "X-OpenAI-Api-Key": "YOUR-OPENAI-API-KEY"  # Or "X-Cohere-Api-Key" or "X-HuggingFace-Api-Key"
    }
)

# ===== import data =====

# Load data
import requests
url = 'https://raw.githubusercontent.com/weaviate-tutorials/quickstart/main/data/jeopardy_tiny.json'
resp = requests.get(url)
data = json.loads(resp.text)

# Prepare a batch process
client.batch.configure(batch_size=100)  # Configure batch
with client.batch as batch:
    # Batch import all Questions
    for i, d in enumerate(data):
        # print(f"importing question: {i+1}")  # To see imports
        properties = {
            "answer": d["Answer"],
            "question": d["Question"],
            "category": d["Category"],
        }
        batch.add_data_object(properties, "Question")
I then encountered the following errors; it seems that Weaviate cannot find the class on the other node:
2024-06-19 15:31:46 weaviate-weaviate-node-11-1 | {"class":"Question","level":"error","msg":"status code: 500, error: digest objects: local index \"Question\" not found\n: context deadline exceeded","op":"get","replica":"172.23.0.2:7103","shard":"pklJTouifT37","time":"2024-06-19T07:31:46Z","uuid":"7bc9bf37-0d63-40c7-9c63-70d10aea7683"}
2024-06-19 15:31:56 weaviate-weaviate-node-11-1 | {"class":"Question","level":"error","msg":"status code: 500, error: digest objects: local index \"Question\" not found\n: context deadline exceeded","op":"get","replica":"172.23.0.2:7103","shard":"pklJTouifT37","time":"2024-06-19T07:31:56Z","uuid":"b5565857-9e61-40a1-88b8-7b0bbb82c139"}
2024-06-19 15:32:06 weaviate-weaviate-node-11-1 | {"class":"Question","level":"error","msg":"status code: 500, error: digest objects: local index \"Question\" not found\n: context deadline exceeded","op":"get","replica":"172.23.0.2:7103","shard":"pklJTouifT37","time":"2024-06-19T07:32:06Z","uuid":"d4f7e703-2e07-47ad-8615-a4cc0da979ea"}
2024-06-19 15:32:16 weaviate-weaviate-node-11-1 | {"class":"Question","level":"error","msg":"status code: 500, error: digest objects: local index \"Question\" not found\n: context deadline exceeded","op":"get","replica":"172.23.0.2:7103","shard":"pklJTouifT37","time":"2024-06-19T07:32:16Z","uuid":"dd5e7784-129f-42dc-ba3a-2e2665812359"}
2024-06-19 15:32:26 weaviate-weaviate-node-11-1 | {"class":"Question","level":"error","msg":"status code: 500, error: digest objects: local index \"Question\" not found\n: context deadline exceeded","op":"get","replica":"172.23.0.2:7103","shard":"pklJTouifT37","time":"2024-06-19T07:32:26Z","uuid":"609a7b08-a4df-4664-ae5e-d5a0952c9e4c"}
2024-06-19 15:32:35 weaviate-weaviate-node-11-1 | {"class":"Question","level":"error","msg":"status code: 500, error: digest objects: local index \"Question\" not found\n: context canceled","op":"get","replica":"172.23.0.2:7103","shard":"1gTDzdO9guT0","time":"2024-06-19T07:32:35Z","uuid":"6f1b6157-1eb9-4a71-81dc-fe0ac4550910"}
2024-06-19 15:32:35 weaviate-weaviate-node-11-1 | {"class":"Question","level":"error","msg":"connect: Get \"http://172.23.0.2:7103/replicas/indices/Question/shards/1gTDzdO9guT0/objects/_digest?schema_version=0\": context canceled","op":"get","replica":"172.23.0.2:7103","shard":"1gTDzdO9guT0","time":"2024-06-19T07:32:35Z","uuid":"ba872283-31a9-46fa-bcae-83a24bfd750c"}
2024-06-19 15:32:35 weaviate-weaviate-node-11-1 | {"class":"Question","level":"error","msg":"connect: Get \"http://172.23.0.2:7103/replicas/indices/Question/shards/IOgYO9o0RmDG/objects/_digest?schema_version=0\": context canceled","op":"get","replica":"172.23.0.2:7103","shard":"IOgYO9o0RmDG","time":"2024-06-19T07:32:35Z","uuid":"0473a789-e36b-4b1e-b97f-d3a7379a54a8"}
2024-06-19 15:32:35 weaviate-weaviate-node-11-1 | {"class":"Question","level":"error","msg":"connect: Get \"http://172.23.0.2:7103/replicas/indices/Question/shards/pklJTouifT37/objects/_digest?schema_version=0\": context canceled","op":"get","replica":"172.23.0.2:7103","shard":"pklJTouifT37","time":"2024-06-19T07:32:35Z","uuid":"a94ed56b-625d-4f92-b214-974bee6d4545"}
2024-06-19 15:32:35 weaviate-weaviate-node-11-1 | {"class":"Question","level":"error","msg":"connect: Get \"http://172.23.0.2:7103/replicas/indices/Question/shards/IOgYO9o0RmDG/objects/_digest?schema_version=0\": context canceled","op":"get","replica":"172.23.0.2:7103","shard":"IOgYO9o0RmDG","time":"2024-06-19T07:32:35Z","uuid":"95e0ce69-c5be-46fa-bdd0-30c2d68916c5"}
2024-06-19 15:32:35 weaviate-weaviate-node-11-1 | {"description":"An I/O timeout occurs when the request takes longer than the specified server-side timeout.","error":"write tcp 172.23.0.3:8080-\u003e172.23.0.1:57940: i/o timeout","hint":"Either try increasing the server-side timeout using e.g. '--write-timeout=600s' as a command line flag when starting Weaviate, or try sending a computationally cheaper request, for example by reducing a batch size, reducing a limit, using less complex filters, etc. Note that this error is only thrown if client-side and server-side timeouts are not in sync, more precisely if the client-side timeout is longer than the server side timeout.","level":"error","method":"POST","msg":"i/o timeout","path":{"Scheme":"","Opaque":"","User":null,"Host":"","Path":"/v1/batch/objects","RawPath":"","OmitHost":false,"ForceQuery":false,"RawQuery":"","Fragment":"","RawFragment":""},"time":"2024-06-19T07:32:35Z"}
2024-06-19 15:32:47 weaviate-weaviate-node-11-1 | {"class":"Question","level":"error","msg":"status code: 500, error: digest objects: local index \"Question\" not found\n: context deadline exceeded","op":"exists","replica":"172.23.0.2:7103","shard":"pklJTouifT37","time":"2024-06-19T07:32:47Z","uuid":"7bc9bf37-0d63-40c7-9c63-70d10aea7683"}
2024-06-19 15:32:47 weaviate-weaviate-node-11-1 | {"action":"requests_total","api":"rest","class_name":"Question","error":"msg:repo.exists code:500 err:cannot achieve consistency level \"QUORUM\": read error","level":"error","msg":"unexpected error","query_type":"objects","time":"2024-06-19T07:32:47Z"}
2024-06-19 15:32:57 weaviate-weaviate-node-11-1 | {"class":"Question","level":"error","msg":"status code: 500, error: digest objects: local index \"Question\" not found\n: context deadline exceeded","op":"get","replica":"172.23.0.2:7103","shard":"pklJTouifT37","time":"2024-06-19T07:32:57Z","uuid":"7bc9bf37-0d63-40c7-9c63-70d10aea7683"}
2024-06-19 15:32:57 weaviate-weaviate-node-11-1 | {"action":"requests_total","api":"rest","class_name":"","error":"repo: object by id: search index question: cannot achieve consistency level \"QUORUM\": read error","level":"error","msg":"unexpected error","query_type":"objects","time":"2024-06-19T07:32:57Z"}
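To see which replica the failures point at, I parsed the JSON log lines (one sample line copied from the node1 logs above, with the docker-compose prefix stripped):

```python
import json

# One error line from the node1 logs above.
log_line = '{"class":"Question","level":"error","msg":"status code: 500, error: digest objects: local index \\"Question\\" not found\\n: context deadline exceeded","op":"get","replica":"172.23.0.2:7103","shard":"pklJTouifT37","time":"2024-06-19T07:31:46Z","uuid":"7bc9bf37-0d63-40c7-9c63-70d10aea7683"}'

entry = json.loads(log_line)
# The failing replica is node2's CLUSTER_DATA_BIND_PORT (7103).
print(entry["replica"], entry["shard"])
```

Every failure targets 172.23.0.2:7103, i.e. node2's data port, which matches node2 not having the index.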
I also noticed that /v1/cluster/statistics shows both nodes as leaders, and synchronized is false:
{
  "statistics": [{
    "bootstrapped": true,
    "candidates": {},
    "dbLoaded": true,
    "isVoter": true,
    "leaderAddress": "172.23.0.3:8300",
    "leaderId": "node1",
    "name": "node1",
    "open": true,
    "raft": {
      "appliedIndex": "7",
      "commitIndex": "7",
      "fsmPending": "0",
      "lastContact": "0",
      "lastLogIndex": "7",
      "lastLogTerm": "2",
      "lastSnapshotIndex": "0",
      "lastSnapshotTerm": "0",
      "latestConfiguration": [{
        "address": "172.23.0.3:8300",
        "id": "node1",
        "suffrage": 0
      }],
      "latestConfigurationIndex": "0",
      "numPeers": "0",
      "protocolVersion": "3",
      "protocolVersionMax": "3",
      "protocolVersionMin": "0",
      "snapshotVersionMax": "1",
      "snapshotVersionMin": "0",
      "state": "Leader",
      "term": "2"
    },
    "ready": true,
    "status": "HEALTHY"
  }, {
    "bootstrapped": true,
    "candidates": {},
    "dbLoaded": true,
    "isVoter": true,
    "leaderAddress": "172.23.0.2:8300",
    "leaderId": "node2",
    "name": "node2",
    "open": true,
    "raft": {
      "appliedIndex": "2",
      "commitIndex": "2",
      "fsmPending": "0",
      "lastContact": "0",
      "lastLogIndex": "2",
      "lastLogTerm": "2",
      "lastSnapshotIndex": "0",
      "lastSnapshotTerm": "0",
      "latestConfiguration": [{
        "address": "172.23.0.2:8300",
        "id": "node2",
        "suffrage": 0
      }],
      "latestConfigurationIndex": "0",
      "numPeers": "0",
      "protocolVersion": "3",
      "protocolVersionMax": "3",
      "protocolVersionMin": "0",
      "snapshotVersionMax": "1",
      "snapshotVersionMin": "0",
      "state": "Leader", <-- both nodes report themselves as leader
      "term": "2"
    },
    "ready": true,
    "status": "HEALTHY"
  }],
  "synchronized": false <-- why is it not synchronized?
}
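My reading of this payload is that each node is running its own single-member cluster: each reports itself as leader, numPeers is "0", and latestConfiguration contains only the node itself. A small check over an abridged copy of the statistics above (the one-leader condition is my interpretation of what "synchronized" means, not taken from the docs):

```python
# Abridged copy of the /v1/cluster/statistics response above.
statistics = [
    {"name": "node1", "leaderId": "node1", "raft": {"numPeers": "0", "state": "Leader"}},
    {"name": "node2", "leaderId": "node2", "raft": {"numPeers": "0", "state": "Leader"}},
]

# For a synchronized cluster I would expect every node to agree on one leader.
leaders = {s["leaderId"] for s in statistics}
synchronized = len(leaders) == 1
print(sorted(leaders), synchronized)
```

Two distinct leaders and zero peers on each node looks like two independent clusters rather than one two-node cluster.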
What could be the reason that the collection shards are not synchronizing between nodes?
Server Setup Information
- Weaviate Server Version: 1.25.4
- Deployment Method: docker-compose
- Multi Node? Number of Running Nodes: 2
- Client Language and Version: python 3.8
- Multitenancy?: No
Any additional Information
Environment: macOS ARM64 (Apple Silicon)