Description
Hi guys,
I am trying to set up Weaviate as a single node on AWS ECS Fargate. I want to use an EFS volume to store the Weaviate data so that it survives ECS task restarts.
However, when I set “PERSISTENCE_DATA_PATH” = “/var/lib/weaviate” as an environment variable in the task definition, I repeatedly get the error “attempted to join and failed” after task startup until the task ultimately fails.
It seems to work again after I delete all contents from the EFS, but only until the task gets restarted; then the error comes up again.
This is driving me crazy, and I would be really happy if you could help me. Below you will find my configuration.
Server Setup Information
- Weaviate Server Version: 1.29
- Deployment Method:
- Multi Node? No
- Client Language and Version:
- Multitenancy?: No
Any additional Information
This is my ECS Fargate task definition for the ECR container (which contains an unmodified image of Weaviate 1.29):
{
  "family": "weaviate-task",
  "containerDefinitions": [
    {
      "name": "weaviate",
      "image": "xxx.dkr.ecr.eu-west-1.amazonaws.com/weaviate:latest",
      "cpu": 0,
      "memoryReservation": 2048,
      "portMappings": [
        {
          "containerPort": 8080,
          "hostPort": 8080,
          "protocol": "tcp"
        },
        {
          "containerPort": 50051,
          "hostPort": 50051,
          "protocol": "tcp"
        },
        {
          "containerPort": 8300,
          "hostPort": 8300,
          "protocol": "tcp"
        }
      ],
      "essential": true,
      "environment": [
        {
          "name": "AZURE_APIKEY",
          "value": "xxx"
        },
        {
          "name": "http_proxy",
          "value": "xxx:8080"
        },
        {
          "name": "no_proxy",
          "value": "xxx,localhost,127.0.0.1,xxx"
        },
        {
          "name": "ENABLE_MODULES",
          "value": "text2vec-azure-openai"
        },
        {
          "name": "https_proxy",
          "value": "xxx"
        },
        {
          "name": "PERSISTENCE_DATA_PATH",
          "value": "/var/lib/weaviate"
        },
        {
          "name": "DEPLOYMENT_ID",
          "value": "xxx"
        },
        {
          "name": "RESOURCE_NAME",
          "value": "xxx"
        }
      ],
      "mountPoints": [
        {
          "sourceVolume": "weaviate-efs-volume",
          "containerPath": "/var/lib/weaviate",
          "readOnly": false
        }
      ],
      "volumesFrom": [],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/weaviate-task",
          "mode": "non-blocking",
          "awslogs-create-group": "true",
          "max-buffer-size": "25m",
          "awslogs-region": "eu-west-1",
          "awslogs-stream-prefix": "ecs"
        }
      },
      "systemControls": []
    }
  ],
  "executionRoleArn": "arn:aws:iam::xxx:role/ecsTaskExecutionRole",
  "networkMode": "awsvpc",
  "volumes": [
    {
      "name": "weaviate-efs-volume",
      "efsVolumeConfiguration": {
        "fileSystemId": "fs-xxx",
        "rootDirectory": "/"
      }
    }
  ],
  "placementConstraints": [],
  "requiresCompatibilities": [
    "FARGATE"
  ],
  "cpu": "1024",
  "memory": "3072",
  "runtimePlatform": {
    "cpuArchitecture": "X86_64",
    "operatingSystemFamily": "LINUX"
  },
  "enableFaultInjection": false
}
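To make the relevant parts of this task definition easier to check, here is a minimal Python sketch that inspects a trimmed copy of it. It only confirms two things: that PERSISTENCE_DATA_PATH points at a path backed by the EFS volume, and whether CLUSTER_HOSTNAME is set. The helper name `check_single_node_persistence` is hypothetical. CLUSTER_HOSTNAME is an environment variable from Weaviate's documentation; whether a hostname that changes between Fargate task restarts is what triggers the failed join here is an assumption, not a confirmed diagnosis.

```python
# Trimmed copy of the task definition above: only the fields this check reads.
TASK_DEFINITION = {
    "containerDefinitions": [
        {
            "name": "weaviate",
            "environment": [
                {"name": "ENABLE_MODULES", "value": "text2vec-azure-openai"},
                {"name": "PERSISTENCE_DATA_PATH", "value": "/var/lib/weaviate"},
            ],
            "mountPoints": [
                {
                    "sourceVolume": "weaviate-efs-volume",
                    "containerPath": "/var/lib/weaviate",
                    "readOnly": False,
                }
            ],
        }
    ],
    "volumes": [
        {
            "name": "weaviate-efs-volume",
            "efsVolumeConfiguration": {"fileSystemId": "fs-xxx", "rootDirectory": "/"},
        }
    ],
}


def check_single_node_persistence(task_def, container_name="weaviate"):
    """Return simple yes/no findings for one container (hypothetical helper)."""
    container = next(
        c for c in task_def["containerDefinitions"] if c["name"] == container_name
    )
    # Flatten the ECS name/value pairs into a plain dict.
    env = {e["name"]: e["value"] for e in container.get("environment", [])}
    # Map each container path to the volume mounted there.
    mounts = {
        m["containerPath"]: m["sourceVolume"] for m in container.get("mountPoints", [])
    }
    # Names of volumes that are backed by EFS.
    efs_volumes = {
        v["name"] for v in task_def.get("volumes", []) if "efsVolumeConfiguration" in v
    }
    data_path = env.get("PERSISTENCE_DATA_PATH")
    return {
        # True if the data path is exactly the mount point of an EFS volume.
        "data_path_on_efs": mounts.get(data_path) in efs_volumes,
        # Assumption: a fixed node name might matter once data persists across restarts.
        "cluster_hostname_set": "CLUSTER_HOSTNAME" in env,
    }


if __name__ == "__main__":
    for name, ok in check_single_node_persistence(TASK_DEFINITION).items():
        print(f"{name}: {ok}")
```

For the configuration posted here, the data path does sit on the EFS mount and CLUSTER_HOSTNAME is not set, so the persistence wiring itself looks consistent and the node name is one thing that might be worth ruling out.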
Here are the errors from CloudWatch that come up repeatedly until the container shuts down:
2025-03-03T21:35:02.599Z
{"build_git_commit":"35d800d","build_go_version":"go1.22.12","build_image_tag":"v1.29.0","build_wv_version":"1.29.0","level":"info","msg":"attempting to join","remoteNodes":["10.22.122.166:8300"],"time":"2025-03-03T21:35:02Z"}
2025-03-03T21:35:02.600Z
{"build_git_commit":"35d800d","build_go_version":"go1.22.12","build_image_tag":"v1.29.0","build_wv_version":"1.29.0","level":"info","msg":"attempted to join and failed","remoteNode":"10.22.122.166:8300","status":14,"time":"2025-03-03T21:35:02Z"}
2025-03-03T21:35:03.600Z
{"build_git_commit":"35d800d","build_go_version":"go1.22.12","build_image_tag":"v1.29.0","build_wv_version":"1.29.0","level":"info","msg":"attempting to join","remoteNodes":["10.22.122.166:8300"],"time":"2025-03-03T21:35:03Z"}
2025-03-03T21:35:03.601Z
{"build_git_commit":"35d800d","build_go_version":"go1.22.12","build_image_tag":"v1.29.0","build_wv_version":"1.29.0","level":"info","msg":"attempted to join and failed","remoteNode":"10.22.122.166:8300","status":14,"time":"2025-03-03T21:35:03Z"}
2025-03-03T21:35:04.601Z
{"build_git_commit":"35d800d","build_go_version":"go1.22.12","build_image_tag":"v1.29.0","build_wv_version":"1.29.0","level":"info","msg":"attempting to join","remoteNodes":["10.22.122.166:8300"],"time":"2025-03-03T21:35:04Z"}
2025-03-03T21:35:04.602Z
{"build_git_commit":"35d800d","build_go_version":"go1.22.12","build_image_tag":"v1.29.0","build_wv_version":"1.29.0","level":"info","msg":"attempted to join and failed","remoteNode":"10.22.122.166:8300","status":14,"time":"2025-03-03T21:35:04Z"}
2025-03-03T21:35:05.603Z
{"build_git_commit":"35d800d","build_go_version":"go1.22.12","build_image_tag":"v1.29.0","build_wv_version":"1.29.0","level":"info","msg":"attempting to join","remoteNodes":["10.22.122.166:8300"],"time":"2025-03-03T21:35:05Z"}
2025-03-03T21:35:05.603Z
{"build_git_commit":"35d800d","build_go_version":"go1.22.12","build_image_tag":"v1.29.0","build_wv_version":"1.29.0","level":"info","msg":"attempted to join and failed","remoteNode":"10.22.122.166:8300","status":14,"time":"2025-03-03T21:35:05Z"}
2025-03-03T21:35:06.603Z
{"build_git_commit":"35d800d","build_go_version":"go1.22.12","build_image_tag":"v1.29.0","build_wv_version":"1.29.0","level":"info","msg":"attempting to join","remoteNodes":["10.22.122.166:8300"],"time":"2025-03-03T21:35:06Z"}
2025-03-03T21:35:06.604Z
{"build_git_commit":"35d800d","build_go_version":"go1.22.12","build_image_tag":"v1.29.0","build_wv_version":"1.29.0","level":"info","msg":"attempted to join and failed","remoteNode":"10.22.122.166:8300","status":14,"time":"2025-03-03T21:35:06Z"}
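In case it helps with reading these logs, here is a rough Python sketch that counts the failed join attempts per remote node and status. The two sample lines are copied from the output above and shortened to the fields used. The function names are hypothetical, and mapping "status" to gRPC status codes is an assumption on my side: if the field really is a gRPC code, 14 would mean UNAVAILABLE, i.e. the node at 10.22.122.166:8300 could not be reached.

```python
import json
from collections import Counter

# Two sample lines from the CloudWatch output above, shortened to the fields used here.
SAMPLE_LINES = [
    '{"level":"info","msg":"attempting to join","remoteNodes":["10.22.122.166:8300"],"time":"2025-03-03T21:35:02Z"}',
    '{"level":"info","msg":"attempted to join and failed","remoteNode":"10.22.122.166:8300","status":14,"time":"2025-03-03T21:35:02Z"}',
]

# A few standard gRPC status codes. Whether Weaviate's "status" field
# uses these codes is an assumption, not something the logs confirm.
GRPC_STATUS_NAMES = {0: "OK", 4: "DEADLINE_EXCEEDED", 14: "UNAVAILABLE"}


def normalize_quotes(line):
    """Replace typographic double quotes (as pasted from a forum) with plain ones."""
    return line.replace("\u201c", '"').replace("\u201d", '"')


def summarize_join_failures(lines):
    """Count 'attempted to join and failed' entries per (remoteNode, status)."""
    failures = Counter()
    for line in lines:
        line = normalize_quotes(line).strip()
        if not line.startswith("{"):
            continue  # skip the timestamp-only lines between entries
        entry = json.loads(line)
        if entry.get("msg") == "attempted to join and failed":
            failures[(entry.get("remoteNode"), entry.get("status"))] += 1
    return failures


if __name__ == "__main__":
    for (node, status), count in summarize_join_failures(SAMPLE_LINES).items():
        name = GRPC_STATUS_NAMES.get(status, "unknown")
        print(f"{node}: status {status} ({name}), {count} failure(s)")
```

Every failure in the excerpt targets the same address, 10.22.122.166:8300, with the same status, so the summary collapses to a single key.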