Weaviate on Railway does not persist to disk

I am hosting Weaviate on railway.app.

This is the Dockerfile, just a simple base image.

FROM semitechnologies/weaviate:1.28.2

These are the environment variables of the container, which I set up through Railway.

CLUSTER_HOSTNAME="node1"
DEFAULT_VECTORIZER_MODULE="none"
ENABLE_MODULES="text2vec-openai"
PERSISTENCE_DATA_PATH="/var/lib/weaviate"
QUERY_DEFAULTS_LIMIT="25"
RAILWAY_DEPLOYMENT_DRAINING_SECONDS="30"
RAILWAY_RUN_UID="0"

The issue is that redeploying the instance wipes the Weaviate state, i.e. persistence is not working. It is easy to reproduce:

  1. Create a simple example schema
  2. Redeploy
  3. List all collections; the result is empty
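For anyone who wants to script the reproduction, here is a minimal sketch using only the Python standard library against Weaviate's REST schema endpoints (`POST /v1/schema` and `GET /v1/schema`). The base URL and the collection name `PersistenceCheck` are hypothetical placeholders, and the sketch assumes the instance is reachable without authentication.

```python
import json
import urllib.request

# Hypothetical URL: replace with the public domain of your own Railway service.
BASE_URL = "https://my-weaviate.up.railway.app"


def class_names_from_schema(schema: dict) -> list[str]:
    """Extract collection (class) names from a GET /v1/schema response body."""
    return [c["class"] for c in schema.get("classes") or []]


def create_class(base_url: str, name: str) -> None:
    """Step 1: create an empty collection via POST /v1/schema."""
    body = json.dumps({"class": name}).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/v1/schema",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        resp.read()


def list_class_names(base_url: str) -> list[str]:
    """Step 3: list all collections via GET /v1/schema."""
    with urllib.request.urlopen(f"{base_url}/v1/schema", timeout=10) as resp:
        return class_names_from_schema(json.load(resp))
```

Call `create_class(BASE_URL, "PersistenceCheck")` once, redeploy, then call `list_class_names(BASE_URL)`. If persistence works, the name should still be in the returned list.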

Logs from container start:

Starting Container
log level not recognized, defaulting to info
the default vectorizer modules is set to "none", as a result all new schema classes without an explicit vectorizer setting, will use this vectorizer
auto schema enabled setting is set to "true"
No resource limits set, weaviate will use all available memory and CPU. To limit resources, set LIMIT_RESOURCES=true
module offload-s3 is enabled
Multiple vector spaces are present, GraphQL Explore and REST API list objects endpoint module include params has been disabled as a result.
open cluster service
starting cloud rpc server ...
starting raft sub-system ...
tcp transport
loading local db
local DB successfully loaded
schema manager loaded
construct a new raft node
initial configuration
raft node constructed
raft init
entering follower state
attempting to join
attempted to join and failed
failed to join cluster
starting cluster bootstrapping
notified peers this node is ready to join as voter
heartbeat timeout reached, starting election
entering candidate state
pre-vote successful, starting election
election won
entering leader state
configured versions
grpc server listening at [::]:50051
current Leader
starting migration from old schema
legacy schema is empty, nothing to migrate
migration from the old schema has been successfully completed
Serving weaviate at http://[::]:8080
node reporting ready, exiting bootstrap process
telemetry started

Logs when a redeploy is triggered and the container is killed:

Stopping Container
sending signal SIGTERM to container
2025-03-12T05:17:30.244690894Z [INFO] Shutting down...  action="restapi_management" build_git_commit="5a3991d" build_go_version="go1.22.10" build_image_tag="v1.28.2" build_wv_version="1.28.2" time="2025-03-12T05:17:22Z" version="1.28.2"
2025-03-12T05:17:30.244722521Z [INFO] Stopped serving weaviate at http://[::]:8080 action="restapi_management" build_git_commit="5a3991d" build_go_version="go1.22.10" build_image_tag="v1.28.2" build_wv_version="1.28.2" time="2025-03-12T05:17:22Z" version="1.28.2"
2025-03-12T05:17:30.244744207Z [INFO] closing log store ... build_git_commit="5a3991d" build_go_version="go1.22.10" build_image_tag="v1.28.2" build_wv_version="1.28.2" time="2025-03-12T05:17:23Z"
2025-03-12T05:17:30.244759674Z [INFO] telemetry terminated action="telemetry_push" build_git_commit="5a3991d" build_go_version="go1.22.10" build_image_tag="v1.28.2" build_wv_version="1.28.2" payload="&{MachineID:58134389-7868-4e3a-b271-992e020fa053 Type:TERMINATE Version:1.28.2 NumObjects:0 OS:linux Arch:amd64 UsedModules:[]}" time="2025-03-12T05:17:23Z"
2025-03-12T05:17:30.244768524Z [INFO] closing data store ... build_git_commit="5a3991d" build_go_version="go1.22.10" build_image_tag="v1.28.2" build_wv_version="1.28.2" time="2025-03-12T05:17:23Z"
2025-03-12T05:17:30.244786225Z [INFO] closing raft FSM store ... build_git_commit="5a3991d" build_go_version="go1.22.10" build_image_tag="v1.28.2" build_wv_version="1.28.2" time="2025-03-12T05:17:23Z"
2025-03-12T05:17:30.244788879Z [INFO] closing loaded database ... build_git_commit="5a3991d" build_go_version="go1.22.10" build_image_tag="v1.28.2" build_wv_version="1.28.2" time="2025-03-12T05:17:23Z"
2025-03-12T05:17:30.244804757Z [INFO] closing raft-rpc client ... build_git_commit="5a3991d" build_go_version="go1.22.10" build_image_tag="v1.28.2" build_wv_version="1.28.2" time="2025-03-12T05:17:23Z"
2025-03-12T05:17:30.244812407Z [INFO] shutting down raft sub-system ... build_git_commit="5a3991d" build_go_version="go1.22.10" build_image_tag="v1.28.2" build_wv_version="1.28.2" time="2025-03-12T05:17:23Z"
2025-03-12T05:17:30.244820771Z [INFO] closing raft-rpc server ... build_git_commit="5a3991d" build_go_version="go1.22.10" build_image_tag="v1.28.2" build_wv_version="1.28.2" time="2025-03-12T05:17:23Z"
2025-03-12T05:17:30.244837607Z [INFO] transferring leadership to another server build_git_commit="5a3991d" build_go_version="go1.22.10" build_image_tag="v1.28.2" build_wv_version="1.28.2" time="2025-03-12T05:17:23Z"
2025-03-12T05:17:30.244862071Z [ERRO] transferring leadership build_git_commit="5a3991d" build_go_version="go1.22.10" build_image_tag="v1.28.2" build_wv_version="1.28.2" error="cannot find peer" time="2025-03-12T05:17:23Z"
2025-03-12T05:17:30.244888021Z [INFO] closing raft-net ... build_git_commit="5a3991d" build_go_version="go1.22.10" build_image_tag="v1.28.2" build_wv_version="1.28.2" time="2025-03-12T05:17:23Z"
Stopping Container

Proof that the volume mount path matches the persistence path env variable (PERSISTENCE_DATA_PATH).
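A matching path in the dashboard does not necessarily mean the volume is mounted inside the running container. As a rough cross-check, the sketch below takes the text of `/proc/mounts` (for example, copied from `cat /proc/mounts` in the container) and reports which mount point covers the data path. The function names are made up for illustration, and the parser ignores escaped spaces in mount paths.

```python
import os


def mount_point_for(path: str, mounts_text: str) -> str:
    """Return the longest mount point in /proc/mounts-style text that contains path."""
    path = os.path.normpath(path)
    best = "/"
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        mount = os.path.normpath(fields[1])  # second field is the mount point
        inside = path == mount or path.startswith(mount.rstrip("/") + "/")
        if inside and len(mount) > len(best):
            best = mount
    return best


def is_on_dedicated_mount(path: str, mounts_text: str) -> bool:
    """True when path is covered by a mount other than the root filesystem."""
    return mount_point_for(path, mounts_text) != "/"
```

If `is_on_dedicated_mount("/var/lib/weaviate", mounts_text)` returns False, the data path lives on the container's root filesystem, which would be consistent with the state disappearing on every redeploy.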

It’s not obvious to me what is going on, I’d appreciate help.

Hi @thepartyparrot !!

Have you attached a volume?

I was able to persist:

Let me know if this helps!

Thanks!

It did work out! For some reason I had to create a new service, as the original one wasn’t mounting the volume properly. With the new service it worked immediately!

Another issue I have is that none of the gRPC queries work, because the client is unable to establish a connection.

I have set up a custom domain on Railway to proxy port 50051.

The gRPC connection gets closed. This is from the Weaviate client logs (IPs redacted):

fastapi.exceptions.HTTPException: 500: Error in batch storage operation: Query call with protocol GRPC batch failed with message <AioRpcError of RPC that terminated with:
        status = StatusCode.UNAVAILABLE
        details = "failed to connect to all addresses; last error: UNAVAILABLE:  Socket closed"
        debug_error_string = "UNKNOWN:Error received from peer  {grpc_message:"failed to connect to all addresses; last error: UNAVAILABLE: i: Socket closed", grpc_status:14, created_time:"2025-03-13T20:56:22.813025-07:00"}"

This happens when running insert_many. My assumption is that it would happen on any operation that uses gRPC.
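To separate a plain network problem from a gRPC-level one, a quick reachability probe can help. This is a minimal standard-library sketch, and the helper name is made up for illustration. A True result only shows that something accepts TCP connections on that host and port. It does not prove that the proxy forwards gRPC (HTTP/2) traffic correctly.

```python
import socket


def can_open_tcp(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, DNS failures, and timeouts.
        return False
```

For example, `can_open_tcp("grpc.example.com", 50051)` with the custom domain in place of the placeholder hostname. If this returns True while insert_many still fails with "Socket closed", the proxy in front of the port is the more likely suspect.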

Any idea about that @DudaNogueira?

Yes. It seems that, at least through the UI, it is not possible to bind two ports to a single domain, or to use two domains with one port each.

There seems to be a way to deploy using code, which would then let you bind multiple ports, but that would require more experimentation with Railway.

Weaviate requires two open ports: one for REST (usually 8080) and one for gRPC (50051).

What I did was spin up a Jupyter notebook instance and connect from there using Railway's internal hostname.
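Roughly, connecting from another service in the same Railway project could look like the sketch below. It assumes the v4 Python client (`weaviate-client`) and its `connect_to_custom` helper. The hostname `weaviate.railway.internal` is a hypothetical example of an internal hostname, and `connect_kwargs` and `connect_internal` are made-up helper names.

```python
def connect_kwargs(http_host: str, grpc_host: str, http_port: int = 8080,
                   grpc_port: int = 50051, secure: bool = False) -> dict:
    """Build keyword arguments for weaviate.connect_to_custom (REST plus gRPC)."""
    return {
        "http_host": http_host,
        "http_port": http_port,
        "http_secure": secure,
        "grpc_host": grpc_host,
        "grpc_port": grpc_port,
        "grpc_secure": secure,
    }


def connect_internal(hostname: str):
    """Connect over the private network, where both ports are reachable directly."""
    import weaviate  # imported lazily; requires the weaviate-client v4 package

    return weaviate.connect_to_custom(**connect_kwargs(hostname, hostname))


# Usage, assuming a hypothetical internal hostname:
#   client = connect_internal("weaviate.railway.internal")
#   print(client.collections.list_all())
#   client.close()
```

Both ports use the same internal hostname here, so no public domain or port proxy is involved.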

Let me know if this helps!