Weaviate Shutting Down Automatically

Description

We are running Weaviate v1.24.2 in Docker on a single node. Weaviate shuts down on its own after running fine for a couple of days. We do not expect an OOM, since the machine has sufficient memory available. The logs do not show much either.

Server Setup Information

  • Weaviate Server Version: 1.24.2
  • Deployment Method: docker
  • Multi Node? Number of Running Nodes: No. 1 node.
  • Client Language and Version: Python.
  • Multitenancy: No

Any additional Information

{"action":"telemetry_push","level":"info","msg":"telemetry update","payload":"\u0026{MachineID:c7ea6393-5c1a-4c1a-92ba-ead71cbbfad9 Type:UPDATE Version:1.24.2 Modules:generative-openai,text2vec-openai NumObjects:910511 OS:linux Arch:amd64}","time":"2024-09-18T01:01:34Z"}
{"action":"requests_total","api":"graphql","class_name":"","error":"context canceled","level":"error","msg":"unexpected error","query_type":"","time":"2024-09-18T09:41:25Z"}
{"action":"requests_total","api":"graphql","class_name":"","error":"context canceled","level":"error","msg":"unexpected error","query_type":"","time":"2024-09-18T13:19:36Z"}
{"action":"telemetry_push","level":"info","msg":"telemetry update","payload":"\u0026{MachineID:c7ea6393-5c1a-4c1a-92ba-ead71cbbfad9 Type:UPDATE Version:1.24.2 Modules:generative-openai,text2vec-openai NumObjects:910511 OS:linux Arch:amd64}","time":"2024-09-19T01:01:33Z"}
{"level":"info","msg":"Created shard byskryvdofxoqahloxdzigeskjam_ahqdoowkumnwonnvxxouwmzrdxjs_4SOl1Y7CFpiz in 1.778589ms","time":"2024-09-19T06:25:54Z"}
{"action":"hnsw_vector_cache_prefill","count":1000,"index_id":"main","level":"info","limit":1000000000000,"msg":"prefilled vector cache","time":"2024-09-19T06:25:54Z","took":44132}
{"level":"info","msg":"Created shard bufbvoexajkthedtwfkeidqknppk_bhgckdjhnvpuzuthiacwxzspbkmh_xIGiqOKLpF36 in 1.612693ms","time":"2024-09-19T06:49:24Z"}
{"action":"hnsw_vector_cache_prefill","count":1000,"index_id":"main","level":"info","limit":1000000000000,"msg":"prefilled vector cache","time":"2024-09-19T06:49:24Z","took":43992}
{"action":"restapi_management","level":"info","msg":"Shutting down... ","time":"2024-09-19T20:27:31Z"}
{"action":"restapi_management","level":"info","msg":"Stopped serving weaviate at http://[::]:8080","time":"2024-09-19T20:27:31Z"}
{"action":"telemetry_push","level":"info","msg":"telemetry terminated","payload":"\u0026{MachineID:c7ba8293-5c13-4c1a-94ba-eaf71cbbfad9 Type:TERMINATE Version:1.24.2 Modules:generative-openai,text2vec-openai NumObjects:914984 OS:linux Arch:amd64}","time":"2024-09-19T20:27:32Z"}

These are the logs I see. Basically, there is just this one log line before it goes down:

{"action":"restapi_management","level":"info","msg":"Shutting down... ","time":"2024-09-19T20:27:31Z"}

Could you help me understand why this is happening, how I can resolve it, or how I can debug it further?

hi @jenu9417 !!

Welcome to our community :hugs:

That’s strange. It looks like something external to Weaviate is sending a termination signal (SIGTERM): in the logs you shared, I see nothing that would make it shut down on its own.

Do you have any health check set up for this container?

Hi @DudaNogueira

No, I haven’t set up any health check. There is no load balancer either; it runs on a single VM.
From our service, before each request we check whether Weaviate is alive using the is_live() method, which in turn hits the /.well-known/live endpoint.

I wanted to understand the possible causes here:
a) Could Weaviate be crashing due to resource constraints? I have set LIMIT_RESOURCES to true, and I have also added restart: on-failure:1 to the Docker Compose file.

b) Could this be triggered by a Weaviate client when a call fails? I’m using the Python Weaviate client (v3.23.2). I didn’t see any issues in my application logs.

c) How can we debug more regarding this?

Thanks.

hi! I don’t think this comes from Weaviate itself, but from something in your deployment.

There are no error logs before the "Shutting down" info log, so this is probably Docker sending the TERM signal to this container.
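
One way to check this is to look at the state Docker recorded for the stopped container. The snippet below is only a rough sketch: it assumes the container is named "weaviate" (a placeholder, use your own name) and that the docker CLI is available on the VM. The helper names are made up for illustration. It reads the State object from docker inspect and interprets the ExitCode and OOMKilled fields, using the usual convention that an exit code above 128 means "terminated by signal (code - 128)".

```python
import json
import subprocess


def read_container_state(container: str) -> dict:
    """Return the State object that `docker inspect` reports for a container."""
    result = subprocess.run(
        ["docker", "inspect", "--format", "{{json .State}}", container],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


def describe_exit(state: dict) -> str:
    """Turn the Running / OOMKilled / ExitCode fields into a short diagnosis."""
    if state.get("Running"):
        return "container is still running"
    if state.get("OOMKilled"):
        return "killed by the kernel OOM killer"
    code = state.get("ExitCode")
    if code == 0:
        return "exit code 0: the process shut down cleanly"
    if code is not None and code > 128:
        # Shell convention: 128 + signal number (137 = SIGKILL, 143 = SIGTERM).
        return f"exit code {code}: terminated by signal {code - 128}"
    return f"exit code {code}: the process exited with an error"


if __name__ == "__main__":
    # "weaviate" is a placeholder container name, not taken from the thread.
    state = read_container_state("weaviate")
    print(describe_exit(state))
    print("finished at:", state.get("FinishedAt"))
```

If this reports exit code 0, that would also explain why restart: on-failure:1 did not bring the container back, since that policy only restarts a container that exits with a non-zero code. Comparing FinishedAt with the host's system logs around the same time may show what sent the signal.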

Hi @DudaNogueira ,
Sure. Thanks for the update. Will check again internally.

Ok, let me know if you find something!

Thanks!