Startup Failure

Here is the log:

 {"action":"startup","level":"debug","msg":"created startup context, nothing done so far","startup_time_left":"59m59.997087554s","time":"2024-03-14T00:24:58Z"}
weaviate-weaviate-1  | {"action":"startup","default_vectorizer_module":"none","level":"info","msg":"the default vectorizer modules is set to \"none\", as a result all new schema classes without an explicit vectorizer setting, will use this vectorizer","time":"2024-03-14T00:24:58Z"}
weaviate-weaviate-1  | {"action":"startup","auto_schema_enabled":true,"level":"info","msg":"auto schema enabled setting is set to \"true\"","time":"2024-03-14T00:24:58Z"}
weaviate-weaviate-1  | {"action":"startup","level":"debug","msg":"config loaded","startup_time_left":"59m59.996192293s","time":"2024-03-14T00:24:58Z"}
weaviate-weaviate-1  | {"action":"startup","level":"debug","msg":"configured OIDC and anonymous access client","startup_time_left":"59m59.996174439s","time":"2024-03-14T00:24:58Z"}
weaviate-weaviate-1  | {"action":"startup","level":"debug","msg":"initialized schema","startup_time_left":"59m59.996166999s","time":"2024-03-14T00:24:58Z"}
weaviate-weaviate-1  | {"action":"startup","level":"debug","msg":"startup routine complete","time":"2024-03-14T00:24:58Z"}
weaviate-weaviate-1  | {"level":"info","msg":"Limiting resources:  memory: 80%, cores: all but one","time":"2024-03-14T00:24:58Z"}
weaviate-weaviate-1  | {"cores":47,"level":"warning","msg":"Unable to read from cgroups: read cpuset: open /sys/fs/cgroup/cpuset/cpuset.cpus: no such file or directory, setting to max cores to: 47","time":"2024-03-14T00:24:58Z"}
weaviate-weaviate-1  | {"level":"info","limit":9223372036854775807,"msg":"Set memory limit","time":"2024-03-14T00:24:58Z"}
weaviate-weaviate-1  | {"action":"startup","level":"debug","msg":"start registering modules","time":"2024-03-14T00:24:58Z"}
weaviate-weaviate-1  | {"action":"startup","level":"debug","module":"backup-filesystem","msg":"enabled module","time":"2024-03-14T00:24:58Z"}
weaviate-weaviate-1  | {"action":"startup","level":"debug","msg":"completed registering modules","time":"2024-03-14T00:24:58Z"}
weaviate-weaviate-1  | {"level":"info","msg":"async indexing enabled","time":"2024-03-14T00:24:58Z"}
weaviate-weaviate-1  | {"action":"startup_cluster_schema_sync","level":"debug","msg":"Only node in the cluster at this point. No schema sync necessary.","time":"2024-03-14T00:24:58Z"}
weaviate-weaviate-1  | {"action":"cluster_api_startup","level":"debug","msg":"serving cluster api on port 7947","port":7947,"time":"2024-03-14T00:24:58Z"}
weaviate-weaviate-1  | {"action":"startup","error":"create index: init shard ISLkKaw6gwf6 of index season: init shard \"season_ISLkKaw6gwf6\": init shard \"season_ISLkKaw6gwf6\": shard db: create objects bucket: init disk segments: init segment segment-1705570724834516877.db: mmap file: invalid argument","level":"fatal","msg":"db didn't start up","time":"2024-03-14T00:24:58Z"}

Tried with 1.24.1 and 1.24.2, same result. I believe 1.23 also had this issue… While it was still running, it reported issues with compaction. Currently I start it with “sudo docker compose up”.
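The fatal entry is easy to miss among the debug lines. As a minimal sketch, assuming the log lines look like the ones above (an optional container prefix followed by one JSON object per line), the following Python pulls out the fatal entries and the segment file named in the error. The function names `fatal_entries` and `segment_name` are made up for this illustration and are not part of Weaviate.

```python
import json
import re


def fatal_entries(lines):
    """Yield parsed JSON log entries whose "level" field is "fatal"."""
    for line in lines:
        # Skip any "<container>  | " prefix by starting at the first "{".
        start = line.find("{")
        if start == -1:
            continue
        try:
            entry = json.loads(line[start:].strip())
        except json.JSONDecodeError:
            continue  # not a JSON log line
        if isinstance(entry, dict) and entry.get("level") == "fatal":
            yield entry


def segment_name(entry):
    """Return the "*.db" file named after "init segment" in the error, or None."""
    match = re.search(r"init segment (\S+?\.db)", entry.get("error", ""))
    return match.group(1) if match else None
```

`lines` can be any iterable of strings, for example `sys.stdin` when the container logs are piped into a script.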

Hi!

It seems that it couldn’t initialize a new shard on the disk.

So you ingested the data in 1.23.? and migrated?

Can you set LOG_LEVEL to trace so we can see whether there are more detailed logs?

This is with log level trace. Here is the docker-compose file:


---
version: '3.4'
services:
  weaviate:
    command:
    - --host
    - 0.0.0.0
    - --port
    - '8080'
    - --scheme
    - http
    image: semitechnologies/weaviate:1.24.2
    ports:
    - 8080:8080
    - 50051:50051
    restart: unless-stopped
    volumes:
      - /home/weaviate/weaviate_data:/var/lib/weaviate
    environment:
      LOG_LEVEL: 'trace'
      ASYNC_INDEXING: 'true'
      LIMIT_RESOURCES: 'true'
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      DEFAULT_VECTORIZER_MODULE: 'none'
      CLUSTER_HOSTNAME: 'node1'
      GOMEMLIMIT: '300000MiB'

Ok, with that set, do you see any more logs on startup?

No, there were no new logs

Hi @msj242,

this error "init segment segment-1705570724834516877.db: mmap file: invalid argument" may be caused by a corrupted .db file. That can happen after a reboot or an unexpected crash (including during compaction) on a version older than v1.24, which improved file integrity. Was that data created in an older version?
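One quick check that may help narrow this down: on Linux, mmap fails with "invalid argument" when asked to map zero bytes, so a zero-length segment file left behind by a crash is one plausible cause. That is an assumption, not something confirmed in this thread, and a non-empty file can be corrupted too. A minimal Python sketch that only lists empty .db files and changes nothing (`find_empty_segments` is a hypothetical helper name):

```python
import os


def find_empty_segments(data_path):
    """Return sorted paths of zero-byte "*.db" files under data_path.

    Read-only: nothing is deleted or modified.
    """
    empty = []
    for root, _dirs, files in os.walk(data_path):
        for name in files:
            if not name.endswith(".db"):
                continue
            full_path = os.path.join(root, name)
            # Assumption: an empty segment cannot be memory-mapped.
            if os.path.getsize(full_path) == 0:
                empty.append(full_path)
    return sorted(empty)
```

With the compose file above, the directory to pass would be the host side of the volume mount, /home/weaviate/weaviate_data.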

Yes, it was created in lower versions - I kept upgrading along the way (probably for a year). There was indeed a crash at some point, and I assume you’re correct.

Is there a way to recover from the crashed state, such as rebuilding the indexes? 99% of the data is good and still has its vectors, etc.

I ask because we have 100 million (384-dimensional) vectors, which took a while to upload, and at this point the database is struggling. It would be useful to drop the failed data and rebuild the database from the good data. Maybe this is not possible, and 1.24 will better prevent this situation, but it would be a useful feature if it is possible. Or perhaps it already exists and you can point me to a solution.

I hit the same startup failure after a crash.
But the files were created with v1.24.6, so it seems the improved integrity of files introduced in v1.24 did not help in my case.

What is the recovery here?
Manually dropping the corrupted .db file and rebuilding it from zero?