weaviate.connect_to_embedded broken in weaviate-client 4.7.1

Hi community!!

Calling client = weaviate.connect_to_embedded() with weaviate-client 4.7.1 (WEAVIATE_VERSION = "1.26.1") fails: the embedded server starts and then shuts itself down a few seconds later.

Context
I’m trying to upgrade the Weaviate server from 1.24.9 to 1.26.1 because, after migrating the Python client from v3 to v4, my queries to the cluster got about 10x slower.

  1. First, with the v3 client on server 1.24.9, I got 70 QPS on a .near_object query with filters.
  2. I migrated to the v4 client, still on 1.24.9, and QPS dropped to 7 for the same queries.
  3. I found out that after removing the filters (which are over text fields), QPS with v4 on 1.24.9 was 140.
  4. I tried upgrading to @latest (1.26.1) because I read that many performance improvements had been made around BM25.
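
For anyone who wants to compare numbers, QPS figures like the ones above can be reproduced with a plain timing loop. This is only a minimal sketch: it assumes the queries run sequentially from a single thread, and run_query is a hypothetical callable that you would point at the real .near_object call (with or without filters). It is not the exact benchmark behind the numbers in this post.

```python
import time
from typing import Callable


def measure_qps(run_query: Callable[[], object], n_queries: int = 100) -> float:
    """Call run_query n_queries times in a row and return queries per second."""
    if n_queries <= 0:
        raise ValueError("n_queries must be positive")
    start = time.perf_counter()
    for _ in range(n_queries):
        run_query()
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse clocks.
    return n_queries / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # Stand-in workload (hypothetical); replace the lambda with the real query.
    qps = measure_qps(lambda: sum(range(1000)), n_queries=1000)
    print(f"{qps:.0f} QPS")
```

Running the same helper once with filters and once without makes it easy to see whether the filter is what costs the throughput.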

But when trying a basic example, it shuts down and even kills the server.

import weaviate

client = weaviate.connect_to_embedded()

Output:

{"action":"startup","default_vectorizer_module":"none","level":"info","msg":"the default vectorizer modules is set to \"none\", as a result all new schema classes without an explicit vectorizer setting, will use this vectorizer","time":"2024-08-22T18:45:39-03:00"}
{"action":"startup","auto_schema_enabled":true,"level":"info","msg":"auto schema enabled setting is set to \"true\"","time":"2024-08-22T18:45:39-03:00"}
{"level":"info","msg":"No resource limits set, weaviate will use all available memory and CPU. To limit resources, set LIMIT_RESOURCES=true","time":"2024-08-22T18:45:39-03:00"}
{"level":"info","msg":"module offload-s3 is enabled","time":"2024-08-22T18:45:39-03:00"}
{"level":"warning","msg":"Multiple vector spaces are present, GraphQL Explore and REST API list objects endpoint module include params has been disabled as a result.","time":"2024-08-22T18:45:39-03:00"}
{"level":"info","msg":"open cluster service","servers":{"Embedded_at_8079":61323},"time":"2024-08-22T18:45:39-03:00"}
{"address":"10.195.97.66:61324","level":"info","msg":"starting cloud rpc server ...","time":"2024-08-22T18:45:39-03:00"}
{"level":"info","msg":"starting raft sub-system ...","time":"2024-08-22T18:45:39-03:00"}
{"address":"10.195.97.66:61323","level":"info","msg":"tcp transport","tcpMaxPool":3,"tcpTimeout":10000000000,"time":"2024-08-22T18:45:39-03:00"}
{"level":"info","msg":"loading local db","time":"2024-08-22T18:45:39-03:00"}
{"level":"info","msg":"local DB successfully loaded","time":"2024-08-22T18:45:39-03:00"}
{"level":"info","msg":"schema manager loaded","n":0,"time":"2024-08-22T18:45:39-03:00"}
{"level":"info","metadata_only_voters":false,"msg":"construct a new raft node","name":"Embedded_at_8079","time":"2024-08-22T18:45:39-03:00"}
{"action":"raft","index":0,"level":"info","msg":"raft initial configuration","servers":"[[]]","time":"2024-08-22T18:45:39-03:00"}
{"last_snapshot_index":0,"last_store_applied_index":0,"last_store_log_applied_index":0,"level":"info","msg":"raft node constructed","raft_applied_index":0,"raft_last_index":0,"time":"2024-08-22T18:45:39-03:00"}
{"action":"raft","follower":{},"leader-address":"","leader-id":"","level":"info","msg":"raft entering follower state","time":"2024-08-22T18:45:39-03:00"}
{"action":"bootstrap","error":"could not join a cluster from [10.195.97.66:61323]","level":"warning","msg":"failed to join cluster, will notify next if voter","servers":["10.195.97.66:61323"],"time":"2024-08-22T18:45:41-03:00","voter":true}
{"level":"warning","msg":"raft no known peers, aborting election","time":"2024-08-22T18:45:41-03:00"}
{"docker_image_tag":"unknown","level":"info","msg":"configured versions","server_version":"1.26.1","time":"2024-08-22T18:45:41-03:00"}
{"action":"grpc_startup","level":"info","msg":"grpc server listening at [::]:50050","time":"2024-08-22T18:45:41-03:00"}
{"action":"restapi_management","docker_image_tag":"unknown","level":"info","msg":"Serving weaviate at http://127.0.0.1:8079","time":"2024-08-22T18:45:41-03:00"}
{"action":"bootstrap","error":"rpc error: code = Unavailable desc = connection error: desc = \"error reading server preface: EOF\"","level":"error","msg":"notify all peers","servers":["10.195.97.66:61323"],"time":"2024-08-22T18:45:42-03:00"}
{"action":"bootstrap","error":"could not join a cluster from [10.195.97.66:61323]","level":"warning","msg":"failed to join cluster, will notify next if voter","servers":["10.195.97.66:61323"],"time":"2024-08-22T18:45:42-03:00","voter":true}
{"action":"bootstrap","error":"rpc error: code = Unavailable desc = connection error: desc = \"transport: failed to write client preface: write tcp 10.195.97.66:61399-\u003e10.195.97.66:61324: write: broken pipe\"","level":"error","msg":"notify all peers","servers":["10.195.97.66:61323"],"time":"2024-08-22T18:45:43-03:00"}
{"action":"bootstrap","error":"could not join a cluster from [10.195.97.66:61323]","level":"warning","msg":"failed to join cluster, will notify next if voter","servers":["10.195.97.66:61323"],"time":"2024-08-22T18:45:44-03:00","voter":true}
{"action":"bootstrap","error":"rpc error: code = Unavailable desc = connection error: desc = \"error reading server preface: EOF\"","level":"error","msg":"notify all peers","servers":["10.195.97.66:61323"],"time":"2024-08-22T18:45:45-03:00"}
{"action":"bootstrap","error":"could not join a cluster from [10.195.97.66:61323]","level":"warning","msg":"failed to join cluster, will notify next if voter","servers":["10.195.97.66:61323"],"time":"2024-08-22T18:45:45-03:00","voter":true}
{"action":"telemetry_push","level":"info","msg":"telemetry started","payload":"\u0026{MachineID:efb63a4a-79df-428f-8e5d-d94bf7146956 Type:INIT Version:1.26.1 NumObjects:0 OS:darwin Arch:arm64 UsedModules:[]}","time":"2024-08-22T18:45:49-03:00"}
{"action":"restapi_management","docker_image_tag":"unknown","level":"info","msg":"Shutting down... ","time":"2024-08-22T18:45:53-03:00"}
{"action":"restapi_management","docker_image_tag":"unknown","level":"info","msg":"Stopped serving weaviate at http://127.0.0.1:8079","time":"2024-08-22T18:45:53-03:00"}
{"action":"telemetry_push","level":"info","msg":"telemetry terminated","payload":"\u0026{MachineID:efb63a4a-79df-428f-8e5d-d94bf7146956 Type:TERMINATE Version:1.26.1 NumObjects:0 OS:darwin Arch:arm64 UsedModules:[]}","time":"2024-08-22T18:45:53-03:00"}
{"level":"info","msg":"closing raft FSM store ...","time":"2024-08-22T18:45:53-03:00"}
{"level":"info","msg":"shutting down raft sub-system ...","time":"2024-08-22T18:45:53-03:00"}
{"level":"info","msg":"closing raft-net ...","time":"2024-08-22T18:45:53-03:00"}
{"level":"info","msg":"closing log store ...","time":"2024-08-22T18:45:53-03:00"}
{"level":"info","msg":"closing data store ...","time":"2024-08-22T18:45:53-03:00"}
{"level":"info","msg":"closing loaded database ...","time":"2024-08-22T18:45:53-03:00"}
{"level":"info","msg":"closing raft-rpc client ...","time":"2024-08-22T18:45:53-03:00"}
{"level":"info","msg":"closing raft-rpc server ...","time":"2024-08-22T18:45:53-03:00"}
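
Most of the lines above are info-level, so the interesting part is easy to miss. Below is a small sketch that keeps only the warning and error records from this kind of JSON-per-line output. The function name iter_problems and the SAMPLE list are made up for illustration; the sample lines themselves are copied from the log above.

```python
import json
from typing import Iterable, Iterator


def iter_problems(lines: Iterable[str], levels=("warning", "error")) -> Iterator[dict]:
    """Yield parsed log records whose "level" is one of the given levels."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log line
        if isinstance(record, dict) and record.get("level") in levels:
            yield record


# Three lines taken from the output above.
SAMPLE = [
    '{"level":"info","msg":"loading local db","time":"2024-08-22T18:45:39-03:00"}',
    '{"level":"warning","msg":"raft no known peers, aborting election","time":"2024-08-22T18:45:41-03:00"}',
    '{"action":"bootstrap","error":"could not join a cluster from [10.195.97.66:61323]","level":"warning","msg":"failed to join cluster, will notify next if voter","servers":["10.195.97.66:61323"],"time":"2024-08-22T18:45:42-03:00","voter":true}',
]

if __name__ == "__main__":
    for rec in iter_problems(SAMPLE):
        print(rec["time"], rec["level"], rec["msg"], rec.get("error", ""))
```

Applied to the full output, this leaves the repeated "failed to join cluster" warnings and the "notify all peers" rpc errors that come just before the shutdown.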

The environment:

$ poetry show weaviate-client
 name         : weaviate-client                 
 version      : 4.7.1                           
 description  : A python native Weaviate client 

dependencies
 - authlib >=1.2.1,<2.0.0
 - grpcio >=1.57.0,<2.0.0
 - grpcio-health-checking >=1.57.0,<2.0.0
 - grpcio-tools >=1.57.0,<2.0.0
 - httpx >=0.25.0,<=0.27.0
 - pydantic >=2.5.0,<3.0.0
 - requests >=2.30.0,<3.0.0
 - validators 0.33.0

Am I missing something? Should I downgrade?

Update: checking the releases, I saw that 1.24.9 had a big performance regression; moving to 1.24.10 did the trick.

Now, though, I’m left wondering: why doesn’t 1.26.1 work, and could it (or any other Weaviate server version above 1.24.10) yield further performance improvements?