Description
This is a follow-up to this post.
I have a collection of 50M objects with 'vectorIndexType': 'flat', which I need to query to fetch the ground-truth nearest neighbors, so that I can benchmark the recall of an otherwise identical 50M collection that uses the 'hnsw' index.
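For context, the recall I want to measure is simply the overlap between the hnsw top-k and the flat (ground-truth) top-k. A minimal sketch of that computation (the function and variable names here are mine, purely illustrative):

def recall_at_k(ground_truth_ids, ann_ids, k=100):
    """Fraction of the ground-truth top-k that the ANN result recovered."""
    return len(set(ground_truth_ids[:k]) & set(ann_ids[:k])) / k

# Toy example: the hnsw result misses one of the flat top-5 -> recall 0.8
print(recall_at_k(["a", "b", "c", "d", "e"],
                  ["a", "b", "x", "d", "e"], k=5))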
Since the 'flat' index is known to be difficult to scale out, I followed the suggestions from @etiennedi in that post and made the following change:
- add the args: section in the project values.yaml (everything else remains the same):

weaviate:
  args:
    - "--host"
    - "0.0.0.0"
    - "--port"
    - "8080"
    - "--scheme"
    - "http"
    - "--config-file"
    - "/weaviate-config/conf.yaml"
    - --read-timeout=600s
    - --write-timeout=60s
  resources:
    …
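For reference, the same change can also be rolled out by hand with a Helm upgrade (the release name, chart, and namespace below are assumptions on my side; adjust them to your deployment):

helm upgrade weaviate weaviate/weaviate -n weaviate -f values.yaml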
- after the Helm chart change was checked in, I checked the updated pod state with:

kubectl describe pod weaviate-0 -n weaviate

and can see the args list is properly overwritten:
Containers:
  weaviate:
    Container ID:   …
    Image:          …
    Image ID:       …
    Port:           8080/TCP
    Host Port:      0/TCP
    Command:
      /bin/weaviate
    Args:
      --host
      0.0.0.0
      --port
      8080
      --scheme
      http
      --config-file
      /weaviate-config/conf.yaml
      --read-timeout=600s
      --write-timeout=60s
    State:          Running
      Started:      Mon, 04 Mar 2024 17:47:46 -0800
    Ready:          True
    …
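As a quicker check, the effective args can also be pulled straight out of the pod spec (assuming weaviate is the first container in the pod):

kubectl get pod weaviate-0 -n weaviate -o jsonpath='{.spec.containers[0].args}'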
- I also configured the client-side read timeout to 600s:
weaviate_client = weaviate.Client(
    url=cfg.weaviate_client_cfg.weaviate_server_url,
    auth_client_secret=auth_client_secret,
    # (connect timeout, read timeout) in seconds, per the v3 client API
    timeout_config=(30, 600),
)
- However, when I run the following query:
start = time.perf_counter()
response = (
    weaviate_client.query
    .get("PilotImage_flat", ["image_id"])
    .with_near_vector({
        "vector": v  # the query embedding
    })
    .with_limit(100)  # top-100 ground-truth neighbors
    .with_additional(["distance"])
    .do()
)
end = time.perf_counter()
print(f"took {end - start} seconds")
interestingly, the latency sits right on the boundary of 60 seconds (the default read-timeout value on both the client side and the server side): if the latency is < 60s, I get a proper response, but when the latency is > 60s, it throws the following exception:
UnexpectedStatusCodeException: Query was not successful! Unexpected status code: 502, with response body: None.
So the query result above is nondeterministic: it depends on whether the latency crosses the 60s read-timeout limit.
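To make that boundary behavior visible, here is a minimal sketch of how I time the same query over a batch of vectors while recording failures (query_vectors is an assumed list of query embeddings; the exception class is the one from the stack trace above):

import time
from weaviate.exceptions import UnexpectedStatusCodeException

# query_vectors: assumed list of embeddings to probe the 60s boundary with
for i, v in enumerate(query_vectors):
    start = time.perf_counter()
    try:
        (
            weaviate_client.query
            .get("PilotImage_flat", ["image_id"])
            .with_near_vector({"vector": v})
            .with_limit(100)
            .do()
        )
        outcome = "ok"
    except UnexpectedStatusCodeException as e:
        outcome = f"failed: {e}"
    elapsed = time.perf_counter() - start
    print(f"query {i}: {elapsed:.1f}s -> {outcome}")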
Once the above exception is thrown, on the server side I find the following error message and hint:
{
  "buildVersion": "1.23.7",
  "context": {
    "kubernetes": {
      "container_image_id": "…",
      "container_name": "weaviate",
      "pod_name": "weaviate-0",
      "pod_namespace": "weaviate"
    },
    "log_group": "weaviate"
  },
  "description": "An I/O timeout occurs when the request takes longer than the specified server-side timeout.",
  "error": "write tcp 10.89.39.134:8080->10.89.7.30:46968: i/o timeout",
  "hint": "Either try increasing the server-side timeout using e.g. '--write-timeout=600s' as a command line flag when starting Weaviate, or try sending a computationally cheaper request, for example by reducing a batch size, reducing a limit, using less complex filters, etc. Note that this error is only thrown if client-side and server-side timeouts are not in sync, more precisely if the client-side timeout is longer than the server side timeout.",
  "host": "ip-10-89-36-81.ec2.internal",
  "level": "ERROR",
  "message": "i/o timeout",
  "method": "POST",
  "path": {
    "ForceQuery": false,
    "Fragment": "",
    "Host": "",
    "OmitHost": false,
    "Opaque": "",
    "Path": "/v1/graphql",
    "RawFragment": "",
    "RawPath": "",
    "RawQuery": "",
    "Scheme": "",
    "User": null
  },
  "service": "weaviate-0",
  "source_type": "kubernetes_logs",
  "time": "2024-03-05T08:11:15Z"
}
In particular, the hint says:

Either try increasing the server-side timeout using e.g. '--write-timeout=600s' as a command line flag when starting Weaviate, or try sending a computationally cheaper request, for example by reducing a batch size, reducing a limit, using less complex filters, etc. Note that this error is only thrown if client-side and server-side timeouts are not in sync, more precisely if the client-side timeout is longer than the server side timeout.
Since I have configured the read-timeout to 600s on both the client side and the server side, why do I still see this error and hint?
Does anyone have any insight into what could be going wrong and how to troubleshoot it?
Server Setup Information
- Weaviate Server Version: 1.23.7
- Deployment Method: K8S
- Multi Node? Number of Running Nodes: 1
- Client Language and Version: Python, weaviate-client 3.21.0