Description
This is a follow-up to this post.
I have a collection of 50M objects with 'vectorIndexType': 'flat', which I need to query to fetch the ground-truth nearest neighbors, so that I can benchmark the recall of an otherwise identical 50M collection that uses the 'hnsw' index.
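For context, the recall I want to measure is simply the overlap between the hnsw top-k and the flat (ground-truth) top-k. A minimal sketch of that computation (the function and variable names here are mine, purely illustrative):

def recall_at_k(ground_truth_ids, ann_ids, k=100):
    """Fraction of the ground-truth top-k that the ANN result recovered."""
    return len(set(ground_truth_ids[:k]) & set(ann_ids[:k])) / k

# Toy example: the hnsw result misses one of the flat top-5 -> recall 0.8
print(recall_at_k(["a", "b", "c", "d", "e"],
                  ["a", "b", "x", "d", "e"], k=5))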
Since the 'flat' index is known to be difficult to scale out, I followed the suggestions from @etiennedi in that post and made the following change:
- add the args: section in the project values.yaml (everything else remains the same):

weaviate:
  args:
    - "--host"
    - "0.0.0.0"
    - "--port"
    - "8080"
    - "--scheme"
    - "http"
    - "--config-file"
    - "/weaviate-config/conf.yaml"
    - --read-timeout=600s
    - --write-timeout=60s
  resources:
    …
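For reference, the same change can also be rolled out by hand with a Helm upgrade (the release name, chart, and namespace below are assumptions on my side; adjust them to your deployment):

helm upgrade weaviate weaviate/weaviate -n weaviate -f values.yaml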
- after the Helm chart change was checked in, I checked the updated pod state with:

kubectl describe pod weaviate-0 -n weaviate

and can see the args list is properly overwritten:
Containers:
  weaviate:
    Container ID:   …
    Image:          …
    Image ID:       …
    Port:           8080/TCP
    Host Port:      0/TCP
    Command:
      /bin/weaviate
    Args:
      --host
      0.0.0.0
      --port
      8080
      --scheme
      http
      --config-file
      /weaviate-config/conf.yaml
      --read-timeout=600s
      --write-timeout=60s
    State:          Running
      Started:      Mon, 04 Mar 2024 17:47:46 -0800
    Ready:          True
    …
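As a quicker check, the effective args can also be pulled straight out of the pod spec (assuming weaviate is the first container in the pod):

kubectl get pod weaviate-0 -n weaviate -o jsonpath='{.spec.containers[0].args}'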
- I also configured the client-side read timeout to 600s:
weaviate_client = weaviate.Client(
    url=cfg.weaviate_client_cfg.weaviate_server_url,
    auth_client_secret=auth_client_secret,
    # (connect timeout, read timeout) in seconds, per the v3 client API
    timeout_config=(30, 600),
)
- However, when I run the following query:
start = time.perf_counter()
response = (
    weaviate_client.query
    .get("PilotImage_flat", ["image_id"])
    .with_near_vector({
        "vector": v  # the query embedding
    })
    .with_limit(100)  # top-100 ground-truth neighbors
    .with_additional(["distance"])
    .do()
)
end = time.perf_counter()
print(f"took {end - start} seconds")
interestingly, the latency sits right on the boundary of 60 seconds (the default read-timeout value on both the client side and the server side): if the latency is < 60s, I get a proper response, but when the latency is > 60s, it throws the following exception:
UnexpectedStatusCodeException: Query was not successful! Unexpected status code: 502, with response body: None.
So the query result above is nondeterministic: it depends on whether the latency crosses the 60s read-timeout limit.
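To make that boundary behavior visible, here is a minimal sketch of how I time the same query over a batch of vectors while recording failures (query_vectors is an assumed list of query embeddings; the exception class is the one from the stack trace above):

import time
from weaviate.exceptions import UnexpectedStatusCodeException

# query_vectors: assumed list of embeddings to probe the 60s boundary with
for i, v in enumerate(query_vectors):
    start = time.perf_counter()
    try:
        (
            weaviate_client.query
            .get("PilotImage_flat", ["image_id"])
            .with_near_vector({"vector": v})
            .with_limit(100)
            .do()
        )
        outcome = "ok"
    except UnexpectedStatusCodeException as e:
        outcome = f"failed: {e}"
    elapsed = time.perf_counter() - start
    print(f"query {i}: {elapsed:.1f}s -> {outcome}")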
Once the above exception is thrown, on the server side I find the following error message and hint:
{
  "buildVersion": "1.23.7",
  "context": {
    "kubernetes": {
      "container_image_id": "…",
      "container_name": "weaviate",
      "pod_name": "weaviate-0",
      "pod_namespace": "weaviate"
    },
    "log_group": "weaviate"
  },
  "description": "An I/O timeout occurs when the request takes longer than the specified server-side timeout.",
  "error": "write tcp 10.89.39.134:8080->10.89.7.30:46968: i/o timeout",
  "hint": "Either try increasing the server-side timeout using e.g. '--write-timeout=600s' as a command line flag when starting Weaviate, or try sending a computationally cheaper request, for example by reducing a batch size, reducing a limit, using less complex filters, etc. Note that this error is only thrown if client-side and server-side timeouts are not in sync, more precisely if the client-side timeout is longer than the server side timeout.",
  "host": "ip-10-89-36-81.ec2.internal",
  "level": "ERROR",
  "message": "i/o timeout",
  "method": "POST",
  "path": {
    "ForceQuery": false,
    "Fragment": "",
    "Host": "",
    "OmitHost": false,
    "Opaque": "",
    "Path": "/v1/graphql",
    "RawFragment": "",
    "RawPath": "",
    "RawQuery": "",
    "Scheme": "",
    "User": null
  },
  "service": "weaviate-0",
  "source_type": "kubernetes_logs",
  "time": "2024-03-05T08:11:15Z"
}
In particular, the hint says:

Either try increasing the server-side timeout using e.g. '--write-timeout=600s' as a command line flag when starting Weaviate, or try sending a computationally cheaper request, for example by reducing a batch size, reducing a limit, using less complex filters, etc. Note that this error is only thrown if client-side and server-side timeouts are not in sync, more precisely if the client-side timeout is longer than the server side timeout.
Since I have configured the read-timeout to 600s on both the client side and the server side, why do I still see this error and hint?
Does anyone have any insight into what could be going wrong and how to troubleshoot it?
Server Setup Information
- Weaviate Server Version: 1.23.7
- Deployment Method: K8S
- Multi Node? Number of Running Nodes: 1
- Client Language and Version: Python, weaviate-client 3.21.0