Timeout values in Weaviate config

Hi Team,

We are using the Python v4 client with Weaviate 1.25.3 deployed on a Kubernetes cluster.

The following values are set in values.yaml:

- --read-timeout=600s
- --write-timeout=600s

Still, my generate queries fail with timeout errors after 30 seconds:

gen_results = tnt_coll.generate.near_text(
    query="What is the payload for create ?",
    grouped_task="Answer the question: What is the payload for create ?"
)

WeaviateQueryError: Query call with protocol GRPC search failed with message Deadline Exceeded

I tried setting the timeout values in the client connection config as well.

import weaviate
from weaviate.connect import ConnectionParams
from weaviate.classes.init import AdditionalConfig, Timeout

client = weaviate.WeaviateClient(
    connection_params=ConnectionParams.from_params(
        http_host="host1",
        http_port=2730,
        http_secure=True,
        grpc_host="host1",
        grpc_port=2731,
        grpc_secure=True
    ),
    additional_config=AdditionalConfig(
        timeout=Timeout(query=10)  # Values in seconds
    ),
    auth_client_secret=weaviate.auth.AuthApiKey("**")
)

The issue still exists.

I would also like to understand more about memory consumption. Are there any metrics I can enable in Prometheus / Grafana to monitor memory usage?

Regards,
Adithya

@DudaNogueira Can you please help with this issue?

Regards,
Adithya

hi @adithya.ch !!

Do you see any outstanding logs on the Weaviate side?

Memory should be monitored directly from the host, AFAIK.
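
On the Prometheus / Grafana question: as far as I know, Weaviate can also expose Prometheus metrics when the PROMETHEUS_MONITORING_ENABLED environment variable is set to true, served on port 2112 at /metrics by default. A minimal sketch for a quick check, assuming that variable is set on your pods and the metrics port is reachable (e.g. via a port-forward); the URL is a placeholder:

import requests

# Assumes PROMETHEUS_MONITORING_ENABLED=true on the Weaviate pods and that
# the default metrics port 2112 is reachable from where this runs.
resp = requests.get("http://localhost:2112/metrics", timeout=10)
resp.raise_for_status()

# Print the Go runtime memory metrics, which reflect heap usage on that node.
for line in resp.text.splitlines():
    if line.startswith("go_memstats"):
        print(line)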

Above is the error I see after crossing 30 seconds. It always times out at 30 seconds. If the query completes within 30 seconds, it works fine.

The logs show the details below, no errors as such:

time="2024-07-11T20:32:03Z" level=debug msg=" memberlist: Initiating push/pull sync with: weaviate-2"
time="2024-07-11T20:32:12Z" level=debug msg="observed node wide metrics" action=observe_node_wide_metrics object_count=85 took="75.166µs"
time="2024-07-11T20:32:12Z" level=debug msg="observed tenant activity stats" action=observe_tenantactivity took="8.209µs"
time="2024-07-11T20:32:22Z" level=debug msg="observed tenant activity stats" action=observe_tenantactivity took="11.514µs"
time="2024-07-11T20:32:26Z" level=debug msg=" memberlist: Stream connection"
time="2024-07-11T20:32:32Z" level=debug msg="observed tenant activity stats" action=observe_tenantactivity took="20.779µs"
time="2024-07-11T20:32:33Z" level=debug msg=" memberlist: Initiating push/pull sync with: weaviate-2"
time="2024-07-11T20:32:42Z" level=debug msg="observed node wide metrics" action=observe_node_wide_metrics object_count=85 took="71.498µs"
time="2024-07-11T20:32:42Z" level=debug msg="observed tenant activity stats" action=observe_tenantactivity took="7.704µs"
time="2024-07-11T20:32:52Z" level=debug msg="observed tenant activity stats" action=observe_tenantactivity took="20.091µs"
time="2024-07-11T20:33:02Z" level=debug msg="observed tenant activity stats" action=observe_tenantactivity took="20.64µs"
time="2024-07-11T20:33:03Z" level=debug msg=" memberlist: Initiating push/pull sync with: weaviate-2"
time="2024-07-11T20:33:12Z" level=debug msg="observed node wide metrics" action=observe_node_wide_metrics object_count=85 took="79.493µs"
time="2024-07-11T20:33:12Z" level=debug msg="observed tenant activity stats" action=observe_tenantactivity took="8.692µs"
time="2024-07-11T20:33:16Z" level=debug msg=" memberlist: Stream connection"
time="2024-07-11T20:33:22Z" level=debug msg="observed tenant activity stats" action=observe_tenantactivity took="38.098µs"
time="2024-07-11T20:33:32Z" level=debug msg="observed tenant activity stats" action=observe_tenantactivity took="23.066µs"
time="2024-07-11T20:33:33Z" level=debug msg=" memberlist: Initiating push/pull sync with: weaviate-2"
time="2024-07-11T20:33:42Z" level=debug msg="observed node wide metrics" action=observe_node_wide_metrics object_count=85 took="101.806µs"
time="2024-07-11T20:33:42Z" level=debug msg="observed tenant activity stats" action=observe_tenantactivity took="7.886µs"
time="2024-07-11T20:33:52Z" level=debug msg="observed tenant activity stats" action=observe_tenantactivity took="22.663µs

Hi @DudaNogueira

Any suggestions to fix the issue? It's a blocker for our applications running on Weaviate deployments!

Regards,
Adithya

Ok, this can mitigate it, but a query shouldn't take that long.

There are more timeout configs:

import weaviate
from weaviate.classes.init import AdditionalConfig, Timeout

client = weaviate.connect_to_local(
    port=8080,
    grpc_port=50051,
    additional_config=AdditionalConfig(
        timeout=Timeout(init=30, query=60, insert=120)  # Values in seconds
    )
)
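
For a remote / Kubernetes deployment like yours, the same timeouts should be accepted by connect_to_custom as well; a minimal sketch, with placeholder hosts, ports and API key:

import weaviate
from weaviate.classes.init import AdditionalConfig, Timeout

# Placeholder hosts, ports and key - substitute your own values.
client = weaviate.connect_to_custom(
    http_host="host1",
    http_port=2730,
    http_secure=True,
    grpc_host="host1",
    grpc_port=2731,
    grpc_secure=True,
    auth_credentials=weaviate.auth.AuthApiKey("**"),
    additional_config=AdditionalConfig(
        timeout=Timeout(init=30, query=600, insert=120)  # Values in seconds
    ),
)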

But it would be interesting to understand why it is taking that long :thinking:

Hi @DudaNogueira

I tried setting the query timeout to 600, but the query was still timing out at 30 seconds.

I tried uninstalling the Weaviate client and installing it again, and it works fine now! It looks like a bug with the Weaviate client / Python version that was installed earlier!

Regards,
Adithya

hi @adithya.ch !

What version are you running?
Do you still see this error on the latest version?
How many objects do you have indexed?
Is it multi-tenant?
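
A quick sketch for pulling those numbers with the Python v4 client, assuming a local connection and a placeholder collection name (for a multi-tenant collection you would select a tenant first with .with_tenant()):

from importlib.metadata import version

import weaviate

# Placeholder connection - reuse whatever client setup you already have.
with weaviate.connect_to_local() as client:
    print("Client version:", version("weaviate-client"))
    print("Server version:", client.get_meta().get("version"))

    # Object count for one collection (collection name is a placeholder).
    coll = client.collections.get("MyCollection")
    print("Objects indexed:", coll.aggregate.over_all(total_count=True).total_count)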

It’s good to get more readings from your cluster. Check this doc on observability:

Let me know if this helps.

Hi @DudaNogueira ,

I would like to also report that we are experiencing timeout issues with the JS/TS v3 client. Specifically, neither .connectToLocal nor .connectToCustom seems to respect the input timeout values.

We're running a somewhat more complex nearText task, which on rare occasions runs longer than 30 seconds.

We've tried setting the timeout values very low in order to trigger an earlier exit from the request, but the client remains connected on queries up until the 30-second mark.

const timeoutParams: TimeoutParams = {
    query: 60,
    init: 30
};
const connectionProps: ConnectToCustomOptions = {
    httpHost: process.env.WEAVIATE_HOST,  // URL only, no http prefix
    httpPort: 8080,
    grpcHost: process.env.WEAVIATE_HOST,
    grpcPort: 50051,
    timeout: timeoutParams // Values in seconds
};
const client: WeaviateClient = await weaviate.connectToCustom(connectionProps);

It's the same story for both connectToLocal and connectToCustom. The message is always "Query call with protocol gRPC failed with message : The operation has been aborted". There are no additional logs in debug mode on the Weaviate side, other than at request reception:

2024-07-17 15:55:20 {"action":"restapi_request","level":"debug","method":"GET","msg":"received HTTP request","time":"2024-07-17T13:55:20Z","url":{"Scheme":"","Opaque":"","User":null,"Host":"","Path":"/v1/meta","RawPath":"","OmitHost":false,"ForceQuery":false,"RawQuery":"","Fragment":"","RawFragment":""}}

The Weaviate instance is running locally with the weaviate:1.25.4 image.
The Weaviate client version is "weaviate-client": "^3.0.9".

Otherwise, Weaviate is working beautifully, thank you :ok_hand:

Regards,
Andreas

Hi @DudaNogueira ,
I was wondering if you have had time to look into our issue. Would you rather we create a new issue?

hi @andreasDroid !!

For those parameters, it will try to init (check gRPC health, for example) for up to 30 seconds, and will allow a query to run for up to 60 seconds.

connectToLocal, connectToCustom, or any of the connect helpers are basically "the same". Under the hood they fill in the corresponding information for you.

But before that, 30 seconds is a lot for a query :thinking:

I have asked internally about how to actually test the timeout. The code you posted should indeed only raise the query timeout at 60s.

Thank you.
Yeah. We've been trying to trigger an earlier exit by setting the query timeout to 1 second, hoping to see an effect. However, that does not change the query behaviour, since it still runs up until the default 30 seconds. So we're wondering how we can change the timeout parameter.

I agree, 30 seconds is indeed a lot for the query. We're using a big prompt with the OpenAI module, and at times we have to accept the long duration since streaming responses are not available in Weaviate.

Alternatively, we could fetch the sources with Weaviate and use the Azure/OpenAI client for the completion.
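
A rough sketch of that fetch-then-complete pattern, shown with the Python clients for brevity (the openai package, an OPENAI_API_KEY environment variable, and the collection, property and model names are assumptions):

import weaviate
from openai import OpenAI

# Assumed names: a "Documents" collection with a "text" property.
with weaviate.connect_to_local() as client:
    coll = client.collections.get("Documents")
    sources = coll.query.near_text(query="What is the payload for create ?", limit=5)
    context = "\n\n".join(str(o.properties.get("text", "")) for o in sources.objects)

# Run the (potentially slow) completion outside Weaviate, with streaming.
openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment
stream = openai_client.chat.completions.create(
    model="gpt-4o",
    stream=True,
    messages=[{"role": "user", "content": f"Answer using these sources:\n{context}"}],
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")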