Disabling RateLimit for OpenAI and other compatible APIs via vLLM

Hi!

I’m using vLLM and its OpenAI-compatible (and Cohere-compatible, etc.) APIs with the corresponding Weaviate modules.
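
For context, the vectorizer is pointed at vLLM’s OpenAI-compatible endpoint roughly like this (a minimal sketch with the Python client; the collection name, base URL, and model name are placeholders for my actual setup):

```python
import weaviate
from weaviate.classes.config import Configure

client = weaviate.connect_to_local()

# Point the text2vec-openai module at the vLLM server instead of api.openai.com.
client.collections.create(
    "Articles",  # placeholder collection name
    vectorizer_config=Configure.Vectorizer.text2vec_openai(
        base_url="http://vllm:8000/v1",   # placeholder: wherever vLLM serves its OpenAI-compatible API
        model="my-embedding-model",       # placeholder model name
    ),
)

client.close()
```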

I ran into the following message in Weaviate’s logs:

{"build_git_commit":"7cebee0","build_go_version":"go1.24.5","build_image_tag":"v1.32.0","build_wv_version":"1.32.0","headers":{"Content-Type":["application/json"],"Date":["Thu, 17 Jul 2025 15:04:36 GMT"],"Vary":["Accept-Encoding"]},"level":"debug","msg":"rate limit headers are missing or invalid, going to keep using the old values","time":"2025-07-17T15:04:36Z"}

I understand this is related to the RateLimit functionality.
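
As far as I can tell, Weaviate paces its requests using OpenAI’s x-ratelimit-* response headers, and vLLM’s OpenAI-compatible server doesn’t send them, which is presumably why this message appears. A quick way to check (a sketch; the URL and model name are placeholders for my setup):

```python
import requests

# Placeholder: a vLLM server exposing the OpenAI-compatible API.
VLLM_EMBEDDINGS_URL = "http://localhost:8000/v1/embeddings"

resp = requests.post(
    VLLM_EMBEDDINGS_URL,
    json={"model": "my-embedding-model", "input": "hello"},  # placeholder model
    timeout=30,
)
resp.raise_for_status()

# Print any rate-limit headers the server returns. The OpenAI API sends
# x-ratelimit-remaining-requests and similar; the debug log above suggests
# vLLM doesn't, so this loop prints nothing against my instance.
for name, value in resp.headers.items():
    if name.lower().startswith("x-ratelimit"):
        print(name, value)
```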

After further exploring the configuration options of the modules and of Weaviate itself, I found no way to disable it.

Could you please tell me whether it would be possible to add an option to disable this functionality?

Hi @a1ekseev,

Welcome to our community — it’s great to have you with us!

Unfortunately, there’s currently no way to disable rate limit checks.

If you need more control over rate limiting, I’d recommend submitting a feature request here:

Best regards,
Mohamed Shahin
Weaviate Support Engineer
(Ireland, UTC±00:00/+01:00)