Disabling RateLimit for OpenAI and other compatible APIs via vLLM

Hi!

I’m using vLLM and its OpenAI-compatible (and Cohere-compatible, etc.) APIs with the corresponding Weaviate modules.
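
For context, the vectorizer is pointed at vLLM’s OpenAI-compatible endpoint roughly like this (a minimal sketch with the Python client; the collection name, base URL, and model name are placeholders for my actual setup):

```python
import weaviate
from weaviate.classes.config import Configure

client = weaviate.connect_to_local()

# Point the text2vec-openai module at the vLLM server instead of api.openai.com.
client.collections.create(
    "Articles",  # placeholder collection name
    vectorizer_config=Configure.Vectorizer.text2vec_openai(
        base_url="http://vllm:8000/v1",   # placeholder: wherever vLLM serves its OpenAI-compatible API
        model="my-embedding-model",       # placeholder model name
    ),
)

client.close()
```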

I ran into the following message in Weaviate’s logs:

{"build_git_commit":"7cebee0","build_go_version":"go1.24.5","build_image_tag":"v1.32.0","build_wv_version":"1.32.0","headers":{"Content-Type":["application/json"],"Date":["Thu, 17 Jul 2025 15:04:36 GMT"],"Vary":["Accept-Encoding"]},"level":"debug","msg":"rate limit headers are missing or invalid, going to keep using the old values","time":"2025-07-17T15:04:36Z"}

I understand this is related to the RateLimit functionality.
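
As far as I can tell, Weaviate paces its requests using OpenAI’s x-ratelimit-* response headers, and vLLM’s OpenAI-compatible server doesn’t send them, which is presumably why this message appears. A quick way to check (a sketch; the URL and model name are placeholders for my setup):

```python
import requests

# Placeholder: a vLLM server exposing the OpenAI-compatible API.
VLLM_EMBEDDINGS_URL = "http://localhost:8000/v1/embeddings"

resp = requests.post(
    VLLM_EMBEDDINGS_URL,
    json={"model": "my-embedding-model", "input": "hello"},  # placeholder model
    timeout=30,
)
resp.raise_for_status()

# Print any rate-limit headers the server returns. The OpenAI API sends
# x-ratelimit-remaining-requests and similar; the debug log above suggests
# vLLM doesn't, so this loop prints nothing against my instance.
for name, value in resp.headers.items():
    if name.lower().startswith("x-ratelimit"):
        print(name, value)
```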

After further exploring the configuration options of the modules and of Weaviate itself, I found no way to disable it.

Could you please tell me whether it would be possible to add an option to disable this functionality?

Hi @a1ekseev,

Welcome to our community — it’s great to have you with us!

Unfortunately, there’s currently no way to disable rate limit checks.

If you need more control over rate limiting, I’d recommend submitting a feature request here:

Best regards,
Mohamed Shahin
Weaviate Support Engineer
(Ireland, UTC±00:00/+01:00)