I’m experiencing a context deadline exceeded
error on generative queries to an Ollama module. The query consistently fails after ~51 seconds, despite timeouts being set much higher.
Environment:
Weaviate Server: v1.31.0 (in Docker)
Modules: text2vec-ollama
, generative-ollama
Model: phi4
(on a CPU-only host)
Problem Summary: The error occurs when Weaviate’s generative-ollama
module calls the Ollama API: ...send POST request: Post ".../api/generate": context deadline exceeded...
I have already configured the following timeouts:
- Server-Side: Set
EXTENSIONS_CLIENT_TIMEOUT: '1000s'
in Weaviate’s Docker environment (confirmed active withdocker inspect
). - Client-Side: Set the gRPC query timeout to
600s
in the Python client (additional_config
).
Crucially, a direct curl
request to the Ollama endpoint succeeds, but takes over 5 minutes to complete.
Core Question: Since the query fails at ~51s, it seems another, lower timeout is taking precedence over both my client and server configurations. Is there a different, non-obvious timeout setting specific to the generative search module that I need to be aware of?
Thanks for your help.