I’m experiencing a context deadline exceeded error on generative queries to an Ollama module. The query consistently fails after ~51 seconds, despite timeouts being set much higher.
Environment:
Weaviate Server: v1.31.0 (in Docker)
Modules: text2vec-ollama, generative-ollama
Model: phi4 (on a CPU-only host)
Problem Summary: The error occurs when Weaviate’s generative-ollama module calls the Ollama API: ...send POST request: Post ".../api/generate": context deadline exceeded...
I have already configured the following timeouts:
- Server-Side: Set `EXTENSIONS_CLIENT_TIMEOUT: '1000s'` in Weaviate’s Docker environment (confirmed active with `docker inspect`).
- Client-Side: Set the gRPC query timeout to `600s` in the Python client (`additional_config`).
Crucially, a direct curl request to the Ollama endpoint succeeds, but takes over 5 minutes to complete.
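For anyone who wants to reproduce the direct-request timing without curl, here is a minimal sketch using only the Python standard library. The base URL `http://localhost:11434` (Ollama's default port), the prompt, and the helper names `build_generate_request` and `timed_generate` are assumptions for illustration, not values from my actual setup. It sends a non-streaming request, so the whole answer arrives in one response.

```python
import json
import time
import urllib.request


def build_generate_request(base_url, model, prompt):
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def timed_generate(base_url, model, prompt, timeout_s=1000.0):
    """Send the request and return (elapsed_seconds, response_text).

    timeout_s is passed to urlopen as a socket timeout, so it bounds each
    blocking network operation rather than the total wall-clock time.
    """
    request = build_generate_request(base_url, model, prompt)
    start = time.monotonic()
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        body = json.loads(response.read().decode("utf-8"))
    return time.monotonic() - start, body.get("response", "")


if __name__ == "__main__":
    # Assumed values: adjust the host, port, and prompt to the real setup.
    elapsed, text = timed_generate("http://localhost:11434", "phi4", "Say hello.")
    print(f"Ollama answered in {elapsed:.1f}s ({len(text)} characters)")
```

If the elapsed time printed here is well above the ~51 seconds at which the Weaviate query fails, that supports the idea that the cutoff comes from somewhere between Weaviate and Ollama rather than from Ollama itself.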
Core Question: Since the query fails at ~51s, it seems another, lower timeout is taking precedence over both my client and server configurations. Is there a different, non-obvious timeout setting specific to the generative search module that I need to be aware of?
Thanks for your help.