[Question] How do Weaviate and NVIDIA turbocharge vector search with GPU acceleration?

This is for self-hosted users.
I configured `ENABLE_MODULES=text2vec-transformers,generative-openai,generative-nvidia,multi2vec-nvidia,reranker-nvidia,text2vec-nvidia`.
However, queries still do not use the GPU.
If you know anything about this, could you briefly answer my question? My ultimate goal is to use GPU acceleration for hybrid search.

hi @lei !!

Welcome to our community :hugs:

Weaviate's NVIDIA modules consume NVIDIA's hosted services using the API key you can get at https://build.nvidia.com/ — they do not run models on your local GPU.

If you want to host both Weaviate and your model service locally while leveraging your own GPU, you should look into running a service like Ollama side by side with Weaviate using Docker.

Here you can find more information about it: Docker - Ollama
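A minimal docker-compose sketch of that setup might look like the following. The image tags, ports, module list, and GPU reservation block are assumptions to illustrate the shape — check the linked Docker and Ollama docs for the exact values for your environment:

```yaml
# Sketch: Weaviate (DB) and Ollama (model service) side by side.
# Versions, ports, and module names are examples, not prescriptions.
services:
  weaviate:
    image: semitechnologies/weaviate:latest
    ports:
      - "8080:8080"     # REST
      - "50051:50051"   # gRPC
    environment:
      ENABLE_MODULES: "text2vec-ollama,generative-ollama"
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: "true"

  ollama:
    image: ollama/ollama:latest
    ports:
      - "11434:11434"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia     # requires the NVIDIA Container Toolkit
              count: all
              capabilities: [gpu]
```

With this layout, Weaviate reaches the model service at `http://ollama:11434` on the compose network, and Ollama — not Weaviate — is the container that uses the GPU.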

Note: You can also install Ollama directly on the host. If you go that route, you will need to update the base URL when creating the collection.

Once you have both Weaviate (the database) and Ollama (the model service) up and running, you can refer to the Ollama module docs: Ollama + Weaviate | Weaviate Documentation to create your collection accordingly.
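As a sketch of what "update the base URL while creating the collection" means: the collection definition sent to Weaviate's schema endpoint carries the Ollama endpoint in its module config. The class name, `apiEndpoint` value, and `nomic-embed-text` model here are placeholder assumptions; `host.docker.internal:11434` would be the address when Ollama runs on the host and Weaviate runs in Docker:

```python
import json

# Sketch of a collection (class) definition for Weaviate's REST schema
# endpoint (POST /v1/schema), wiring the text2vec-ollama module to a
# locally hosted Ollama instance. Names and endpoint are examples.
collection = {
    "class": "Articles",
    "vectorizer": "text2vec-ollama",
    "moduleConfig": {
        "text2vec-ollama": {
            # Point this at wherever Ollama is reachable from Weaviate:
            # e.g. "http://ollama:11434" inside docker-compose, or
            # "http://host.docker.internal:11434" for a host install.
            "apiEndpoint": "http://host.docker.internal:11434",
            "model": "nomic-embed-text",
        }
    },
}

print(json.dumps(collection, indent=2))
```

The client libraries expose the same knob (e.g. an API-endpoint parameter on the Ollama vectorizer config), so the key point is just that the endpoint must match where your Ollama service actually listens.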

Let me know if this helps!

Happy coding!