P_K
December 29, 2024, 10:26am
1
Description
I have set up Ollama on a GCP server and exposed it as a URL. It is accessible through the Ollama client with an AUTH token. Here is sample code:
from ollama import Client

AUTH_TOKEN = "..."  # placeholder: auth token for the hosted Ollama endpoint

try:
    MODEL_NAME = "nomic-embed-text:latest"
    client = Client(
        host='https://ollama-inference-url',
        headers={'Authorization': AUTH_TOKEN}
    )
    text = "Hello, this is a test sentence."
    response = client.embeddings(
        model=MODEL_NAME,
        prompt=text
    )
    # Extract the embedding vector from the response
    embedding = response['embedding']
    print(embedding)
    print(f"Single embedding shape: {len(embedding)}")
except Exception as e:
    print(f"Error generating embedding: {e}")
I want to integrate these Ollama embeddings with Weaviate. Can you provide some sample code or a reference for this?
Server Setup Information
Weaviate Server Version: 1.28.0
Deployment Method: Docker
Multi Node? Number of Running Nodes: 1
Client Language and Version: Python 3.8
Multitenancy?: No
Hi @P_K!!
Welcome back!
Check out this recipe:
Local RAG with Ollama and Weaviate
Using Weaviate integration

This example shows how to use the text2vec-ollama as well as the generative-ollama modules.

Setup
1. Download and install Ollama for your operating system: https://ollama.com/download
2. `pip` install the Python library to generate vector embeddings from the model with `pip install ollama`. (REST API or JavaScript library also available)

(This file has been truncated; see the original recipe for the full notebook.)
It uses Ollama to build a RAG pipeline locally.
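For a rough idea of what that recipe does, here is a minimal sketch using the Python v4 client. This assumes the text2vec-ollama and generative-ollama modules are enabled in your Docker setup; the collection name "Documents" and the model names are placeholders:

import weaviate
from weaviate.classes.config import Configure

# connect to the local single-node Weaviate instance from the Docker setup
client = weaviate.connect_to_local()

# create a collection that delegates vectorization and generation to Ollama
client.collections.create(
    "Documents",  # placeholder collection name
    vectorizer_config=Configure.Vectorizer.text2vec_ollama(
        api_endpoint="http://host.docker.internal:11434",  # Ollama as seen from the Weaviate container
        model="nomic-embed-text",
    ),
    generative_config=Configure.Generative.ollama(
        api_endpoint="http://host.docker.internal:11434",
        model="llama3",  # placeholder generative model
    ),
)

client.close()

Objects inserted into that collection are vectorized by Weaviate itself, which calls the Ollama endpoint on your behalf.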
Let me know if that helps!
P_K
December 29, 2024, 4:42pm
3
Thanks, DudaNogueira.
I have tried this, but it works with local Ollama embeddings. With the Ollama embedding model hosted on a server, I need to pass in the URL as well as the Auth Token for generating embeddings. I am unable to find a way to pass the Auth Token to the “text2vec-ollama” vectorizer.
Here is a sample curl command to access the embedding model from my local machine:
curl https://ollama-inference-url/api/embeddings -H "Authorization: AUTH-TOKEN" -H "Content-Type: application/json" -d '{"model": "nomic-embed-text:latest", "prompt": "Why is sky blue?"}'
Oh, I see. I believe the ollama module was built with the assumption that Ollama would not require an API token.
I have quickly checked that module's code and have not found a way to provide an API token. I believe this is an interesting feature request.
I will check on this over the week and make sure we open a feature request if necessary.
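In the meantime, one possible workaround is to skip the vectorizer module and bring your own vectors: generate the embeddings with your authenticated Ollama client and pass them to Weaviate explicitly. A minimal sketch, assuming a collection named "Documents" created with Configure.Vectorizer.none(), and the same placeholder URL and token as your snippet:

import weaviate
from ollama import Client as OllamaClient

AUTH_TOKEN = "..."  # placeholder: token for your hosted Ollama endpoint
MODEL_NAME = "nomic-embed-text:latest"

# authenticated client against the remote Ollama server, as in your snippet
ollama_client = OllamaClient(
    host="https://ollama-inference-url",
    headers={"Authorization": AUTH_TOKEN},
)

weaviate_client = weaviate.connect_to_local()
documents = weaviate_client.collections.get("Documents")

texts = ["Why is the sky blue?", "Hello, this is a test sentence."]
with documents.batch.dynamic() as batch:
    for text in texts:
        # embed remotely, then hand Weaviate the ready-made vector
        embedding = ollama_client.embeddings(model=MODEL_NAME, prompt=text)["embedding"]
        batch.add_object(properties={"text": text}, vector=embedding)

weaviate_client.close()

Queries would then also need a client-side embedding passed via near_vector, since the collection has no vectorizer configured.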
Thanks!