Unable to run hybrid search (but nearText works): connection refused to Ollama

Hello, first of all thank you very much for your time.

I am experiencing the error:

Post "http://localhost:11434/api/embed": dial tcp [::1]:11434: connect: connection refused

This has been asked many times on the forum already, but I think my case is different.

I have Ollama and Weaviate running in Docker (latest versions).
I can create a collection to ingest PDF documents without problems:


import weaviate
from weaviate.classes.config import Configure

client = weaviate.connect_to_local()
client.collections.create(
    name=collection_name,
    vectorizer_config=Configure.Vectorizer.text2vec_ollama(
        api_endpoint="http://ollama:11434",
        model="nomic-embed-text",
    ),
    generative_config=Configure.Generative.ollama(
        api_endpoint="http://ollama:11434",
        model="llama3.2",
    ),
)
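
Ingestion is done along these lines (a minimal sketch, assuming pypdf and one object per page; my real script differs in details):

from pypdf import PdfReader

# Hypothetical ingestion: one object per PDF page.
questions = client.collections.get(collection_name)
reader = PdfReader("ayudagenei.pdf")
with questions.batch.dynamic() as batch:
    for page in reader.pages:
        batch.add_object(properties={
            "content": page.extract_text(),
            "source": "ayudagenei.pdf",
        })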

And I can even run nearText or BM25 searches without errors (a BM25 example follows the output below):

questions = client.collections.get(collection_name)  # handle to the collection created above
response = questions.query.near_text(query="siniestros", limit=3)
for o in response.objects:
    print(o.properties)
client.close()

{'content': 'Los siniestros se resuelven entre 2 y 12 meses.', 'source': 'procedimientos_siniestros [Grupo Impultec - Genei].pdf'}
{'content': '¿Necesitan peritar la mercancía en caso de siniestro? Las agencias que exigen la peritación son Zeleris, CTT, Seur DPD y UPS. Aunque el resto de agencias también pueden', 'source': 'ayudagenei.pdf'}
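
For reference, the BM25 variant looks like this (standard v4 query API):

response = questions.query.bm25(query="siniestros", limit=3)
for o in response.objects:
    print(o.properties)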

But as soon as I switch to hybrid search, which also uses embeddings:

response = questions.query.hybrid(query="siniestros", limit=3)

I get the error:

 File "/home/carlos/python311/lib/python3.11/site-packages/weaviate/collections/grpc/query.py", line 805, in __call
    res = await _Retry(4).with_exponential_backoff(
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/carlos/python311/lib/python3.11/site-packages/weaviate/collections/grpc/retry.py", line 31, in with_exponential_backoff
    raise e
  File "/home/carlos/python311/lib/python3.11/site-packages/weaviate/collections/grpc/retry.py", line 28, in with_exponential_backoff
    return await f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/carlos/python311/lib/python3.11/site-packages/grpc/aio/_call.py", line 327, in __await__
    raise _create_rpc_error(
grpc.aio._call.AioRpcError: <AioRpcError of RPC that terminated with:
	status = StatusCode.UNKNOWN
	details = "remote client vectorize: send POST request: Post "http://localhost:11434/api/embed": dial tcp [::1]:11434: connect: connection refused"
	debug_error_string = "UNKNOWN:Error received from peer  {created_time:"2025-01-05T02:58:16.881580242+01:00", grpc_status:2, grpc_message:"remote client vectorize: send POST request: Post \"http://localhost:11434/api/embed\": dial tcp [::1]:11434: connect: connection refused"}"
>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/carlos/codigotfm/weaviate/getpdf2.py", line 7, in <module>
    response = questions.query.hybrid(query="siniestros", limit=3)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
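
For context: hybrid fuses BM25 keyword scores with vector similarity, so Weaviate has to embed the query string at query time through the collection's configured Ollama endpoint, and that embedding call is exactly what fails in the traceback above. A sketch of the same query with the keyword/vector balance made explicit (alpha is a standard parameter of the v4 hybrid API):

# Same call, with the BM25/vector weighting spelled out.
response = questions.query.hybrid(
    query="siniestros",
    alpha=0.5,  # 0.0 = pure BM25, 1.0 = pure vector search
    limit=3,
)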

My Weaviate docker-compose (Ollama and Weaviate are on the same Docker network):

services:
  weaviate:
    command:
    - --host
    - 0.0.0.0
    - --port
    - '8080'
    - --scheme
    - http
    image: cr.weaviate.io/semitechnologies/weaviate:1.28.2
    ports:
    - 8080:8080
    - 50051:50051
    volumes:
    - weaviate_data:/var/lib/weaviate
    restart: always
    environment:
      OLLAMA_URL: http://ollama:11434
      OLLAMA_MODEL: llama3.2:latest
      OLLAMA_EMBED_MODEL: snowflake-arctic-embed:latest
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      DEFAULT_VECTORIZER_MODULE: 'text2vec-ollama'
      ENABLE_MODULES: 'text2vec-ollama,generative-ollama'
      CLUSTER_HOSTNAME: 'node1'
    networks:
      - ollama-weviate
volumes:
  weaviate_data:
networks:
  ollama-weviate:
    name: ollama-weviate
    external: false

Since Ollama's port 11434 is exposed to the host, I can query it from outside without any problems:

curl http://localhost:11434/api/embed -d '{
  "model": "nomic-embed-text",
  "input": "Why is the sky blue?"
}'
{"model":"nomic-embed-text","embeddings":[[0.009785417,0.044247437,-0.14055912,0.0012672294,0.032222837,0.10741186,-0.008397134,0.010254115,0.00074357603,-0.035431717,0.033934534,0.062272973,0.102648675,0.08567975,0.023684556,0.033663988,-0.03359353,-0.018589992,0.048080757,-0.027181087,-0.056390814,-0.0436777,0.01647538,-0.035050847,0.063383594,0.043157343,0.03345559...

And I can also query Ollama from inside another container on the same network:

docker compose exec -ti weaviate sh -c "wget --header=\"Content-Type: application/x-www-form-urlencoded\" --post-data=\$'{\\n  \"model\": \"llama3.2:latest\",\\n  \"prompt\": \"Why is the sky blue?\"\\n}' --output-document - http://ollama:11434/api/generate"
Connecting to ollama:11434 (172.19.0.2:11434)
writing to stdout
{"model":"llama3.2:latest","created_at":"2025-01-05T02:30:24.892773828Z","response":"The","done":false}
{"model":"llama3.2:latest","created_at":"2025-01-05T02:30:24.899061635Z","response":" sky","done":false}
....
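
The same in-network check can be pointed at the embed endpoint that the error complains about (same wget pattern as above; untested variant):

docker compose exec weaviate sh -c "wget --header=\"Content-Type: application/json\" --post-data='{\"model\": \"nomic-embed-text\", \"input\": \"test\"}' --output-document - http://ollama:11434/api/embed"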

And I have several models pulled and available:

docker exec ollama ollama list
NAME                             ID              SIZE      MODIFIED    
all-minilm:latest                1b226e2802db    45 MB     2 hours ago    
snowflake-arctic-embed:latest    21ab8b9b0545    669 MB    2 hours ago    
llama3:8b-instruct-q5_1          662158bc9277    6.1 GB    2 days ago     
llama3.2:latest                  a80c4f17acd5    2.0 GB    2 days ago     
dolphin-mixtral:8x7b             4f76c28c0414    26 GB     3 days ago     
mistral:instruct                 f974a74358d6    4.1 GB    3 days ago     
nomic-embed-text:latest          0a109f422b47    274 MB    3 days ago   

I don't know why the error is about a connection to Ollama at localhost (http://localhost:11434) when I explicitly specify http://ollama:11434.
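
To double-check which endpoint the collection actually stored at creation time, the saved config can be printed (a minimal diagnostic sketch):

import weaviate

client = weaviate.connect_to_local()
config = client.collections.get(collection_name).config.get()
print(config)  # the api_endpoint stored for text2vec-ollama should show http://ollama:11434
client.close()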

Thank you very much for some help!

Hi @tornadijo!!

Welcome to our community :hugs:

This has worked for me. Let me know if this helps you:

services:
  weaviate:
    command:
    - --host
    - 0.0.0.0
    - --port
    - '8080'
    - --scheme
    - http
    image: cr.weaviate.io/semitechnologies/weaviate:1.28.2
    ports:
    - 8080:8080
    - 50051:50051
    volumes:
    - weaviate_data:/var/lib/weaviate
    restart: always
    environment:
      OLLAMA_URL: http://ollama:11434
      OLLAMA_MODEL: llama3.2:latest
      OLLAMA_EMBED_MODEL: snowflake-arctic-embed:latest
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      DEFAULT_VECTORIZER_MODULE: 'text2vec-ollama'
      ENABLE_MODULES: 'text2vec-ollama,generative-ollama'
      CLUSTER_HOSTNAME: 'node1'
    networks:
      - ollama-weviate
  ollama:
    image: ollama/ollama
    ports:
      - 11434:11434
    volumes:
      - ollama_data:/root/.ollama
      #- ./entrypoint.sh:/entrypoint.sh
    container_name: ollama
    pull_policy: always
    tty: true
    restart: always
    networks:
      - ollama-weviate    
    #entrypoint: ["/bin/bash", "/entrypoint.sh"]      
volumes:
  weaviate_data:
  ollama_data:
networks:
  ollama-weviate:
    name: ollama-weviate
    external: false

And now the code:

import weaviate
from weaviate import classes as wvc
client = weaviate.connect_to_local()
print(f"Client: {weaviate.__version__}, Server: {client.get_meta().get('version')}")

client.collections.delete("Test")
collection = client.collections.create(
    name="Test",
    vectorizer_config=wvc.config.Configure.Vectorizer.text2vec_ollama(
        api_endpoint="http://ollama:11434",
        model="nomic-embed-text",               
    ),
    generative_config=wvc.config.Configure.Generative.ollama(
        api_endpoint="http://ollama:11434",
        model="llama3.2",                                       
    )
)

collection.data.insert({"text": "Hello World!"})

print(collection.generate.fetch_objects(limit=1, single_prompt="Translate to Spanish: {text}").objects[0].generated)

This outputs:

'Hola Mundo!\n\n(Note: The phrase "Hello World!" is often used as a greeting in programming and computer science, but it's also widely recognized as a traditional way of saying "hello" in English. In this context, the translation is essentially the same.)'
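
And as a quick check that query-time vectorization works as well, the hybrid query from the original post can be run against this collection (untested sketch, same API):

response = collection.query.hybrid(query="Hello", limit=1)
print(response.objects[0].properties)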

Thanks! It works like a charm! :pray: :pray: :pray: :pray:


hi, I am facing this error:

UnexpectedStatusCodeError: Collection may not exist.! Unexpected status code: 500, with response body: {'error': [{'message': 'failed to execute query: leader not found'}]}

I am unable to join the clusters.