Vectorization failed 404 http://host.docker.internal:11434/api/embed

Hey @DudaNogueira , thanks for the warm welcome!

I found my issue, and it’s painfully straightforward.

I was running everything in Docker: Verba and Weaviate were in my docker compose file, while the Ollama image ran independently in its own container rather than in the same compose file. That setup can be made to work by ensuring the Ollama container is put onto the same network as Verba.

My Ollama does have a model in it (Llama3.1), which I’ve yet to benchmark, but it seems to run slick considering I set it up with GPU acceleration (see the Docker image for details on how). I’ll test it with larger files now that I’ve gotten it working. Lord knows if the embedding works as expected, but considering it’s producing content I can’t see why it wouldn’t.

The solution

The actual issue was a trailing forward slash at the end of my OLLAMA_URL environment variable. I did think it was strange that my curl command could hit the endpoint but the service couldn’t. So, after removing it:

- OLLAMA_URL=http://host.docker.internal:11434/
+ OLLAMA_URL=http://host.docker.internal:11434

The issue disappeared, and even the logs now show the expected POST method:

Ending request:
   method: POST
   url: http://host.docker.internal:11434/api/embed
   headers: <CIMultiDict()>

It’s still quite infuriating to me that the method changed just because the resource wasn’t found. If someone can point to the part of the standard that says this should happen, please add it as a reply here!
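
For anyone hitting the same thing, here is a minimal sketch of how the trailing slash can end up changing the URL that actually gets requested. It assumes the client builds the endpoint by plain string concatenation; I haven’t confirmed that this is what Verba does, and `build_embed_url` is a hypothetical helper of my own.

```python
def build_embed_url(base_url: str) -> str:
    # Hypothetical helper: naive concatenation of base URL and endpoint path.
    return f"{base_url}/api/embed"


with_slash = build_embed_url("http://host.docker.internal:11434/")
without_slash = build_embed_url("http://host.docker.internal:11434")

print(with_slash)     # http://host.docker.internal:11434//api/embed
print(without_slash)  # http://host.docker.internal:11434/api/embed
```

The doubled slash in the first URL makes it a different path from the one curl was hitting, which would fit with curl working while the service didn’t.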

Thank you kindly for your prompt reply, by the way! If you want to play around some more, you could look into setting up the Ollama image in docker compose with GPU acceleration. That would absolutely speed things up, as it’s practically instantaneous on my machine, which runs an NVIDIA GeForce RTX 3070.

I’m quite new to contributing to open source, but I’d like to prevent anyone else from making such a rookie error that was difficult to debug. Should I open a PR that:

  1. updates the README with a note to avoid trailing slashes.
  2. updates the config to validate env var URLs and improves error handling so the logs show when something is amiss.
  3. updates the config to just strip any trailing slashes automagically.
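
To make options 2 and 3 a bit more concrete, here is a rough sketch of what I have in mind. The function name `normalize_base_url` is my own invention and not part of Verba’s config code; it only uses Python’s standard `urllib.parse`.

```python
from urllib.parse import urlparse


def normalize_base_url(raw: str, name: str = "OLLAMA_URL") -> str:
    """Validate a base URL taken from an env var and strip trailing slashes."""
    value = raw.strip()
    parsed = urlparse(value)
    # Option 2: fail loudly, naming the offending variable, if this is not
    # an http(s) URL with a host part.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"{name} must be an http(s) URL with a host, got {raw!r}")
    # Option 3: drop any trailing slashes so that appending "/api/embed"
    # cannot produce a doubled slash.
    return value.rstrip("/")


cleaned = normalize_base_url("http://host.docker.internal:11434/")
print(cleaned)  # http://host.docker.internal:11434
```

In practice the raw value would come from something like `os.environ.get("OLLAMA_URL", "")`, and the `ValueError` message is what would surface in the logs.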

I’d like to know your thoughts.

Kind regards!