Description
I'm running Windows Subsystem for Linux (WSL2), with Docker Desktop handling the containerization from the Windows side. I have Ollama started with a model, and it works just fine when I test it with `ollama run llama3.1`.
I spin up the Ollama container with:

```
docker run -d --gpus=all --name ollama --restart always -v ollama:/root/.ollama --add-host=host.docker.internal:host-gateway -p 11434:11434 ollama/ollama:0.3.10
```
My docker compose file sets environment variables from my .env file:

```
OLLAMA_URL=http://host.docker.internal:11434/
OLLAMA_MODEL=llama3.1:latest
OLLAMA_EMBED_MODEL=llama3.1
```
This works as expected: I can start up Verba on port 8000 and select the Docker deployment in the UI. The "Chat" tab shows "0 documents embedded by llama3.1:latest", so it's definitely connecting and reading the right model, otherwise this would show a connection error.
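For anyone reproducing, a quick way to sanity-check those values from inside the Verba container is a small stdlib-only script like this (my own helper, not part of Verba), which hits GET /api/tags and lists the models Ollama reports:

```python
# check_ollama.py - connectivity sanity check, run from inside the Verba container
# (my own helper, not part of Verba; stdlib only)
import json
import os
import urllib.request

base = os.environ.get("OLLAMA_URL", "http://host.docker.internal:11434/").rstrip("/")

# GET /api/tags lists the models the Ollama server has pulled locally
with urllib.request.urlopen(f"{base}/api/tags") as resp:
    tags = json.load(resp)

print([m["name"] for m in tags.get("models", [])])
```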
But going to the "Import Data" tab and trying to add and import a simple txt file containing "Why is the sky blue" throws up:
```
✘ No documents imported 0 of 1 succesful tasks
ℹ FileStatus.ERROR | why_oh_why.txt | Import for why_oh_why.txt failed:
Import for why_oh_why.txt failed: Batch vectorization failed: Vectorization
failed for some batches: 404, message='Not Found',
url=URL('http://host.docker.internal:11434/api/embed') | 0
```
I even tried adding the ollama container to the same network as the docker compose stack (`docker network connect verba_default ollama`) and got to the same point, but with http://ollama:11434/api/embed failing in the same way.
I jumped into the code to start debugging the OllamaEmbedder:
```python
async def vectorize(self, config: dict, content: list[str]) -> list[float]:
    model = config.get("Model").value
    data = {"model": model, "input": content}

    # debug hook: log the request aiohttp actually sent
    async def on_request_end(session, trace_config_ctx, params):
        print(f"Ending request:\n method: {params.method}\n url: {params.url}\n headers: {params.headers}")

    trace_config = aiohttp.TraceConfig()
    trace_config.on_request_end.append(on_request_end)

    async with aiohttp.ClientSession(trace_configs=[trace_config]) as session:
        async with session.post(self.url + "/api/embed", json=data) as response:
            response.raise_for_status()
            data = await response.json()
            embeddings = data.get("embeddings", [])
            return embeddings
```
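To take Verba itself out of the equation, the same POST can be reduced to a standalone script (my own debugging sketch, not Verba code; the URL, model, and the `base + "/api/embed"` concatenation mirror the setup above):

```python
# repro_embed.py - the same POST as in vectorize(), but outside Verba
# (my own debugging sketch; URL and model mirror my .env values)
import asyncio
import aiohttp

OLLAMA_URL = "http://host.docker.internal:11434/"  # same value as OLLAMA_URL in .env
MODEL = "llama3.1"

async def main():
    data = {"model": MODEL, "input": ["Why is the sky blue?"]}

    # same style of debug hook as above: log what aiohttp actually sent
    async def on_request_end(session, trace_config_ctx, params):
        print(f"method={params.method} url={params.url} status={params.response.status}")

    trace_config = aiohttp.TraceConfig()
    trace_config.on_request_end.append(on_request_end)

    async with aiohttp.ClientSession(trace_configs=[trace_config]) as session:
        # same concatenation as the Verba code: base URL + "/api/embed"
        async with session.post(OLLAMA_URL + "/api/embed", json=data) as response:
            print(response.status)
            print(await response.text())

asyncio.run(main())
```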
And I was as confused as can be when the printout showed the method changing to GET:
```
Ending request:
 method: GET
 url: http://host.docker.internal:11434/api/embed
 headers: <CIMultiDict()>
```
But maybe that's down to my poor understanding of Python and these async libraries / middleware changing things as the request goes through?
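One thing I still want to rule out is a redirect being followed somewhere, since as far as I can tell aiohttp rewrites a redirected POST into a GET in some cases. Extending the trace config with aiohttp's on_request_redirect hook should show that (my own debugging addition, not Verba code):

```python
# (my own debugging addition) a trace hook that fires for every redirect
# hop aiohttp follows, showing where the request ends up being sent
import aiohttp

trace_config = aiohttp.TraceConfig()

async def on_request_redirect(session, trace_config_ctx, params):
    print(
        f"Redirect hop: {params.method} {params.url} -> "
        f"{params.response.status} {params.response.headers.get('Location')}"
    )

trace_config.on_request_redirect.append(on_request_redirect)
# pass trace_configs=[trace_config] to the ClientSession, as in vectorize() above
```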
Either way, when I use curl from the verba-verba-1 container I'm able to get the embeddings just fine:

```
curl http://host.docker.internal:11434/api/embed -d '{"model": "llama3.1","input": "Why is the sky blue?"}'
```
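For completeness, the closest aiohttp equivalent of that curl I can think of, with redirects disabled so it behaves like curl does by default (my own comparison sketch, not Verba code):

```python
# same request as the curl above, sent via aiohttp with redirects disabled
# (my own comparison sketch; curl does not follow redirects unless -L is passed)
import asyncio
import aiohttp

async def main():
    payload = {"model": "llama3.1", "input": "Why is the sky blue?"}
    async with aiohttp.ClientSession() as session:
        async with session.post(
            "http://host.docker.internal:11434/api/embed",
            json=payload,
            allow_redirects=False,
        ) as response:
            print(response.status, response.headers.get("Location"))
            print(await response.text())

asyncio.run(main())
```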
So, now I'm at a loss on what else to try. Any ideas?
Server Setup Information
- Verba commit: 59a46d06e382dc88cc90d9d217e7c5a2a8f950dc
- Deployment Method: local docker compose
- OS: Windows + WSL2