Description
I’m a non-technical user trying to use Verba with Weaviate to parse and import documents locally. To avoid the Unstructured API, I installed the unstructured[local-inference] library in the Dockerfile. However, every attempt to import a document fails with an error indicating that the reader is still looking for an API key.
Here’s the error I’m seeing in the logs:
FileStatus.ERROR | Import failed: Reader Unstructured IO failed with:
No Unstructured API Key detected
The error suggests that the system is expecting an API Key, but my intent is to use local inference exclusively and bypass the need for an API entirely.
Steps I’ve taken so far:
- Configured Docker:
  • Updated the Dockerfile to install the necessary libraries for local inference:
    • unstructured[local-inference] for local document parsing.
    • System dependencies: poppler-utils, tesseract-ocr, and libmagic1.
  • Verified that these libraries are installed in the Docker container during the build process.
- Environment Variables:
  • Removed references to UNSTRUCTURED_API_KEY to avoid reliance on the API.
  • Set DEFAULT_DEPLOYMENT=Local in the .env file to indicate local-only operation.
- Testing:
  • Verba runs fine and is accessible at localhost:8000.
  • Parsing fails for any document I try to import, whether a .txt or .docx.
Despite these steps, the system still appears to check for an API Key during parsing, which seems inconsistent with a local-only configuration.
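To confirm what the running container actually sees, a small script along these lines could be run inside it. This is only a sketch: the variable names come from this post, the helper name `summarize_env` is hypothetical, and which variables Verba's reader really consults is an assumption that would need checking against the Verba source.

```python
import os

# Variable names taken from this post. Whether Verba's reader consults
# exactly these is an assumption, not something confirmed here.
WATCHED = ("UNSTRUCTURED_API_KEY", "DEFAULT_DEPLOYMENT")


def summarize_env(env):
    """Return {name: value or None} for the watched variables.

    Values of variables whose name contains "KEY" are masked so the
    output can be pasted into a forum post safely.
    """
    summary = {}
    for name in WATCHED:
        value = env.get(name)
        if value is not None and "KEY" in name:
            value = "<set>"
        summary[name] = value
    return summary


if __name__ == "__main__":
    for name, value in summarize_env(os.environ).items():
        print(f"{name} = {value!r}")
```

Saved under a hypothetical name such as check_env.py in the project directory (which the Dockerfile copies into /Verba), it could be run with `docker compose exec verba python check_env.py`. A value of None for DEFAULT_DEPLOYMENT would mean the setting from the .env file never reached the container.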
Server Setup Information
• Weaviate Server Version: 1.25.10
• Deployment Method: Docker Compose
• Multi Node? Number of Running Nodes: No, single-node setup.
• Client Language and Version: Python, using goldenverba[huggingface] and unstructured[local-inference].
• Multitenancy?: No
Any Additional Information
Docker Compose File:
services:
  verba:
    build:
      context: ./
      dockerfile: Dockerfile
    ports:
      - 8000:8000
    environment:
      - WEAVIATE_URL_VERBA=http://weaviate:8080
      - DEFAULT_DEPLOYMENT=Local
    depends_on:
      weaviate:
        condition: service_healthy
    networks:
      - verba-network
  weaviate:
    image: semitechnologies/weaviate:1.25.10
    ports:
      - 8080:8080
    networks:
      - verba-network
networks:
  verba-network:
    driver: bridge
Dockerfile:
FROM python:3.11
RUN apt-get update && apt-get install -y \
    poppler-utils \
    tesseract-ocr \
    libmagic1 \
    && rm -rf /var/lib/apt/lists/*
RUN pip install \
    "unstructured[local-inference]" \
    "goldenverba[huggingface]"
WORKDIR /Verba
COPY . /Verba
RUN pip install "."
EXPOSE 8000
CMD ["verba", "start", "--port", "8000", "--host", "0.0.0.0"]
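One way to double-check that the pip install steps above took effect is to ask Python for the installed distribution versions from inside the container. The sketch below uses only the standard library; the helper name `installed_versions` is hypothetical. Note that the presence of the package only shows the build step worked. Whether Verba's Unstructured IO reader calls the local library or a hosted API is a separate question this check cannot answer.

```python
from importlib import metadata


def installed_versions(distributions):
    """Map each distribution name to its installed version, or None if absent."""
    result = {}
    for name in distributions:
        try:
            result[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            result[name] = None
    return result


if __name__ == "__main__":
    # Distribution names as they appear in the Dockerfile's pip install step.
    for name, version in installed_versions(["unstructured", "goldenverba"]).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```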
Logs from Verba:
INFO: Will watch for changes in these directories: ['/Verba']
WARNING: "workers" flag is ignored when reloading is enabled.
INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
INFO: Application startup complete.
FileStatus.ERROR | Import failed: Reader Unstructured IO failed with:
No Unstructured API Key detected
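When reproducing this with several files, it can help to pull just the failing reader and its message out of the log. The following is a rough sketch that assumes the log format is exactly what the two error lines above show (a header line followed by the message on the next line); the function name `extract_import_errors` is hypothetical and not part of Verba.

```python
import re

# Pattern inferred only from the error lines quoted in this post.
ERROR_HEADER = re.compile(
    r"FileStatus\.ERROR \| Import failed: Reader (?P<reader>.+?) failed with:"
    r"\s*(?P<inline>.*)$"
)


def extract_import_errors(lines):
    """Return (reader, message) pairs for each import failure in the log.

    If the message is not on the header line itself, the following
    line is taken as the message.
    """
    lines = [line.rstrip("\n") for line in lines]
    errors = []
    for i, line in enumerate(lines):
        match = ERROR_HEADER.search(line)
        if not match:
            continue
        message = match.group("inline").strip()
        if not message and i + 1 < len(lines):
            message = lines[i + 1].strip()
        errors.append((match.group("reader"), message))
    return errors


if __name__ == "__main__":
    sample = [
        "INFO: Application startup complete.",
        "FileStatus.ERROR | Import failed: Reader Unstructured IO failed with:",
        "No Unstructured API Key detected",
    ]
    for reader, message in extract_import_errors(sample):
        print(f"reader={reader!r} message={message!r}")
```

The reader name in the output identifies which reader handled the import, which is useful for telling whether a different reader was selected on a later attempt.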
Steps Taken:
• Confirmed the necessary dependencies in Dockerfile.
• Rebuilt the Docker image with --no-cache.
• Verified that DEFAULT_DEPLOYMENT=Local is set in the .env file to indicate local-only inference.
• Removed any reference to UNSTRUCTURED_API_KEY.
I’m looking for guidance on why the system still expects an API Key despite being configured for local inference. Is there additional setup required to fully enable local inference, or am I missing a step? Any help is appreciated!