Weaviate with Traefik and gRPC

Hello everyone! I was trying out Weaviate and wanted to put it behind a reverse proxy. Weaviate currently runs in Docker behind Traefik. I can connect to Weaviate’s REST API without any issues, but I have not been able to proxy the gRPC connection.

Here is the error I see in my debugger:

Exception has occurred: WeaviateQueryException
Query call failed with message Stream removed.
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
	status = StatusCode.UNKNOWN
	details = "Stream removed"
	debug_error_string = "UNKNOWN:Error received from peer  {created_time:"2024-01-14T21:04:14.775200504-05:00", grpc_status:2, grpc_message:"Stream removed"}"
>

During handling of the above exception, another exception occurred:

  File "/mnt/h/src/server/database/connect.py", line 36, in <module>
    response = questions.generate.near_text(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
weaviate.exceptions.WeaviateQueryException: Query call failed with message Stream removed.

The console showed another error (most likely thrown right before the debugger caught the error above):

E0114 21:04:14.765458535  185327 hpack_parser.cc:999]                  Error parsing 'content-type' metadata: invalid value

So far I have tried the following configurations with Traefik:

  1. Creating a gRPC endpoint and proxying it directly to port 50051. Did not work.
  2. Creating an H2C endpoint by following this example. Did not work.
  3. Terminating HTTPS on port 443 and proxying it to port 50051. Did not work.

My current Docker Compose (swarm stack) file looks like this:

version: '3.8'

services:
  weaviate:
    image: semitechnologies/weaviate:1.23.2
    command:
      - --host
      - 0.0.0.0
      - --port
      - '8080'
      - --scheme
      - http
    ports:
      - "8080:8080"
      - "50051:50051"
    volumes:
      - nfs:/var/lib/weaviate
    networks:
      - public
    deploy:
      replicas: 1
      placement:
        constraints: [node.role != manager]
      labels:
        - "traefik.enable=true"
        - "traefik.http.routers.weaviate.rule=Host(`weaviate.mydomain.com`)"
        - "traefik.http.routers.weaviate.entrypoints=websecure"
        - "traefik.http.routers.weaviate.tls=true"
        - "traefik.http.routers.weaviate.tls.certresolver=ssl_resolver"
        - "traefik.http.routers.weaviate.tls.domains[0].main=weaviate.mydomain.com"
        - "traefik.http.routers.weaviate.service=weaviate"
        - "traefik.http.services.weaviate.loadbalancer.server.port=8080"
        - "traefik.http.routers.weaviategrpc.rule=Host(`grpc.weaviate.mydomain.com`)"
        - "traefik.http.routers.weaviategrpc.entrypoints=web"
#        - "traefik.http.routers.weaviategrpc.tls=true"
#        - "traefik.http.routers.weaviategrpc.tls.certresolver=ssl_resolver"
        - "traefik.http.routers.weaviategrpc.tls.domains[0].main=grpc.weaviate.mydomain.com"
        - "traefik.http.routers.weaviategrpc.service=weaviate"
        - "traefik.http.services.weaviategrpc.loadbalancer.server.port=50051"
        - "traefik.http.services.weaviategrpc.loadbalancer.server.scheme=h2c"

    environment:
      OPENAI_APIKEY: "${OPENAI_APIKEY}"
      QUERY_DEFAULTS_LIMIT: "${QUERY_DEFAULTS_LIMIT}"
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: "${AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED}"
      PERSISTENCE_DATA_PATH: "${PERSISTENCE_DATA_PATH}"
      DEFAULT_VECTORIZER_MODULE: "${DEFAULT_VECTORIZER_MODULE}"
      ENABLE_MODULES: "${ENABLE_MODULES}"
      CLUSTER_HOSTNAME: "${CLUSTER_HOSTNAME}"
      LOG_LEVEL: "${LOG_LEVEL}"

volumes:
  nfs:
    driver_opts:
      type: "nfs"
      o: "addr=192.168.4.2,nfsvers=4,nolock,soft,rw"
      device: ":/mnt/pool/docker_swarm/weaviate"

networks:
  public:
    external: true

NOTE: Local connections WORK as expected. This is a configuration issue.

Hi! This is an interesting thing to have documented.

I have raised this internally and will try to set up an environment next week.

Thanks!

Hi @qnlbnsl ! Sorry for the delay here.

Looks like I was finally able to tame this.

Here is what I got:

---
version: '3.4'
services:
  weaviate:
    command:
    - --host
    - 0.0.0.0
    - --port
    - '8080'
    - --scheme
    - http
    image: semitechnologies/weaviate:1.23.5
    #ports:
    # - 8081:8080 # unsafe http
    # - 50052:50051 # unsafe grpc
    volumes:
    - weaviate_data:/var/lib/weaviate
    restart: on-failure:0
    environment:
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      DEFAULT_VECTORIZER_MODULE: 'none'
      ENABLE_MODULES: 'text2vec-cohere,text2vec-huggingface,text2vec-palm,text2vec-openai,generative-openai,generative-cohere,generative-palm,ref2vec-centroid,reranker-cohere,qna-openai'
      CLUSTER_HOSTNAME: 'node1'
    labels:
      - "traefik.enable=true"
      # http
      - "traefik.http.services.weaviate_http_service.loadbalancer.server.port=8080"
      - "traefik.http.routers.weaviate_http_router.rule=Host(`weaviate.mydomain.com`)"
      - "traefik.http.routers.weaviate_http_router.entrypoints=websecure"
      - "traefik.http.routers.weaviate_http_router.service=weaviate_http_service"
      - "traefik.http.routers.weaviate_http_router.tls.certresolver=myresolver"
      # grpc
      - "traefik.http.services.weaviate_grpc_service.loadbalancer.server.scheme=h2c"
      - "traefik.http.services.weaviate_grpc_service.loadbalancer.server.port=50051"
      - "traefik.http.routers.weaviate_grpc_router.rule=Host(`grpc.weaviate.mydomain.com`)"
      - "traefik.http.routers.weaviate_grpc_router.entrypoints=grpc"
      - "traefik.http.routers.weaviate_grpc_router.service=weaviate_grpc_service"
      - "traefik.http.routers.weaviate_grpc_router.tls.certresolver=myresolver"
  
  traefik:
    image: "traefik:v2.11"
    container_name: "traefik"
    command:
      - "--log.level=DEBUG"
      - "--providers.docker.exposedbydefault=false"
      - "--entrypoints.web.address=:80"
      - "--entrypoints.websecure.address=:443"
      - "--entrypoints.web.http.redirections.entryPoint.to=websecure"
      - "--entrypoints.web.http.redirections.entryPoint.scheme=https"
      - "--entrypoints.grpc.address=:50051"
      - "--providers.docker"
      - "--api"
      # - "--certificatesresolvers.myresolver.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory"
      - "--certificatesresolvers.myresolver.acme.tlschallenge=true"
      - "--certificatesresolvers.myresolver.acme.httpchallenge.entrypoint=web"
      - "--certificatesresolvers.myresolver.acme.email=your@mydomain.com"
      - "--certificatesresolvers.myresolver.acme.storage=/letsencrypt/acme.json"

    ports:
      - "80:80"
      - "443:443"
      - "50051:50051"
    volumes:
      - "./letsencrypt:/letsencrypt"
      - "/var/run/docker.sock:/var/run/docker.sock:ro"

volumes:
  weaviate_data:
...
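
One thing that stands out when comparing the two label sets: in the original post the `weaviategrpc` router sets `service=weaviate`, so it resolves to the service on port 8080 rather than the h2c service on port 50051, while here each router names its own service. Whether that alone explains the "Stream removed" error is not confirmed in this thread. A minimal Python sketch to list which backend each router resolves to (the helper names are hypothetical and not part of Traefik; only the gRPC-related labels are copied in):

```python
from typing import Dict, List


def parse_labels(labels: List[str]) -> Dict[str, str]:
    """Split 'key=value' label strings into a dict, splitting on the first '='."""
    parsed = {}
    for label in labels:
        key, _, value = label.partition("=")
        parsed[key.strip()] = value.strip()
    return parsed


def router_backends(labels: List[str]) -> Dict[str, dict]:
    """Map each HTTP router name to the service, port and scheme it points at."""
    parsed = parse_labels(labels)
    prefix = "traefik.http.routers."
    suffix = ".service"
    result = {}
    for key, service in parsed.items():
        if key.startswith(prefix) and key.endswith(suffix):
            router = key[len(prefix):-len(suffix)]
            base = "traefik.http.services." + service + ".loadbalancer.server."
            result[router] = {
                "service": service,
                "port": parsed.get(base + "port"),
                # Traefik uses http as the backend scheme when none is set.
                "scheme": parsed.get(base + "scheme", "http"),
            }
    return result


# gRPC-related labels from the compose file in this reply.
ANSWER_LABELS = [
    "traefik.http.services.weaviate_grpc_service.loadbalancer.server.scheme=h2c",
    "traefik.http.services.weaviate_grpc_service.loadbalancer.server.port=50051",
    "traefik.http.routers.weaviate_grpc_router.service=weaviate_grpc_service",
]

# Service-related labels from the compose file in the original post.
QUESTION_LABELS = [
    "traefik.http.routers.weaviate.service=weaviate",
    "traefik.http.services.weaviate.loadbalancer.server.port=8080",
    "traefik.http.routers.weaviategrpc.service=weaviate",
    "traefik.http.services.weaviategrpc.loadbalancer.server.port=50051",
    "traefik.http.services.weaviategrpc.loadbalancer.server.scheme=h2c",
]

for name, labels in (("answer", ANSWER_LABELS), ("question", QUESTION_LABELS)):
    for router, backend in router_backends(labels).items():
        print(name, router, backend)
```

For a gRPC router you would expect to see the h2c scheme and port 50051 in its row.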

And here is how you can test your connections:

# it should be listening for http on port 80, redirecting to 443
❯ curl http://weaviate.mydomain.com/v1/nodes
Moved Permanently%

# -L will follow redirects
❯ curl -L http://weaviate.mydomain.com/v1/nodes
{"nodes":[{"batchStats":{"queueLength":0,"ratePerSecond":0},"gitHash":"6aeae65","name":"node1","shards":null,"stats":{"objectCount":0,"shardCount":0},"status":"HEALTHY","version":"1.23.5"}]}

# also directly in https:
❯ curl https://weaviate.mydomain.com/v1/nodes
{"nodes":[{"batchStats":{"queueLength":0,"ratePerSecond":0},"gitHash":"6aeae65","name":"node1","shards":null,"stats":{"objectCount":1,"shardCount":1},"status":"HEALTHY","version":"1.23.5"}]}

# let's test our grpc connection
❯ wget https://raw.githubusercontent.com/grpc/grpc/master/src/proto/grpc/health/v1/health.proto
❯ grpcurl -d '{"service": "Weaviate"}' -proto health.proto grpc.weaviate.mydomain.com:50051 grpc.health.v1.Health/Check
{
  "status": "SERVING"
}
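
To connect from the Python client through the proxy, something along these lines should work. This is a sketch, assuming the v4 Python client (`weaviate-client`) and its `weaviate.connect_to_custom` helper; the hostnames and ports are the ones from the compose file above, and `proxy_connection_kwargs` and `connect_via_traefik` are hypothetical helper names used only for illustration. Both connections are marked secure because both routers have a certificate resolver attached.

```python
def proxy_connection_kwargs(http_host: str, grpc_host: str, grpc_port: int = 50051) -> dict:
    """Build keyword arguments for a TLS connection through the proxy.

    REST goes through the websecure entrypoint (443), gRPC through the
    dedicated grpc entrypoint (50051 in the compose file above).
    """
    return {
        "http_host": http_host,
        "http_port": 443,
        "http_secure": True,
        "grpc_host": grpc_host,
        "grpc_port": grpc_port,
        "grpc_secure": True,
    }


def connect_via_traefik():
    """Open a client connection using the hostnames from the compose file."""
    # Imported lazily so the helper above works without the client installed.
    import weaviate  # assumption: weaviate-client v4 is installed

    kwargs = proxy_connection_kwargs(
        "weaviate.mydomain.com", "grpc.weaviate.mydomain.com"
    )
    return weaviate.connect_to_custom(**kwargs)
```

Call `client = connect_via_traefik()`, check `client.is_ready()`, and call `client.close()` when you are done.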

Let me know if that helps :slight_smile: