Timeout Issue When Inserting Video Data into Weaviate with Multi2Vec-Bind Vectorizer

Description

Inserting video data into the Animals collection with the Weaviate Python client fails with an UnexpectedStatusCodeError and the following message:

Object was not added! Unexpected status code: 500, with response body: {'error': [{'message': 'update vector: send POST request: Post "http://multi2vec-bind:8080/vectorize": context deadline exceeded (Client.Timeout exceeded while awaiting headers)'}]}.

The error indicates that the POST request from Weaviate to the vectorizer service (multi2vec-bind) timed out while the object was being vectorized during the insert operation.

Code that produces the issue:

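import os

# `client` is an already-connected Weaviate Python client (v4 API).
# `toBase64` is a helper (not shown) that returns a file's contents as a Base64 string.
animals = client.collections.get("Animals")

source = os.listdir("./source/video/")

for name in source:
    print(f"Adding {name}")

    path = "./source/video/" + name
    item = {
        "name": name,
        "path": path,
        "video": toBase64(path),
        "mediaType": "video"
    }

    # Insert videos one by one
    animals.data.insert(item)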

Server Setup Information

  • Weaviate Server Version: 1.23.7
  • Deployment Method: Docker

Any Additional Information

  • Logs: The following error is observed in the Weaviate logs:
    {"action":"requests_total","api":"rest","build_git_commit":"6c571ff","build_go_version":"go1.22.8","class_name":"Animals","error":"update vector: send POST request: Post \"http://multi2vec-bind:8080/vectorize\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)","level":"error","msg":"unexpected error","query_type":"objects","time":"2025-01-14T21:35:11Z"}
    
  • Environment Details:
    docker-compose.yml
---
version: '3.4'
services:
  weaviate:
    command:
    - --host
    - 0.0.0.0
    - --port
    - '8080'
    - --scheme
    - http
    image: semitechnologies/weaviate:1.23.7
    ports:
    - 8080:8080
    - 50051:50051
    restart: on-failure:0
    depends_on:
      multi2vec-bind:
        condition: service_healthy    
    environment:
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      DEFAULT_VECTORIZER_MODULE: 'multi2vec-bind'
      ENABLE_MODULES: 'multi2vec-bind'
      BIND_INFERENCE_API: 'http://multi2vec-bind:8080'
      CLUSTER_HOSTNAME: 'node1'
  
  multi2vec-bind:
    image: semitechnologies/multi2vec-bind:imagebind
    environment:
      ENABLE_CUDA: '0'
    healthcheck:
      test: wget --no-verbose --tries=3 --spider http://localhost:8080/.well-known/ready || exit 1
      interval: 10s
      retries: 5
      start_period: 15s
      timeout: 3000s
...
  • Payload Details: Videos are processed one by one and converted to Base64 format before being inserted into the Animals collection.
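
The toBase64 helper is not shown in the post. Below is a minimal sketch of what such a helper might look like, assuming it simply reads the file and Base64-encodes it. The names to_base64 and encoded_size_mb are illustrative and do not come from the original code. Base64 inflates the payload by roughly one third, which is useful to know when judging how much data each insert sends to the vectorizer:

```python
import base64
import os


def to_base64(path: str) -> str:
    """Read a file and return its contents as a Base64-encoded ASCII string."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")


def encoded_size_mb(path: str) -> float:
    """Estimate the Base64 payload size in MB without reading the file.

    Base64 emits 4 output characters for every 3 input bytes (rounded up).
    """
    raw_bytes = os.path.getsize(path)
    return 4 * ((raw_bytes + 2) // 3) / 1_000_000
```

Printing encoded_size_mb(path) next to each "Adding ..." line would show whether the failing inserts are the largest files.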

Hi @Hao_Zhang,

Welcome to our community; it’s lovely to have you here.

Timeout issues can stem from various points, but let’s rule out a few things:

Video files are larger and more complex than text or images, so vectorizing them can take considerably longer. Consider increasing the timeout.
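
Which timeout to raise depends on where the request stalls. The error in the logs is raised between Weaviate and the multi2vec-bind container, so the server-side module timeout is the first candidate (the Weaviate environment variable reference lists MODULES_CLIENT_TIMEOUT for this; please verify that it is available in your server version). The Python client has its own, separate timeouts. A minimal sketch, assuming the v4 Python client and a local instance; the function names and the 300-second value are illustrative only:

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds), to see how long one insert takes."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def connect_with_longer_timeouts(insert_seconds: int = 300):
    """Connect to a local Weaviate with a longer client-side insert timeout."""
    # Imported here so the timing helper works without the client installed.
    import weaviate
    from weaviate.classes.init import AdditionalConfig, Timeout

    return weaviate.connect_to_local(
        additional_config=AdditionalConfig(
            # Values are in seconds; only the insert timeout is raised here.
            timeout=Timeout(init=30, query=60, insert=insert_seconds)
        )
    )
```

Measuring one insert with `_, seconds = timed(animals.data.insert, item)` on a small video gives a rough baseline for choosing a timeout, rather than guessing a value.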

It’s also worth considering allocating more resources if possible.

Before making these adjustments, please upgrade your Weaviate instance to version 1.28.2. You’re currently on 1.23.7, which is significantly outdated; there have been many improvements since then, and debugging on an older version is difficult. Upgrading should also improve efficiency, especially since the latest client makes more efficient use of gRPC calls.

Lastly, I would highly recommend using batching.
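
A minimal sketch of batched inserts, assuming the v4 Python client, whose collections expose batch.fixed_size(...) as a context manager. The function name insert_in_batches and the small batch size are illustrative; with large Base64 video payloads, a small batch keeps each request manageable. Batching reduces per-request overhead, but every object still has to be vectorized, so it complements a longer timeout rather than replacing it:

```python
def insert_in_batches(collection, items, batch_size: int = 2):
    """Add property dicts to a collection in fixed-size batches.

    Returns the client's list of failed objects so errors are not silently lost.
    """
    with collection.batch.fixed_size(
        batch_size=batch_size, concurrent_requests=1
    ) as batch:
        for item in items:
            batch.add_object(properties=item)
    # Populated by the client once the batch context has exited.
    return collection.batch.failed_objects
```

Here items would be the same dictionaries built in your loop, for example `failed = insert_in_batches(animals, items)`, followed by a check that failed is empty.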

Best regards,
Mohamed Shahin
Weaviate Support Engineer