High CPU Usage

Description

I have a total of 2.5 million objects in my Weaviate collection, which I am querying per doc_id using the code below. I am facing CPU issues: the instance shuts down if I run 500 queries in parallel. What can be done to optimise query performance on this instance before scaling it out to more nodes?

query_property = "parent_text"
# wq here is weaviate.classes.query, e.g.: import weaviate.classes.query as wq
response: GenerativeSearchReturnType = await chunks_collection.query.hybrid(
    query=query,
    vector=vector,
    alpha=0.5,
    limit=10,
    query_properties=[query_property],
    return_metadata=wq.MetadataQuery(distance=True, score=True, explain_score=True),
    filters=wq.Filter.by_property("doc_id").equal(doc_id),
)

Server Setup Information

  • Weaviate Server Version: weaviate:1.28.4
  • Deployment Method: docker
  • Multi Node? No (single node)
  • Client Language and Version: Python client 4.10.2
  • Multitenancy?: No

Any additional Information

Adding my docker-compose.yaml from an EC2 r6i.2xlarge (8 vCPU and 64 GB RAM).
Total data is around 80 GB:

sudo du -sh /var/lib/docker/volumes/ubuntu_weaviate_data/_data
80G     /var/lib/docker/volumes/ubuntu_weaviate_data/_data
services:
  weaviate:
    command:
    - --host
    - 0.0.0.0
    - --port
    - '8080'
    - --scheme
    - http
    image: cr.weaviate.io/semitechnologies/weaviate:1.28.4
    deploy:
      resources:
        limits:
          cpus: '7.0'
    ports:
    - 8080:8080
    - 50051:50051
    volumes:
    - weaviate_data:/var/lib/weaviate
    restart: on-failure:0
    environment:
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      ENABLE_API_BASED_MODULES: 'true'
      ENABLE_MODULES: 'text-embedding-3-small,text2vec-ollama,generative-ollama'
      CLUSTER_HOSTNAME: 'node1'
      LIMIT_RESOURCES: 'true'
      LOG_LEVEL: 'info'
      GOMAXPROCS: 7
      GOMEMLIMIT: '48GiB'
volumes:
  weaviate_data:

Hi @Rajat_m7

Your CPU overload with 500 parallel queries is expected; a single node will struggle to handle that load. You'll eventually need more nodes, as a load balancer will help distribute traffic properly.

Can you split the queries and run them sequentially in a loop, or at minimum process them in smaller batches (50-100 at a time) instead of 500 simultaneously? This prevents resource exhaustion on your single node.
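One common way to cap concurrency from async Python is an `asyncio.Semaphore`. This is just a sketch of the pattern: `run_query` below is a stand-in for your actual `chunks_collection.query.hybrid(...)` call, and the batch size of 50 is an assumption you should tune.

```python
import asyncio

MAX_CONCURRENT = 50  # assumed batch size; tune against CPU headroom

async def run_query(doc_id: str) -> str:
    # Placeholder for the real hybrid query against Weaviate.
    await asyncio.sleep(0.01)
    return f"result-{doc_id}"

async def bounded_query(sem: asyncio.Semaphore, doc_id: str) -> str:
    # At most MAX_CONCURRENT coroutines pass this point at once.
    async with sem:
        return await run_query(doc_id)

async def main(doc_ids: list[str]) -> list[str]:
    sem = asyncio.Semaphore(MAX_CONCURRENT)
    # gather preserves input order, so results line up with doc_ids.
    return await asyncio.gather(*(bounded_query(sem, d) for d in doc_ids))

results = asyncio.run(main([str(i) for i in range(500)]))
```

All 500 queries still run, but only 50 are in flight at any moment, so the node sees a bounded load instead of a spike.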

The best use of the client is to implement a singleton pattern for the client connection, reusing one Weaviate client across all queries. See this explanation: using the singleton pattern.
You can see a simple practical implementation to build on: https://github.com/Shah91n/WeaviateDB-Docs-Snippets-Python-Client/blob/main/Singleton_Pattern_Python_Weaviate_Client.ipynb
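For reference, a bare-bones sketch of the singleton idea (the linked notebook has a fuller version). The real connection call, e.g. something like `weaviate.connect_to_local()`, is an assumption here and is replaced with a placeholder so only the pattern itself is shown:

```python
import threading

class ClientSingleton:
    _instance = None
    _lock = threading.Lock()
    created = 0  # counts how many times the "connection" was built

    @classmethod
    def get(cls):
        if cls._instance is None:           # fast path, no lock needed
            with cls._lock:
                if cls._instance is None:   # double-checked locking
                    cls.created += 1
                    # In real code: open the Weaviate client connection here.
                    cls._instance = object()
        return cls._instance
```

Every caller then uses `ClientSingleton.get()` instead of opening a new connection per query, which avoids the per-connection setup cost multiplying across hundreds of parallel requests.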

Please also make sure to update your client to the latest 4.14.4 and Weaviate to 1.30.4, as both are outdated.

Best regards,
Mohamed Shahin
Weaviate Support Engineer
(Ireland, GMT/UTC timezone)

Hi @Mohamed_Shahin,

Thanks for the suggestions – I’ll try them out.

I also wanted to ask about the ideal storage for multiple nodes. We currently have roughly 80 GB of data and 2.5 million documents on a single EC2 instance, and this is going to grow.

Does S3 offer comparable performance to EBS for this kind of setup?

Thanks
Rajat

Hey @Rajat_m7,

Based on what I know so far, and definitely open to more insights if others have deeper expertise, I'd recommend starting with EBS gp3 volumes. Weaviate generally performs well with gp3.

Start with gp3 EBS per node, monitor your actual IOPS and throughput usage, and scale up only if needed.

Also, here are a few resources that might be helpful:

Weaviate Disk Storage Calculator

Weaviate UI Tool (Open Source)

Best regards,
Mohamed Shahin
Weaviate Support Engineer
(Ireland, GMT/UTC timezone)