Description
I have a total of 2.5 million objects in my Weaviate collection, and I query it once per doc_id using the code below. I am running into CPU issues: the instance shuts down if I run 500 queries in parallel. What can be done to optimise query performance on this instance before scaling out to more nodes?
import weaviate.classes.query as wq

query_property = "parent_text"
response: GenerativeSearchReturnType = await chunks_collection.query.hybrid(
    query=query,
    vector=vector,
    alpha=0.5,
    limit=10,
    query_properties=[query_property],
    return_metadata=wq.MetadataQuery(distance=True, score=True, explain_score=True),
    filters=wq.Filter.by_property("doc_id").equal(doc_id),
)
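For context on the "500 queries in parallel" pattern, here is a minimal sketch of capping the number of in-flight requests on the client side with `asyncio.Semaphore`, so the server never sees all 500 at once. The names `run_bounded`, `fake_query` and `MAX_IN_FLIGHT` are hypothetical, and the value 32 is an assumption to be tuned, not a measured recommendation. `fake_query` only stands in for the hybrid query call above.

```python
import asyncio

MAX_IN_FLIGHT = 32  # assumption: tune to what the instance can sustain


async def run_bounded(doc_ids, query_one, max_in_flight=MAX_IN_FLIGHT):
    """Run query_one(doc_id) for every doc_id, at most max_in_flight at a time."""
    semaphore = asyncio.Semaphore(max_in_flight)

    async def guarded(doc_id):
        # Wait for a free slot before issuing the query.
        async with semaphore:
            return await query_one(doc_id)

    # gather() returns results in the same order as doc_ids.
    return await asyncio.gather(*(guarded(d) for d in doc_ids))


async def fake_query(doc_id):
    # Hypothetical stand-in for the chunks_collection.query.hybrid(...) call.
    await asyncio.sleep(0.01)
    return doc_id


if __name__ == "__main__":
    results = asyncio.run(run_bounded(range(500), fake_query))
    print(len(results))
```

This does not make individual queries cheaper. It only bounds how many queries compete for the 7 CPUs at any moment.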
Server Setup Information
- Weaviate Server Version: weaviate:1.28.4
- Deployment Method: docker
- Multi Node? 1
- Client Language and Version: Python client, version 4.10.2
- Multitenancy?: No
Any additional Information
Adding my docker-compose.yaml from an EC2 r6i.2xlarge (8 vCPU and 64 GB RAM).
Total data is around 80 GB:
sudo du -sh /var/lib/docker/volumes/ubuntu_weaviate_data/_data
80G /var/lib/docker/volumes/ubuntu_weaviate_data/_data
services:
  weaviate:
    command:
    - --host
    - 0.0.0.0
    - --port
    - '8080'
    - --scheme
    - http
    image: cr.weaviate.io/semitechnologies/weaviate:1.28.4
    deploy:
      resources:
        limits:
          cpus: '7.0'
    ports:
    - 8080:8080
    - 50051:50051
    volumes:
    - weaviate_data:/var/lib/weaviate
    restart: on-failure:0
    environment:
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      ENABLE_API_BASED_MODULES: 'true'
      ENABLE_MODULES: 'text-embedding-3-small,text2vec-ollama,generative-ollama'
      CLUSTER_HOSTNAME: 'node1'
      LIMIT_RESOURCES: 'true'
      LOG_LEVEL: 'info'
      GOMAXPROCS: 7
      GOMEMLIMIT: '48Gib'
volumes:
  weaviate_data: