Hello, we use Weaviate 1.22.11 to store 3 million vectors in a single class. Vectorization is done outside of Weaviate with our own trained model. Vector search with nearVector is slow: each query takes between 5 and 20 seconds.
Could anyone give me some advice on how to improve search performance? Thanks.
Environment: Weaviate on AWS ECS, a single task with 4 vCPU and 8 GB RAM
Model: 256 dimensions
Weaviate class (some properties may be null):
{
  "class": "Test",
  "description": "test",
  "vectorIndexType": "hnsw",
  "vectorIndexConfig": {
    "vectorCacheMaxObjects": 24000,
    "ef": -1,
    "efConstruction": 2000,
    "maxConnections": 64
  },
  "vectorizer": "none",
  "properties": [
    {"name": "reference", "dataType": ["text"], "description": "ID", "tokenization": "field", "indexSearchable": true, "indexFilterable": false},
    {"name": "title", "dataType": ["text"], "description": "doc title", "indexSearchable": true, "indexFilterable": false},
    {"name": "display_title", "dataType": ["text"], "description": "display title", "indexSearchable": true, "indexFilterable": false},
    {"name": "book_title", "dataType": ["text"], "description": "doc title", "indexSearchable": true, "indexFilterable": false},
    {"name": "text", "dataType": ["text"], "description": "text", "indexSearchable": true, "indexFilterable": false},
    {"name": "document_type", "dataType": ["text"], "description": "document type", "tokenization": "field", "indexSearchable": true, "indexFilterable": false},
    {"name": "document_source", "dataType": ["text"], "description": "document source", "tokenization": "field", "indexSearchable": true, "indexFilterable": false},
    {"name": "publisher", "dataType": ["text"], "description": "publisher", "tokenization": "field", "indexSearchable": true, "indexFilterable": false},
    {"name": "type_facet", "dataType": ["text[]"], "description": "type facet", "tokenization": "field", "indexSearchable": true, "indexFilterable": false},
    {"name": "date_facet", "dataType": ["text"], "description": "date facet", "tokenization": "field", "indexSearchable": true, "indexFilterable": false},
    {"name": "sort_date", "dataType": ["text"], "description": "date", "tokenization": "field", "indexSearchable": true, "indexFilterable": false}
  ]
}
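For scale, here is a back-of-the-envelope estimate of how the vector data compares with the vectorCacheMaxObjects setting in the config. It is only a sketch: it assumes vectors are held as float32 (4 bytes per dimension) and counts raw vector data only, not the HNSW graph links or the stored object properties. The function names are made up for illustration.

```python
# Rough sizing sketch. Numbers marked "from the post" are taken from the
# question; everything else is an assumption.
NUM_VECTORS = 3_000_000            # from the post
DIMENSIONS = 256                   # from the post
VECTOR_CACHE_MAX_OBJECTS = 24_000  # from vectorIndexConfig
BYTES_PER_VALUE = 4                # assumption: float32 vectors


def raw_vector_bytes(num_vectors, dimensions, bytes_per_value=BYTES_PER_VALUE):
    """Bytes needed for the raw vector values alone (no graph, no payload)."""
    return num_vectors * dimensions * bytes_per_value


def cache_coverage(cache_max_objects, num_vectors):
    """Fraction of all vectors that fit in the vector cache at once."""
    return min(cache_max_objects, num_vectors) / num_vectors


if __name__ == "__main__":
    total = raw_vector_bytes(NUM_VECTORS, DIMENSIONS)
    coverage = cache_coverage(VECTOR_CACHE_MAX_OBJECTS, NUM_VECTORS)
    print(f"raw vector data: {total / 1024**3:.2f} GiB")
    print(f"vector cache coverage: {coverage:.1%}")
```

Under these assumptions the raw vectors come to a bit under 3 GiB, while a cache of 24000 objects can hold less than 1% of the 3 million vectors at any one time.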
Search result example:
{
  "data": {
    "Get": {
      "Test": [
        {
          "_additional": {
            "certainty": 0.8171625733375549,
            "id": "44048721-c4c1-41d9-9e6b-707f74e5ebf8"
          },
          "book_title": "",
          "date_facet": null,
          "display_title": "Display test",
          "document_source": "doc",
          "document_type": "test",
          "publisher": "website",
          "sort_date": "2014-03-25",
          "text": "Why my Weaviate vector search performance is low?",
          "title": "It's a test"
        }
      ]
    }
  }
}
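For reference, a nearVector query that returns a result shaped like the one above could be built roughly as follows. This is a sketch, not the exact query we run: the helper name, the limit of 10, the selected properties, and the all-zero placeholder vector are assumptions for illustration. It only builds the GraphQL request body and does not contact a server.

```python
import json


def build_near_vector_query(class_name, vector, properties, limit=10):
    """Build a Weaviate GraphQL Get query that uses nearVector.

    Hypothetical helper: the limit and property list are illustrative.
    """
    vector_literal = json.dumps([float(x) for x in vector])
    fields = "\n      ".join(properties)
    return (
        "{\n"
        "  Get {\n"
        f"    {class_name}(nearVector: {{vector: {vector_literal}}}, limit: {limit}) {{\n"
        f"      {fields}\n"
        "      _additional { id certainty }\n"
        "    }\n"
        "  }\n"
        "}"
    )


# Placeholder 256-dimensional vector; a real query would use the embedding
# produced by the external model.
placeholder_vector = [0.0] * 256
query = build_near_vector_query(
    "Test",
    placeholder_vector,
    ["title", "display_title", "text", "sort_date"],
)

# JSON body that would be POSTed to the instance's /v1/graphql endpoint.
payload = json.dumps({"query": query})
```

Timing this request with a small limit and only a few selected properties can help tell whether the 5 to 20 seconds is spent in the vector search itself or in fetching and returning the object data.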
For comparison, we previously ran Weaviate 1.19.13 on AWS Kubernetes with 2 nodes, 16 vCPU and 32 GB RAM. The vectorIndexConfig was the same, but the indexed documents were different. Vector search there took less than 1 second.