Hi, I have a collection with approximately 55K objects, and I am using multi2vec-clip as the vectorizer. Each object has a property named “collection” that refers to the image collection the object belongs to. Now I want to get the total count for each collection. When I count all objects, I get a response instantly, but when I count per collection, it takes a very long time. To reproduce, you can use the following GraphQL query:
{
  Aggregate {
    Schema_name {
      collection {
        count
        type
        topOccurrences {
          value
          occurs
        }
      }
    }
  }
}
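In case it helps with reproducing the difference, here is a minimal Python sketch that builds both queries as strings so they can be timed side by side. The helper names are hypothetical, and I am assuming the fast "count all objects" case corresponds to a `meta { count }` aggregation; actually sending the strings to Weaviate is left out.

```python
def build_total_count_query(class_name: str) -> str:
    """Build the query that counts all objects of a class (the fast case).

    Assumption: the overall count is requested via `meta { count }`.
    """
    return f"{{ Aggregate {{ {class_name} {{ meta {{ count }} }} }} }}"


def build_property_aggregate_query(class_name: str, prop: str) -> str:
    """Build the per-property aggregation from the post (the slow case)."""
    return (
        f"{{ Aggregate {{ {class_name} {{ {prop} "
        # Plain (non-f) string, so the braces here are literal.
        "{ count type topOccurrences { value occurs } } } } }"
    )


if __name__ == "__main__":
    # "Schema_name" is the placeholder class name used in the query above.
    print(build_total_count_query("Schema_name"))
    print(build_property_aggregate_query("Schema_name", "collection"))
```

Timing each query separately against the same instance should show whether the slowdown comes from the `topOccurrences` aggregation on the property or from something else.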
This is my schema:
schemaConfig = {
    'class': schema,  # class name for schema config in Weaviate (change it with a custom name for your images)
    'vectorizer': 'multi2vec-clip',
    'vectorIndexType': 'hnsw',
    "moduleConfig": {
        "multi2vec-clip": {
            "imageFields": [
                "image"
            ],
            "textFields": [
                "metadata",
                "metadata_string",
                "title",
                "url",
                "collection"
            ],
        },
        "generative-openai": {
            "model": "gpt-3.5-turbo"
        },
    },
    'properties': [
        {
            'name': 'image_id',
            'dataType': ['text']
        },
        {
            'name': 'image',
            'dataType': ['blob']
        },
        {
            'name': 'metadata',
            'dataType': ['text[]']
        },
        {
            'name': 'metadata_string',
            'dataType': ['text']
        },
        {
            'name': 'title',
            'dataType': ['text']
        },
        {
            'name': 'url',
            'dataType': ['text']
        },
        {
            'name': 'handle',
            'dataType': ['text']
        },
        {
            'name': 'collection',
            'dataType': ['text']
        }
    ]
}
Can someone shed some light on why the aggregation is so slow?
Thank you very much.