Query maximum value of property

I’m working with classes that contain a numeric identifier as a property with data type int, which corresponds to an identifier from another system. I’m using an aggregate query to find the maximum id so I know where to resume loading data from the source system. The dataset is pretty large (millions of records), and the more records that get loaded, the longer the maximum query takes when I try to resume. Increasing the timeout only goes so far … is there a better way for me to query this?

My query looks something like this:

result = (
  client.query
  .aggregate("MyClass")
  .with_fields("my_identifier {maximum}")
  .do()
)

Thanks

Hi @vectorcake,

Yeah, aggregate queries can get a bit slow at times.
What if you tried to sort your data by my_identifier and then limit the results to one object? Like this:

response = (
    client.query
    .get('MyClass', ['my_identifier'])
    .with_sort({
        'path': ['my_identifier'],
        'order': 'desc',
    })
    .with_limit(1)
    .do()
)
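To pull the value out of the result, something like the sketch below might work. It assumes the response is the plain dict the v3 Python client returns from `.do()`, shaped like `{"data": {"Get": {"MyClass": [...]}}}`. The helper name `max_identifier` is made up for illustration and is not part of the client.

```python
def max_identifier(response, class_name="MyClass", prop="my_identifier"):
    """Return the identifier of the first object in a sorted Get response.

    Assumes the query was sorted descending with a limit of 1, so the
    first (and only) object holds the maximum. Returns None when the
    class is empty or the response carries no data.
    """
    data = response.get("data") or {}
    objects = (data.get("Get") or {}).get(class_name) or []
    if not objects:
        return None
    return objects[0][prop]


# Hand-written response for illustration only (not real output)
sample = {"data": {"Get": {"MyClass": [{"my_identifier": 123}]}}}
last_id = max_identifier(sample)
```

Returning None for an empty class lets the loader tell "nothing loaded yet" apart from a real identifier of 0.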

Let me know if that works for you.

You can learn more about the Sorting API in the Weaviate documentation.

That improved the query performance, but once I got to millions of records it started timing out again. Any other ideas? I’m thinking I might have to store the last inserted ID in an external system and refer back to that …
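For reference, the external checkpoint idea could be as small as a local JSON file. This is only a rough sketch under my own assumptions: the file name, the `last_id` key, and the function names are all hypothetical, and a database row or key-value store would work just as well.

```python
import json
import os
import tempfile
from pathlib import Path


def save_checkpoint(path, last_id):
    """Persist the last successfully loaded identifier.

    Writes to a temporary file in the same directory and then renames it,
    so a crash mid-write does not leave a truncated checkpoint behind.
    """
    path = Path(path)
    fd, tmp_name = tempfile.mkstemp(dir=str(path.parent), suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump({"last_id": last_id}, f)
    os.replace(tmp_name, path)


def load_checkpoint(path, default=0):
    """Return the stored identifier, or `default` if no checkpoint exists."""
    try:
        with open(path) as f:
            return int(json.load(f)["last_id"])
    except FileNotFoundError:
        return default
```

The loader would call `save_checkpoint` after each batch is confirmed written, and `load_checkpoint` on startup to decide where to resume, so no query against the large class is needed at all.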