[Question] Is there a way to filter results by range?

I need to query my entire database, but it’s too large to return all objects in a single array. So, I was wondering if there is a way to execute a query that returns only a range of objects.

So, let’s say my first query starts from 1 and has a limit of 1000. Is there a way for the next query to return objects 1001 to 2000, and so on?

What about the method proposed to read all objects here?

For Python Client v3 and Go, these code examples do exactly what you are asking for. With Python Client v4 the batching may no longer be needed, since the example there simply iterates over individual objects.
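To make the "read all objects" idea concrete, here is a minimal sketch of the cursor pattern: fetch a batch, remember the ID of the last object, and ask for the next batch after that ID. The names `read_all`, `fetch_after_stub`, and `store` are hypothetical, and the in-memory list only stands in for a real collection. With Python Client v4 the fetch function would presumably wrap something like `collection.query.fetch_objects(limit=..., after=...)`, or you could use `collection.iterator()` directly, but check the client docs for your version.

```python
from typing import Callable, Iterator, List, Optional, Tuple

Record = Tuple[str, dict]  # (object id, properties)


def read_all(
    fetch_after: Callable[[int, Optional[str]], List[Record]],
    batch_size: int,
) -> Iterator[Record]:
    """Yield every record by repeatedly fetching the batch after the last seen id."""
    cursor: Optional[str] = None
    while True:
        batch = fetch_after(batch_size, cursor)
        if not batch:
            return  # nothing left after the cursor
        yield from batch
        cursor = batch[-1][0]  # id of the last record in this batch


# Hypothetical in-memory stand-in for a collection (not a Weaviate API).
store: List[Record] = [(f"id-{i:04d}", {"text": f"Object {i}"}) for i in range(25)]


def fetch_after_stub(limit: int, after: Optional[str]) -> List[Record]:
    """Return up to `limit` records that come after the record with id `after`."""
    start = 0
    if after is not None:
        ids = [record_id for record_id, _ in store]
        start = ids.index(after) + 1
    return store[start:start + limit]


all_records = list(read_all(fetch_after_stub, batch_size=10))
print(len(all_records))  # 25
```

Only one batch is held at a time inside the generator, so memory use depends on `batch_size` rather than on the size of the collection.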


I think this is the way to go: Search patterns and basics | Weaviate

I’ve got to store the objects in memory as I retrieve them in order to do some processing on each, so the best way forward for me is to process them in batches – I guess they call their process “pagination”.
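For anyone landing here later, this is roughly what processing in batches with limit and offset could look like. It is only a sketch: `iter_pages`, `fetch_from_fake_db`, and `fake_db` are hypothetical names, and the list stands in for the real database. With the v4 client, the fetch function could be something like `lambda limit, offset: collection.query.fetch_objects(limit=limit, offset=offset).objects`.

```python
from typing import Callable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_pages(
    fetch_page: Callable[[int, int], List[T]],
    page_size: int,
) -> Iterator[List[T]]:
    """Yield pages of up to `page_size` items until the source runs out."""
    offset = 0
    while True:
        page = fetch_page(page_size, offset)
        if not page:
            break  # empty page: nothing left
        yield page
        if len(page) < page_size:
            break  # short page: this was the last one
        offset += page_size


# Hypothetical stand-in for the database: 2500 objects in a list.
fake_db = [f"Object {i}" for i in range(2500)]


def fetch_from_fake_db(limit: int, offset: int) -> List[str]:
    """Mimic a limit/offset query against the stand-in data."""
    return fake_db[offset:offset + limit]


page_sizes = [len(page) for page in iter_pages(fetch_from_fake_db, page_size=1000)]
print(page_sizes)  # [1000, 1000, 500]
```

Each page can be processed and discarded before the next one is fetched, so only `page_size` objects need to be in memory at once.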

Hi @SomebodySysop!

I believe this will work just fine.

I ran the following test:

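import weaviate
from weaviate.classes.config import Configure

# Assumption: a local Weaviate instance; adjust the connection to your setup
client = weaviate.connect_to_local()

# Start from a clean collection
client.collections.delete("Test")

collection = client.collections.create(
    "Test",
    vectorizer_config=Configure.Vectorizer.none()
    # Additional parameters not shown
)

# Insert 100 test objects
for i in range(100):
    collection.data.insert({"text": f"Object {i}"})

# First 10 objects
for o in collection.query.fetch_objects(limit=10).objects:
    print(o.properties)

# Same query, skipping the first 5 objects
for o in collection.query.fetch_objects(limit=10, offset=5).objects:
    print(o.properties)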

Now I get:

{'text': 'Object 6'}
{'text': 'Object 4'}
{'text': 'Object 5'}
{'text': 'Object 28'}
{'text': 'Object 56'}
{'text': 'Object 32'}
{'text': 'Object 23'}
{'text': 'Object 97'}
{'text': 'Object 61'}
{'text': 'Object 96'}

and with offset 5:

{'text': 'Object 32'}
{'text': 'Object 23'}
{'text': 'Object 97'}
{'text': 'Object 61'}
{'text': 'Object 96'}
{'text': 'Object 44'}
{'text': 'Object 1'}
{'text': 'Object 29'}
{'text': 'Object 7'}
{'text': 'Object 63'}

And the order stays the same even if I update that content:

for o in collection.query.fetch_objects(limit=100).objects:
    collection.data.update(uuid=o.uuid, properties={"text": o.properties.get("text") + "update"})

Also, the only way QUERY_MAXIMUM_RESULTS interferes here is through the number of objects returned (the limit parameter), so nothing changed.
