We are currently ingesting multiple documents from 3-4 sources into a single Class. We would like users of our RAG application to choose which of the sources to use for grounding answers.
Our proposed solution is to use filters at query time to retrieve the relevant chunks. However, we worry that a single Class with nightly delta loads to update the chunks might be too unstable. (We've had issues with delete operations corrupting a Class.)
Is there a best practice for this particular setup?
Regarding your question: you could treat each document as a tenant, or as a separate collection. However, it then wouldn't be possible to let the user select more than one of those documents in a single query.
With that said, having a property that identifies the source document and filtering on it at query time is usually the best approach.
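For example, with the v4 Python client you can expose the user's source selection as a `contains_any` filter at query time. A minimal sketch, assuming a collection named `Chunk` with `source` and `text` properties (names are illustrative):

```python
import weaviate
from weaviate.classes.query import Filter

client = weaviate.connect_to_local()

chunks = client.collections.get("Chunk")  # illustrative collection name

# The user can select any subset of sources in the UI.
selected_sources = ["source_a", "source_c"]

# Hybrid search restricted to the selected sources only.
response = chunks.query.hybrid(
    query="user question here",
    filters=Filter.by_property("source").contains_any(selected_sources),
    limit=5,
)

for obj in response.objects:
    print(obj.properties["source"], obj.properties["text"])

client.close()
```

This also keeps multi-source selection simple: `contains_any` matches objects whose `source` is any of the selected values, which tenants or separate collections wouldn't give you in a single query.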
We are currently using version 1.25.4 and have since had a "corruption" issue again. Our delta load runs every hour and performs a "delete_many" operation via the Python library (v4.5.2). Seemingly at random, the collection becomes corrupted and returns:
```
Query call with protocol GRPC search failed with message explorer: get class: vector search: object vector search at index xyz: shard xyz_6vkltMpybcdF: vector search: entrypoint was deleted in the object store, it has been flagged for cleanup and should be fixed in the next cleanup cycle
```
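For reference, the delete step of the delta load is roughly this (a simplified sketch; the `doc_id` property name is illustrative):

```python
from weaviate.classes.query import Filter

chunks = client.collections.get("Chunk")  # illustrative collection name

# Hourly delta load: drop the stale chunks of a changed document
# before re-inserting the fresh ones.
chunks.data.delete_many(
    where=Filter.by_property("doc_id").equal("document-123"),
)
```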
If I check the logs in k8s, I can also find this error:

```json
{"action":"hybrid","error":"explorer: get class: vector search: object vector search at index xyz: remote shard 6vkltMpybcdF: status code: 500, error: shard xyz_6vkltMpybcdF: vector search: entrypoint was deleted in the object store, it has been flagged for cleanup and should be fixed in the next cleanup cycle\n: context deadline exceeded","level":"error","msg":"denseSearch failed","time":"2024-07-05T11:13:40Z"}
```
This happened on Friday, and if I try to query the collection now, it still throws the same error.
What tombstone settings do you propose here? We are okay with cleaning up often if it means we can avoid this "corruption" issue.
I just checked again, and we are actually on version 1.25.6 where this problem occurs. What value would you recommend for TOMBSTONE_DELETION_MAX_PER_CYCLE? It's a fairly small cluster (for now), so we only have ~10k objects per shard on each node.
Maximum number of tombstones to delete per cleanup cycle. Set this to limit cleanup cycles, as they are resource-intensive. As an example, set a maximum of 10000000 (10M) for a cluster with 300 million-object shards. (Default: none)
10k objects isn't that many, so the cleanup cycle shouldn't consume enough resources to degrade performance.
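If you do want to experiment with it, it's an environment variable on the Weaviate pods; with the Helm chart you would set it under `env` in your values, for example (the value here is only an illustration, not a recommendation for your cluster):

```yaml
# values.yaml (Weaviate Helm chart)
env:
  # Cap how many tombstones a single cleanup cycle may delete.
  TOMBSTONE_DELETION_MAX_PER_CYCLE: "10000"
```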
That sounds good @DudaNogueira. Do you have any other ideas about why we end up with the corrupted collection/shard, then? The only fix is essentially to delete the collection and recreate it, but eventually it happens again.
That sounds great @DudaNogueira. We are currently at 1.25.6 (Helm chart version 17.1.0). I would like to upgrade to 1.25.7, but there is no Helm chart with that version yet.
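In the meantime I'll probably just pin the image tag on the existing chart, assuming `image.tag` is still exposed in chart 17.1.0, something like:

```shell
helm upgrade weaviate weaviate/weaviate \
  --namespace weaviate \
  --reuse-values \
  --set image.tag="1.25.7"
```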