[Question] Filter out objects by id

Hi there,

We are currently evaluating Weaviate as a backend for a news-recommendation system.

One requirement is that articles (a few million in total in the collection) which a reader has already read (roughly a few 10k per user) must not be recommended, so we have an exclude list per user (a few million) for our near_vector() searches. Given the threads "Query filters - filtering out objects with references - #3 by larryhudson" and "Is there a way to perform conditional filter WHERE prop NOT ContainsAny (:excludedValues)? - #5 by junbetterway", it seems there is no ContainsNotAny operator yet. So we wondered whether we could model our scenario with a second collection for our readers, whose objects just contain the IDs of already consumed news pieces.
We could then potentially filter on wvc.query.Filter.by_ref_count(link_on="idWasRead").equal(0).
However, managing the references seems impractical: as we understand it, for every new article in the collection we would need to create references to all user objects in the other collection, since mapping references on a per-name basis of the properties does not seem doable. This model also seems prone to consistency problems, since deleting a user object may leave an unresolvable cross-reference.
Is there a more elegant way to model our requirements with Weaviate?

Thanks for any hints or clarification in case we missed something.

hi @deichrenner !! Welcome to our community! :hugs:

Sorry for the delay here.

This approach could work.

Speed is a concern, though: adding new cross-references and querying them with a lot of parameters adds to the compute cost.

I believe the best approach is to create some POCs and try it out.

One possibility to explore is multi-tenancy: for example, a multi-tenant collection for the users and a common collection for the articles.

To be honest, I have not played much with this combination of multi-tenant and single-tenant collections.
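
To make the data layout concrete, here is a rough, library-free sketch in plain Python. It makes no Weaviate calls and is not a tested Weaviate pattern. It assumes each user's read IDs live in their own bucket (standing in for one tenant per user) and that the exclusion happens client-side on an over-fetched, ranked candidate list (standing in for a near_vector() result). All names (`recommend_unread`, `read_by_user`, the `art-*` IDs) are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def recommend_unread(
    ranked_article_ids: Iterable[str],
    read_ids: Set[str],
    k: int,
) -> List[str]:
    """Return the first k IDs from a ranked list that are not in read_ids."""
    if k <= 0:
        return []
    unread: List[str] = []
    for article_id in ranked_article_ids:
        if article_id in read_ids:
            continue  # already consumed by this user, skip it
        unread.append(article_id)
        if len(unread) == k:
            break  # enough unread results collected
    return unread


# Hypothetical per-user read lists (stand-in for one tenant per user).
read_by_user: Dict[str, Set[str]] = {
    "user-a": {"art-1", "art-3"},
    "user-b": set(),
}

# Hypothetical ranked candidates, best match first
# (stand-in for an over-fetched vector search result).
candidates = ["art-1", "art-2", "art-3", "art-4", "art-5"]

top_for_a = recommend_unread(candidates, read_by_user["user-a"], k=2)
top_for_b = recommend_unread(candidates, read_by_user["user-b"], k=2)
print(top_for_a)  # ['art-2', 'art-4']
print(top_for_b)  # ['art-1', 'art-2']
```

The trade-off in this sketch is that the search has to return more candidates than you finally need, since an unknown share of them will be dropped. How much to over-fetch is something a POC would have to measure.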

Let me know if this helps. If possible, share some POC code of what you have in mind for the models so that we can explore this together.

thanks!