Hi there,
We are currently evaluating Weaviate as a backend for a news-recommendation system.
One requirement is that articles (a few million in total in the collection) which a reader has already read (a few tens of thousands per user) must not be recommended, so we have an exclude list per user (a few million) for our near_vector() searches. Given the threads "Query filters - filtering out objects with references" (#3 by larryhudson) and "Is there a way to perform conditional filter WHERE prop NOT ContainsAny (:excludedValues)?" (#5 by junbetterway), it seems there is no ContainsNotAny operator yet. So we wondered whether we could model our scenario with a second collection for our readers which just contains the IDs of already consumed news pieces.
We could then potentially filter on a wvc.query.Filter.by_ref_count(link_on=f"idWasRead").equals(0).
However, managing the references seems impractical: as far as we understand, we would need to create references from every new article in the collection to all user objects in the other collection, since mapping references to properties on a per-name basis does not seem doable. Additionally, this model seems prone to consistency problems, since deleting a user object may leave an unresolvable cross-reference.
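To make the requirement concrete, here is a minimal sketch of the behaviour we are after, written in plain Python without any Weaviate calls. The function name recommend_unread and the sample IDs are made up for illustration; ranked_ids stands in for whatever a near_vector() search would return, ordered by similarity.

```python
from typing import Iterable


def recommend_unread(ranked_ids: Iterable[str], read_ids: set[str], limit: int) -> list[str]:
    """Return the first `limit` IDs from `ranked_ids` that are not in `read_ids`.

    Hypothetical helper: it only illustrates the exclude-list semantics
    (what a "ContainsNotAny" style filter would give us), not a Weaviate API.
    """
    if limit <= 0:
        return []
    result: list[str] = []
    for article_id in ranked_ids:
        if article_id in read_ids:
            continue  # already consumed by this reader, so skip it
        result.append(article_id)
        if len(result) == limit:
            break
    return result


# Made-up example data: five candidates ranked by similarity, two already read.
ranked = ["a1", "a2", "a3", "a4", "a5"]
already_read = {"a2", "a3"}
top3 = recommend_unread(ranked, already_read, limit=3)
```

Ideally this exclusion would happen inside the vector search itself, so that the limit applies to unread articles only and we do not have to over-fetch and filter on the client.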
Is there a more elegant way to model our requirements with Weaviate?
Thanks for any hints or clarification in case we missed something.