What is the best approach to avoid creating an object which has identical contents?
In my experiments, a simple data.insert happily adds a new object (and assigns it a different UUID, of course).
Do I need to first attempt a fetch and only if it fails create?
Can I use a string with the weaviate.util.generate_uuid5 function to generate a deterministic UUID and use that to quickly see if I already have that value in the collection?
Thanks for any suggestion
Thank you
hi @rjalex !
That’s right. If you have an ID for that object, you can generate a UUID from that ID (or from any unique string) and pass it along when inserting, so the object's UUID is deterministic.
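To make the idea concrete, here is a minimal sketch of what "deterministic" means. It uses Python's standard uuid.uuid5 as a stand-in so it runs without the Weaviate client installed; I am assuming weaviate.util.generate_uuid5 behaves in a comparable way (same input, same UUID). The helper name deterministic_uuid and the "article-42" identifiers are made up for illustration.

```python
import uuid


def deterministic_uuid(identifier: str, namespace: str = "") -> str:
    """Hypothetical helper: return the same UUID string for the same input."""
    # uuid5 hashes a namespace UUID plus a name with SHA-1, so the result
    # is reproducible across runs and machines.
    return str(uuid.uuid5(uuid.NAMESPACE_DNS, namespace + identifier))


# The same identifier always maps to the same UUID...
first = deterministic_uuid("article-42")
second = deterministic_uuid("article-42")

# ...while a different identifier maps to a different one.
other = deterministic_uuid("article-43")

print(first == second)  # True
print(first == other)   # False
```

The string you hash should be whatever makes the object unique for you (an external ID, a URL, or the concatenated content itself).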
Here we have a doc on that:
Let me know if this helps!
Thanks!
Thanks @DudaNogueira so the best way to avoid inserting duplicates is to use a deterministic UUID and first fetch then insert, right?
hi @rjalex !
If you provide an existing UUID while ingesting an object, Weaviate will update it for you.
So there is no need to fetch it first, unless you don’t want the existing object overwritten (for example, when the incoming object has a different property value).
A fairly common mistake I have seen is when a user mistakenly passes the same text to generate the UUID for every object while ingesting (because of a wrong variable, for example), so all objects end up with the same UUID.
In that scenario, after the import, your collection will have only one object.
So basically, the first object gets created, and the subsequent ones only update that same object, because they all pass the same UUID.
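Here is a small sketch of that pitfall. A plain dict keyed by UUID stands in for the collection and models the "same UUID means update" behavior described above; it is not the Weaviate client, and the names make_id, ingest, and the sample rows are hypothetical. Whether a real call updates or rejects an existing UUID is worth checking in the client docs for the specific insert or batch method you use.

```python
import uuid


def make_id(text: str) -> str:
    """Deterministic UUID from a string (stdlib stand-in for generate_uuid5)."""
    return str(uuid.uuid5(uuid.NAMESPACE_DNS, text))


def ingest(store: dict, rows: list, key_for) -> None:
    """Write each row under its UUID; an existing UUID is overwritten."""
    for row in rows:
        store[make_id(key_for(row))] = row


rows = [
    {"sku": "A1", "name": "lamp"},
    {"sku": "B2", "name": "desk"},
    {"sku": "C3", "name": "chair"},
]

# Bug: the UUID is built from a value that never changes inside the loop.
constant = "A1"
buggy = {}
ingest(buggy, rows, lambda row: constant)

# Fix: build the UUID from a value that is unique per row.
fixed = {}
ingest(fixed, rows, lambda row: row["sku"])

print(len(buggy))  # 1
print(len(fixed))  # 3
```

If an import finishes and the object count is suspiciously low, checking which variable feeds the UUID generation is a good first step.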
That is awesome and simplifies inserting a lot. Thanks