Hi Weaviate Team,
A common use case for our team is inserting and updating many documents (>1K) simultaneously. I wanted to know if batch upsert functionality is offered on Weaviate or if another efficient method exists. Thank you,
Hi @JK_Rider!
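You can do upserts using batch imports.
For that you need to use deterministic IDs: if an object in the batch has the same UUID as an existing object, the existing object is overwritten instead of a duplicate being created. You can generate those UUIDs from one of your own ids. We have some examples here:
Here is a simple example: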
import weaviate
from weaviate.util import generate_uuid5

client = weaviate.connect_to_local()

# Start from a clean collection
client.collections.delete("Test")
collection = client.collections.create(name="Test")

objects = [
    {"reference_id": 1, "content": "this is a first content"},
    {"reference_id": 2, "content": "this is a second content"},
]

# Derive each UUID from reference_id so a re-import targets the same object
with collection.batch.dynamic() as batch:
    for data_row in objects:
        batch.add_object(
            properties=data_row,
            uuid=generate_uuid5(data_row.get("reference_id")),
        )

for o in collection.query.fetch_objects().objects:
    print(o.properties)
This will output:
{'reference_id': 2.0, 'content': 'this is a second content'}
{'reference_id': 1.0, 'content': 'this is a first content'}
Now we upsert:
objects = [
    {"reference_id": 1, "content": "this is NEW a first content"},
    {"reference_id": 3, "content": "this is a third content"},
]

# reference_id 1 already exists, so it is overwritten; reference_id 3 is new
with collection.batch.dynamic() as batch:
    for data_row in objects:
        batch.add_object(
            properties=data_row,
            uuid=generate_uuid5(data_row.get("reference_id")),
        )

for o in collection.query.fetch_objects().objects:
    print(o.properties)

# Release the connection when done
client.close()
This will output:
{'content': 'this is a second content', 'reference_id': 2.0}
{'reference_id': 3.0, 'content': 'this is a third content'}
{'reference_id': 1.0, 'content': 'this is NEW a first content'}
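If it helps to see why deterministic IDs make this an upsert, here is a minimal sketch that runs without Weaviate. It uses only Python's standard library, with a plain dict standing in for the collection. The helper names `deterministic_id` and `upsert` and the choice of namespace are assumptions made for illustration; this is not Weaviate's implementation, just the same idea in miniature.

```python
import uuid

# Assumption: any fixed namespace works, as long as it never changes between imports.
NAMESPACE = uuid.NAMESPACE_DNS


def deterministic_id(reference_id) -> str:
    """Derive a stable UUID from a source id: same input, same UUID."""
    return str(uuid.uuid5(NAMESPACE, str(reference_id)))


def upsert(store: dict, rows: list) -> None:
    """Write each row under its derived UUID, overwriting any existing entry."""
    for row in rows:
        store[deterministic_id(row["reference_id"])] = row


# A dict keyed by UUID stands in for the collection in this sketch.
store = {}

upsert(store, [
    {"reference_id": 1, "content": "this is a first content"},
    {"reference_id": 2, "content": "this is a second content"},
])

upsert(store, [
    {"reference_id": 1, "content": "this is NEW a first content"},
    {"reference_id": 3, "content": "this is a third content"},
])

print(len(store))                             # 3
print(store[deterministic_id(1)]["content"])  # this is NEW a first content
```

Because the ID depends only on `reference_id`, the second import lands on the same key for `reference_id` 1 and replaces it, while `reference_id` 3 gets a new key. With random UUIDs you would end up with four objects instead of three.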
Let me know if this helps!
Thanks!
Awesome, @DudaNogueira, I wasn't sure if that automatically updated or if it would error. Thanks for the confirmation.