Hi Weaviate Team,
A common use case for our team is inserting and updating many documents (>1K) simultaneously. I wanted to know if batch upsert functionality is offered on Weaviate or if another efficient method exists. Thank you,
Hi @JK_Rider!
You can do upserts using batch.
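For that, you need to use deterministic IDs: if an object with the same UUID already exists, the batch import updates it instead of creating a duplicate. You can generate those UUIDs from one of your own IDs. We have some examples here: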
Here is a simple example:
import weaviate
from weaviate.classes.query import Filter
from weaviate.util import generate_uuid5

client = weaviate.connect_to_local()

# Start from a clean collection
client.collections.delete("Test")
collection = client.collections.create(name="Test")

objects = [
    {"reference_id": 1, "content": "this is a first content"},
    {"reference_id": 2, "content": "this is a second content"},
]

# Derive each object's UUID from its reference_id so the ID is deterministic
with collection.batch.dynamic() as batch:
    for data_row in objects:
        batch.add_object(
            properties=data_row,
            uuid=generate_uuid5(data_row.get("reference_id")),
        )

for o in collection.query.fetch_objects().objects:
    print(o.properties)
This will output:
{'reference_id': 2.0, 'content': 'this is a second content'}
{'reference_id': 1.0, 'content': 'this is a first content'}
Now we upsert:
objects = [
    {"reference_id": 1, "content": "this is NEW a first content"},
    {"reference_id": 3, "content": "this is a third content"},
]

# reference_id 1 maps to the same UUID as before, so that object is updated;
# reference_id 3 maps to a new UUID, so that object is inserted
with collection.batch.dynamic() as batch:
    for data_row in objects:
        batch.add_object(
            properties=data_row,
            uuid=generate_uuid5(data_row.get("reference_id")),
        )

for o in collection.query.fetch_objects().objects:
    print(o.properties)
This will output:
{'content': 'this is a second content', 'reference_id': 2.0}
{'reference_id': 3.0, 'content': 'this is a third content'}
{'reference_id': 1.0, 'content': 'this is NEW a first content'}
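If it helps to see why this works, here is a rough sketch that uses only the Python standard library and needs no running Weaviate instance. The helper names deterministic_id and upsert, and the plain dict standing in for the collection, are made up for illustration. The IDs produced here are not guaranteed to match what generate_uuid5 returns; the sketch only shows the idea that the same key always maps to the same UUID, so a second write lands on the existing object.

```python
import uuid


def deterministic_id(key) -> str:
    """Hypothetical helper: map a business key to a stable UUID string.

    Assumption: uuid5 with NAMESPACE_DNS is used purely for illustration.
    """
    return str(uuid.uuid5(uuid.NAMESPACE_DNS, str(key)))


# A plain dict keyed by UUID stands in for the collection.
# This is an illustration only, not how Weaviate stores objects.
store = {}


def upsert(rows):
    """Insert each row, or overwrite the row already stored under its UUID."""
    for row in rows:
        store[deterministic_id(row["reference_id"])] = row


# First import: two new objects
upsert([
    {"reference_id": 1, "content": "this is a first content"},
    {"reference_id": 2, "content": "this is a second content"},
])

# Second import: reference_id 1 is overwritten, reference_id 3 is added
upsert([
    {"reference_id": 1, "content": "this is NEW a first content"},
    {"reference_id": 3, "content": "this is a third content"},
])

for row in store.values():
    print(row)
```

After both calls the store holds three objects, and the one for reference_id 1 carries the new content, which mirrors the behaviour shown in the batch example.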
Let me know if this helps!
Thanks!
Awesome, @DudaNogueira, I wasn't sure if that automatically updated or if it would error. Thanks for the confirmation.