import weaviate
from weaviate.config import AdditionalConfig, ConnectionConfig

client = weaviate.connect_to_local(
    host=WEAVIATE_HOST,
    port=WEAVIATE_PORT,
    grpc_port=WEAVIATE_SECOND_PORT,
    headers=w_headers,
    auth_credentials=weaviate_auth_credentials,
    skip_init_checks=True,
    additional_config=AdditionalConfig(
        connection=ConnectionConfig(
            session_pool_connections=30,
            session_pool_maxsize=200,
            session_pool_max_retries=3,
        ),
        timeout=(60, 180),
    ),
)

weaviate_class_name = email_to_weaviate_class_name(email)
if is_collection_available(weaviate_client=client, collection_name=weaviate_class_name):
    collection = client.collections.get(weaviate_class_name)
    print(f"started updating {email}")
    # Walk every object and write it back with the new property (one request per object)
    for item in collection.iterator():
        properties = item.properties
        properties["data_origin"] = "native_data"
        collection.data.update(uuid=item.uuid, properties=properties)
    print("data updated")
Hi @Yogesh_Chauhan!
Welcome to our community!
This operation can be very costly, depending on your dataset, and it can put the cluster under stress.
It not only updates the property but (if configured to do so) also triggers a re-vectorization of your object and reindexes the object with the new content.
If all your properties have changed, it is best to reindex the entire collection using batch instead of updating the objects one by one.
Or, if you know beforehand the object IDs that you need to update, you can leverage the batch process and make sure to pass the same IDs, as documented here on using deterministic IDs.
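To make the deterministic ID idea a bit more concrete, here is a minimal sketch, not a definitive implementation. It assumes you still have the source rows, that each row has a stable key (the `source_id` field is hypothetical), and that the existing objects were originally imported with IDs derived the same way. The helper names `deterministic_id` and `reimport_with_origin` are made up for illustration; `collection` is expected to be a v4 client collection, such as the one returned by `client.collections.get(...)`. The ID helper uses the standard library's `uuid.uuid5`; the Weaviate client also ships its own `generate_uuid5` helper in `weaviate.util`, and whichever you pick must match what was used at import time.

```python
import uuid
from typing import Any, Iterable, Mapping


def deterministic_id(source_key: str) -> str:
    """Return the same UUID every time for the same key (hypothetical helper)."""
    return str(uuid.uuid5(uuid.NAMESPACE_DNS, source_key))


def reimport_with_origin(
    collection: Any,
    rows: Iterable[Mapping[str, Any]],
    key_field: str = "source_id",  # assumption: a stable key exists on each row
) -> int:
    """Re-send rows through the batch API, reusing each object's ID.

    Because the ID is derived from the row's key, an object that already
    exists under that ID is overwritten instead of duplicated.
    """
    count = 0
    # The batch context manager groups objects into fewer, larger requests
    with collection.batch.dynamic() as batch:
        for row in rows:
            properties = dict(row)
            properties["data_origin"] = "native_data"
            batch.add_object(
                properties=properties,
                uuid=deterministic_id(str(row[key_field])),
            )
            count += 1
    return count
```

You would call it as `reimport_with_origin(collection, rows)`. After the batch finishes, it is worth checking for failed objects (I believe the v4 client exposes them as `collection.batch.failed_objects`) before assuming everything was written.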
Let me know if either of those could be a solution for you.
Thanks!