Good morning!
A question: we are batch processing a set of data, and we have noticed that after processing the first batch we started receiving this error:
Query call with protocol GRPC search failed with message <AioRpcError of RPC that terminated with:
status = StatusCode.DEADLINE_EXCEEDED
details = "Deadline Exceeded"
debug_error_string = "UNKNOWN:Error received from peer {created_time:"2024-08-15T22:09:21.067965744-04:00", grpc_status:4, grpc_message:"Deadline Exceeded"}"
>. [level: ERROR]
Exception in thread Thread-30 (worker_thread):
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/weaviate/collections/grpc/query.py", line 762, in __call
res = await self._connection.grpc_stub.Search(
File "/usr/local/lib/python3.10/site-packages/grpc/aio/_call.py", line 318, in __await__
raise _create_rpc_error(
grpc.aio._call.AioRpcError: <AioRpcError of RPC that terminated with:
status = StatusCode.DEADLINE_EXCEEDED
details = "Deadline Exceeded"
debug_error_string = "UNKNOWN:Error received from peer {created_time:"2024-08-15T22:09:21.067965744-04:00", grpc_status:4, grpc_message:"Deadline Exceeded"}"
We have tried changing the batch size, but something similar happens: after processing a small number of batches, the errors start to appear. What we do is find, for each vector, its nearest vectors (query.near_object).
Does anybody know if it could be because of the number of vectors? Do you know how we can optimize this process?
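One pattern that often helps with intermittent DEADLINE_EXCEEDED errors is to retry each query with exponential backoff, so a single slow response doesn't kill the whole batch. A minimal, library-agnostic sketch; `DeadlineExceeded` and `flaky_query` are stand-ins for the real gRPC error and your near_object call:

```python
import time

class DeadlineExceeded(Exception):
    """Stand-in for grpc.aio.AioRpcError with StatusCode.DEADLINE_EXCEEDED."""

def with_retries(fn, attempts=4, base_delay=0.5):
    """Call fn(), retrying on DeadlineExceeded with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except DeadlineExceeded:
            if attempt == attempts - 1:
                raise  # out of attempts, surface the error
            time.sleep(base_delay * (2 ** attempt))

# Simulated query that succeeds on the third attempt.
calls = {"n": 0}
def flaky_query():
    calls["n"] += 1
    if calls["n"] < 3:
        raise DeadlineExceeded()
    return "ok"

print(with_retries(flaky_query, base_delay=0.01))  # → ok after two retries
```

In the real loop you would wrap each `query.near_object` call in `with_retries`, catching the actual `grpc.aio.AioRpcError` and checking its status code.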
We suggest using something like this as a base, and then tweaking the batch size and concurrent requests according to the resources you have for your cluster:
with movies.batch.fixed_size(batch_size=20, concurrent_requests=2) as batch:
    for i, row in df.iterrows():
        obj_body = {
            c: row[c] for c in data_columns
        }
        batch.add_object(
            properties=obj_body
        )
Thank you for reviewing this case.
I would like to clarify that we use batches mainly to process our vectors, not to add data to the collection. However, we have run into a problem when searching for nearby vectors: the gRPC connection drops. I should also mention that we process each batch on a different thread, and I would like to determine whether I might be overloading the system by using the connection this way.
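If each batch gets its own thread, the number of in-flight queries is unbounded and can easily overload the server. One way to cap it is a thread pool with a fixed worker count instead of raw threads; a minimal sketch, where `process_batch` is a placeholder for the per-vector near_object queries:

```python
from concurrent.futures import ThreadPoolExecutor

def process_batch(batch):
    # Placeholder for running the nearest-vector query for each vector.
    return [v * 2 for v in batch]

batches = [[1, 2], [3, 4], [5, 6]]

# max_workers caps concurrent requests instead of one unbounded thread per batch.
with ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(process_batch, b) for b in batches]
    results = [f.result() for f in futures]  # collected in submission order

print(results)
```

With a cap like this, adding more batches queues work instead of multiplying simultaneous gRPC calls, which is often enough to stop deadline errors from piling up.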
Indeed, we generate the vectors in a separate, independent flow; during this process we download them and upload them to a collection in order to process them and find the closest match for each vector. I have been monitoring the process and have noticed some warning messages, such as the following:
/usr/local/lib/python3.10/asyncio/selector_events.py:701: ResourceWarning: unclosed transport <_SelectorSocketTransport fd=598 read=idle write=<idle, bufsize=0>>
_warn(f"unclosed transport {self!r}", ResourceWarning, source=self)
ResourceWarning: Enable tracemalloc to get the object allocation traceback
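That ResourceWarning suggests connections are being dropped without an explicit close. Closing the client deterministically, for example with a context manager, usually makes it go away; the v4 Weaviate client supports `with ... as client:` for this. A minimal stand-in sketch of the pattern (the client class here is hypothetical):

```python
class FakeClient:
    """Hypothetical stand-in for a client that owns network transports."""
    def __init__(self):
        self.closed = False

    def close(self):
        # Release sockets/transports explicitly instead of relying on GC.
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.close()

with FakeClient() as client:
    pass  # run your queries here

print(client.closed)  # → True: transport released deterministically
```

If the client is shared across threads, make sure the `with` block (or an explicit `client.close()` in a `finally`) outlives all worker threads that use it.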