Exception: Query call with protocol GRPC batch failed with message recvmsg:Connection reset by peer

Description

So I have a custom Linux home server that I built, on which I have deployed a Weaviate instance using Docker. I am now transferring around 7M records from my MongoDB instance (also running as a Docker container in the same setup) to Weaviate using multithreading.

The thing is, after migrating around 3M records the Python script crashes with the following error:
Exception: Query call with protocol GRPC batch failed with message recvmsg:Connection reset by peer.
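For context, the migration loop looks roughly like this. This is a simplified sketch, not the actual script: the `chunked` helper, `mongo_collection`, and `weaviate_collection` names are placeholders.

```python
from itertools import islice

def chunked(cursor, size=1000):
    """Yield successive lists of up to `size` items from any iterable or cursor."""
    it = iter(cursor)
    while batch := list(islice(it, size)):
        yield batch

# Hypothetical usage: stream Mongo documents into Weaviate in bounded batches,
# rather than handing each worker thread one huge insert_many call.
# for batch in chunked(mongo_collection.find(), size=1000):
#     weaviate_collection.data.insert_many(batch)
```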

Server Setup Information

  • Weaviate Server Version:
  • Deployment Method: docker
  • Multi Node? Number of Running Nodes: 1
  • Client Language and Version: Python v4
  • Multitenancy?: No

Any additional Information

Here’s the complete log:

weaviate.exceptions.WeaviateBatchError: Query call with protocol GRPC batch failed with message recvmsg:Connection reset by peer.
Traceback (most recent call last):
  File "/home/abc/test/datamigration/.venv/lib/python3.12/site-packages/weaviate/collections/batch/grpc_batch_objects.py", line 137, in __send_batch
    res, _ = self._connection.grpc_stub.BatchObjects.with_call(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/abc/test/datamigration/.venv/lib64/python3.12/site-packages/grpc/_channel.py", line 1198, in with_call
    return _end_unary_response_blocking(state, call, True, None)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/abc/test/datamigration/.venv/lib64/python3.12/site-packages/grpc/_channel.py", line 1006, in _end_unary_response_blocking
    raise _InactiveRpcError(state)  # pytype: disable=not-instantiable
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "recvmsg:Connection reset by peer"
	debug_error_string = "UNKNOWN:Error received from peer  {created_time:"2024-07-11T18:15:30.460550531+12:00", grpc_status:14, grpc_message:"recvmsg:Connection reset by peer"}"
>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/abc/test/datamigration/transfer_mongo2weav.py", line 194, in <module>
    upload2weaviate(
  File "/home/abc/test/datamigration/transfer_mongo2weav.py", line 39, in upload2weaviate
    uuids = weaviate_collection.data.insert_many(data)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/abc/test/datamigration/.venv/lib/python3.12/site-packages/weaviate/collections/data.py", line 410, in insert_many
    return self._batch_grpc.objects(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/abc/test/datamigration/.venv/lib/python3.12/site-packages/weaviate/collections/batch/grpc_batch_objects.py", line 97, in objects
    errors = self.__send_batch(weaviate_objs, timeout=timeout)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/abc/test/datamigration/.venv/lib/python3.12/site-packages/weaviate/collections/batch/grpc_batch_objects.py", line 151, in __send_batch
    raise WeaviateBatchError(e.details())  # pyright: ignore
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
weaviate.exceptions.WeaviateBatchError: Query call with protocol GRPC batch failed with message recvmsg:Connection reset by peer.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/abc/test/datamigration/transfer_mongo2weav.py", line 216, in <module>
    raise Exception(e)
Exception: Query call with protocol GRPC batch failed with message recvmsg:Connection reset by peer.

Hi @Adityam_Ghosh,

Welcome to our community! It’s great to have you here.

I’ve noticed this issue can occur when there’s latency in the connection.

Can you try adding the skip_init_checks=True flag to your connection call to bypass the initial connection checks? Here’s how you can do it:

import weaviate

client = weaviate.connect_to_local(
    ...
    skip_init_checks=True
)

Initial Connection Checks - If you stop seeing the error, it would likely point to latency issues during the initial port checks.

Thanks, but unfortunately this also didn’t work out. I’m wondering, is it because I am trying to upload the data to Weaviate using multithreading? Since it’s receiving a lot of requests, maybe the server isn’t able to process all of them at once?

Happy Friday @Adityam_Ghosh!

That’s a good point! Have you considered running a multi-node setup, with at least 3 nodes?

Hi @Mohamed_Shahin, thanks for the suggestion. I hadn’t thought about this. I will surely try it and post an update.


Awesome, and configure the replication factor to 3:
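A minimal sketch of what that could look like with the v4 Python client, assuming a cluster with at least 3 nodes; the collection name `"Records"` is a placeholder:

```python
import weaviate
from weaviate.classes.config import Configure

client = weaviate.connect_to_local()

# Replication requires a multi-node cluster; with a single node this will fail.
client.collections.create(
    "Records",  # hypothetical collection name
    replication_config=Configure.replication(factor=3),
)

client.close()
```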

Also, this batch import best practice may improve the code you have:
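As a rough sketch of that best practice: the client-managed batcher caps batch size and concurrency for you, which avoids overwhelming the server with raw threads. The collection name and the `records` iterable below are placeholders:

```python
import weaviate

client = weaviate.connect_to_local()
collection = client.collections.get("Records")  # hypothetical collection name

# Let the client manage batch size and request concurrency.
with collection.batch.fixed_size(batch_size=200, concurrent_requests=2) as batch:
    for doc in records:  # `records` stands in for your Mongo cursor
        batch.add_object(properties=doc)
        if batch.number_errors > 100:
            break  # stop early if the server is rejecting objects

# Inspect anything that failed so it can be retried.
failed = collection.batch.failed_objects
client.close()
```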

Let me know how it goes!

Have a good weekend!

Thanks mate, you have a great weekend too! Right now, I am trying another configuration where I have increased the memory limit from 1 GB to 4 GB to see what happens. After that, I will try the solution you have shared.


I have also seen this. In my case, it seems to have something to do with the latency of the network connection (to the free Weaviate Cloud instance).

import weaviate
from weaviate.classes.init import AdditionalConfig, Timeout

client = weaviate.connect_to_wcs(
    additional_config=AdditionalConfig(timeout=Timeout(init=30, query=60, insert=120)),
    # skip_init_checks=True,
    cluster_url=WCS_URL,
    auth_credentials=weaviate.auth.AuthApiKey(WCS_API_KEY),
)

Adding the timeout config seemed to fix the issues (mostly), and I opted not to skip the init checks.

However, it is still happening intermittently. It tends to happen around

collection.data.delete_many(…)

where I deleted a lot of objects. But I haven’t tested enough to be sure. It is definitely intermittent, since some jobs run through fine. And retrying with tenacity didn’t seem to help (more debugging needed there).
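For what it’s worth, a hand-rolled retry with exponential backoff is one way to check whether the retries simply aren’t waiting long enough between attempts. This is a generic sketch; the Weaviate call at the bottom is shown only as a hypothetical usage:

```python
import time

def retry_with_backoff(fn, retries=5, base_delay=1.0, exc=Exception):
    """Call fn(), retrying up to `retries` times with exponential backoff."""
    for attempt in range(retries):
        try:
            return fn()
        except exc:
            if attempt == retries - 1:
                raise  # out of attempts, re-raise the last error
            time.sleep(base_delay * (2 ** attempt))

# Hypothetical usage against the v4 client:
# retry_with_backoff(lambda: collection.data.delete_many(where=some_filter),
#                    exc=weaviate.exceptions.WeaviateBatchError)
```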

But if you have fixed your problem, please share. I will detail my setup in another thread if it proves to be very problematic.