Low QPS when using gRPC (v4) to batch insert data

Hello everyone.
We need to migrate around 10 million records from one Weaviate instance to a 3-node Weaviate cluster hosted on OpenStack. Our schema’s sharding is set to 3 and the replication factor is set to 3. We are using Weaviate client v4.
In our client connection code, the additional config sets Timeout(init=120, query=120, insert=400).

We have run multiple experiments and are getting a suspiciously low QPS of around 10-11.

Here are some of the things we’ve tried.

  1. A bulk of 10k records, using dynamic batching with the default consistency level: only about 2k records were inserted.
  2. A bulk of 2k records, using dynamic batching with the default consistency level: only about 1.9k records were inserted.

Both of these runs left the following error in our logs after the run completed:

ERROR:weaviate-client:{‘message’: ‘Failed to send all objects in a batch of 903’, ‘error’: ‘WeaviateBatchError('Query call with protocol GRPC batch failed with message <AioRpcError of RPC that terminated with:\n\tstatus = StatusCode.DEADLINE_EXCEEDED\n\tdetails = “Deadline Exceeded”\n\tdebug_error_string = “UNKNOWN:Error received from peer {created_time:“2025-01-22T20:13:54.414688058+00:00”, grpc_status:4, grpc_message:“Deadline Exceeded”}”\n>.')’}
ERROR:weaviate-client:{‘message’: ‘Failed to send 903 objects in a batch of 903. Please inspect client.batch.failed_objects or collection.batch.failed_objects for the failed objects.’}
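The second log line points at `collection.batch.failed_objects`. A minimal sketch for summarizing that list by error message is below. It assumes each entry exposes a `message` attribute (and falls back to `str()` if it does not); `summarize_failures` is a hypothetical helper name, not part of the client.

```python
from collections import Counter
from typing import Any, Iterable


def summarize_failures(failed_objects: Iterable[Any], max_len: int = 80) -> Counter:
    """Count failed objects per error message, truncated to max_len characters."""
    counts: Counter = Counter()
    for failed in failed_objects:
        # Assumption: entries carry a `message` attribute; otherwise use str().
        message = str(getattr(failed, "message", failed))
        counts[message[:max_len]] += 1
    return counts


# Hypothetical usage after a batch run has finished:
#   for message, n in summarize_failures(collection.batch.failed_objects).most_common(5):
#       print(n, message)
```

If every failure carries the same DEADLINE_EXCEEDED message, that would suggest whole batches timing out rather than individual objects being rejected.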

  1. A bulk of 1k objects, using dynamic batching with the default consistency level: the insertion succeeded and took about 90 seconds, or about 11 QPS.
  2. A bulk of 1.5k objects, using dynamic batching with the default consistency level: the insertion succeeded and took about 134 seconds, or about 11 QPS.
  3. 5 parallel processes, each with a bulk of about 2k objects, using dynamic batching with the default consistency level: not all objects were inserted, and we got the same error.
  4. 5 parallel processes, each with a bulk of about 2k objects, using dynamic batching with the consistency level set to ONE: not all objects were inserted, and we got the same error.
  5. 5 parallel processes, each with a bulk of about 2k objects, using fixed batching with batch size 200 and concurrent requests 2: not all objects were inserted, and we got the same error.
  6. 5 parallel processes, each with a bulk of about 2k objects, using fixed batching with batch size 200, concurrent requests 2, and the consistency level set to ONE: not all objects were inserted, and we got the same error.
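For reference, the fixed batching attempts (batch size 200, 2 concurrent requests) would look roughly like the sketch below with the v4 client. This is not the exact code from our runs: `insert_fixed` and the row shape (a dict with `"properties"` and an optional `"vector"`) are hypothetical, and `collection` is assumed to be a v4 collection handle such as the one returned by `client.collections.get(...)`.

```python
from typing import Any, Iterable, List, Mapping


def insert_fixed(
    collection: Any,
    rows: Iterable[Mapping[str, Any]],
    batch_size: int = 200,
    concurrent_requests: int = 2,
) -> List[Any]:
    """Insert rows with fixed-size batching and return the failed objects."""
    with collection.batch.fixed_size(
        batch_size=batch_size, concurrent_requests=concurrent_requests
    ) as batch:
        for row in rows:
            # Assumed row shape: {"properties": {...}, "vector": [...] (optional)}
            batch.add_object(properties=dict(row["properties"]), vector=row.get("vector"))
    # Populated by the client once the context manager has flushed all batches.
    return list(collection.batch.failed_objects)


# For the consistency level ONE variants, a handle obtained via
# collection.with_consistency_level(...) can be passed in the same way.
```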

We have tried other variations too, mixing and matching these settings: 5k objects per process, a fixed batch size of 500, and concurrent requests increased to 5. However, we always get the gRPC DEADLINE_EXCEEDED error in the logs after the run, not all objects are inserted, and for however many objects do get inserted, the QPS is 10-11.
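To make the numbers concrete, here is the arithmetic behind the QPS figure (really objects inserted per second) from the two successful runs, plus what that rate would mean for the full 10 million records. The helper names are just for illustration.

```python
def throughput(objects: int, seconds: float) -> float:
    """Objects inserted per second."""
    return objects / seconds


def eta_days(total_objects: int, objects_per_second: float) -> float:
    """Days needed to insert total_objects at a constant rate."""
    return total_objects / objects_per_second / 86_400


# The two successful runs reported above: (objects, seconds).
runs = {"1k objects": (1_000, 90), "1.5k objects": (1_500, 134)}
for label, (n, secs) in runs.items():
    print(f"{label}: {throughput(n, secs):.1f} objects/s")  # 11.1 and 11.2

print(f"10M objects at 11 objects/s: {eta_days(10_000_000, 11):.1f} days")  # 10.5
```

At this rate the migration would take roughly a week and a half, which is why the low throughput is a blocker for us.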

Shouldn’t QPS be higher with gRPC? What are the possible causes of this issue?

hi @AnnTade !!

What is the server version?

This scenario points to insufficient resources being allocated. Do you have any memory usage readings?

What is the vector dimensionality, and what was your resource plan?

Also, do you see anything in the server logs?
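To show why dimensionality and memory matter here, below is a rough back-of-the-envelope estimate. It assumes float32 vectors and a commonly cited rule of thumb of about 2x the raw vector size for an in-memory HNSW index; the dimensionalities are placeholders, since the real one has not been shared yet. Note that with a replication factor of 3 on 3 nodes, every node holds a full copy of the data.

```python
def vector_memory_gib(
    objects: int,
    dimensions: int,
    replication_factor: int,
    nodes: int,
    overhead_factor: float = 2.0,  # assumption: rule-of-thumb index overhead
    bytes_per_dim: int = 4,        # assumption: float32 vectors
) -> float:
    """Rough per-node memory (GiB) needed to hold the vectors."""
    total_bytes = objects * dimensions * bytes_per_dim * overhead_factor
    # Each object is stored replication_factor times, spread over `nodes` nodes.
    per_node_bytes = total_bytes * replication_factor / nodes
    return per_node_bytes / 2**30


# Placeholder dimensionalities; replace with the actual vector size.
for dims in (384, 768, 1536):
    gib = vector_memory_gib(10_000_000, dims, replication_factor=3, nodes=3)
    print(f"{dims} dims: ~{gib:.0f} GiB per node")
```

If the nodes have much less memory than this kind of estimate, slow imports and timeouts would not be surprising.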

Thanks!