Optimizing Imports between number of nodes & pods

Hi everyone,

I’m working on a script to import around 200 million records and want it to run roughly 6 times faster. I have 4 Weaviate instances (one instance per node, each node with 4 threads). My data is replicated only once, with 4 shards in total (1 shard per instance). I’m tuning the parallelism of my data-loading script and have so far found the best performance at 12 threads, with a batch size of about 5,000 records per batch.
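
For reference, here is a minimal sketch of what my loading loop looks like with the v3 Python client. The class name `Record`, the property names, and the endpoint URL are placeholders, and the batch settings just mirror the numbers above (exact parameter names can differ between client versions):

```python
import weaviate

# Placeholder endpoint; in this setup it would point at a load balancer
# or one of the 4 Weaviate nodes.
client = weaviate.Client("http://localhost:8080")

# 5000 objects per batch, 12 client-side worker threads, as described above.
client.batch.configure(
    batch_size=5000,
    num_workers=12,
    dynamic=False,
)

def import_records(records):
    # `records` is any iterable of dicts; "Record" and its properties are
    # illustrative, not the actual schema from this thread.
    with client.batch as batch:
        for rec in records:
            batch.add_data_object(
                data_object={"title": rec["title"], "body": rec["body"]},
                class_name="Record",
            )
```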

On the inference side, I also have 1 GPU running a single instance of the inference model.

If I want this to run faster, what are the obvious bottlenecks in this setup? I was surprised that when I bumped the number of Weaviate instances (nodes) to 8 and added 8 shards, performance actually worsened.

Is there a benefit to having more than 1 shard per node (as it relates to imports)?
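
For context, the shard count and replication factor are set on the class when it is created; a rough sketch of how mine looks (class name, properties, and the factor value are assumptions based on my description above):

```python
import weaviate

client = weaviate.Client("http://localhost:8080")  # placeholder endpoint

# Sketch of the class definition where shard count and replication are set.
# "Record" and the properties are placeholders; factor=1 assumes "replicate
# once" means a single copy of the data.
record_class = {
    "class": "Record",
    "properties": [
        {"name": "title", "dataType": ["text"]},
        {"name": "body", "dataType": ["text"]},
    ],
    "shardingConfig": {"desiredCount": 4},   # 4 shards, i.e. 1 per node here
    "replicationConfig": {"factor": 1},
}
client.schema.create_class(record_class)
```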

Hi @Lakshya_Bakshi

I assume you have seen this doc already:

There are some upcoming features that will help improve import time, like async batching, among others.

I believe you have pretty much covered the options here. Also keep in mind that 200 million records is a lot, and after importing that data Weaviate still needs to index it into the vector space, so there is a lot going on here.
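
One way to watch that progress, assuming the v3 Python client, is to poll the nodes endpoint and look at the per-shard object counts (the exact fields in the response can vary by Weaviate version):

```python
import weaviate

client = weaviate.Client("http://localhost:8080")  # placeholder endpoint

# Print per-shard object counts on each node to watch how far the import
# has progressed across the cluster.
for node in client.cluster.get_nodes_status():
    print(node.get("name"), node.get("status"))
    for shard in node.get("shards", []):
        print("  shard:", shard.get("name"), "objects:", shard.get("objectCount"))
```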

Let me know if that helps.