Is there a way to upload a large amount of data and then ask Weaviate to build its index only once all the data and embeddings are uploaded? I'm hoping this would speed up the upload dramatically: on my setup, uploads get about 5x slower as the database grows to 1 million 384-dimensional vectors.
We have been cooking up some new features to address this. The idea is to build the index asynchronously. This, on top of gRPC, will deliver improved import times.
Ah yes, this is what I am looking for. For massive data imports it would be extremely useful. Even a flag to turn index building on or off would help, so the initial setup is quick even if later additions are slow. But the suggested solution (i.e. async) works too, and is more sophisticated.
Any clue on the timeline for this feature? Like 1 month, or 1-2 years?
AFAIK, it will balance ingest operations against index operations. The idea is to avoid having ingestion and indexing load the server at the same time, with neither performing well. So I believe it will control the server load and let the index queue grow when ingestion is running at a higher rate.
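To make the idea a bit more concrete, here is a toy Python sketch of decoupling ingestion from indexing with a queue. This is not Weaviate's implementation or API: the class `ToyAsyncIndexer`, its methods, and the `batch_size` parameter are all made up for illustration, and a plain set stands in for the real vector index.

```python
import queue
import threading


class ToyAsyncIndexer:
    """Hypothetical toy model: store objects right away, index them in the background."""

    def __init__(self, batch_size=100):
        self.store = {}              # object id -> vector, written at ingest time
        self.indexed_ids = set()     # stand-in for a real vector index
        self.pending = queue.Queue() # ids waiting to be indexed
        self.batch_size = batch_size
        # Background worker that drains the queue while ingestion continues.
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def ingest(self, obj_id, vector):
        # Fast path: persist the object and enqueue it; no index work here.
        self.store[obj_id] = vector
        self.pending.put(obj_id)

    def _drain(self):
        while True:
            # Block until at least one id is pending, then grab up to a full batch.
            batch = [self.pending.get()]
            while len(batch) < self.batch_size:
                try:
                    batch.append(self.pending.get_nowait())
                except queue.Empty:
                    break
            # Placeholder for the expensive index insertion.
            self.indexed_ids.update(batch)
            for _ in batch:
                self.pending.task_done()

    def wait_until_indexed(self):
        # Returns once every enqueued id has been indexed.
        self.pending.join()


# Usage: ingestion returns immediately; the queue absorbs bursts.
indexer = ToyAsyncIndexer(batch_size=100)
for i in range(1000):
    indexer.ingest(i, [0.0] * 384)
indexer.wait_until_indexed()
```

The point of the sketch is only the shape of the design: `ingest` stays cheap no matter how large the index gets, and the pending queue grows whenever objects arrive faster than the worker can index them.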
There is more info here too, as well as on the aforementioned PR: