Is there a way to upload a large amount of data and then ask Weaviate to build its index only once all embeddings and data are uploaded? I'm hoping this would speed up the upload dramatically (on my setup, uploads become about 5x slower as the database grows to 1 million 384-dimensional vectors).
We have been cooking up some new features to address this. The idea is to have async index building. This, on top of gRPC, will deliver improved import times.
Check it out:
Let me know if this helps.
Ah yes, this is what I am looking for. For massive data entry it would be extremely useful. Even a flag to turn index building on or off would help, so that setup is quick even if later additions are slow. But the suggested solution (i.e., async indexing) works too, and is more sophisticated.
Any idea of the timeline for this feature? Like 1 month, or 1-2 years?
This feature should land soon.
so you can keep track of it
Thanks. For anyone searching for this: it looks like it has been merged and will hopefully be part of the next release.
It states you need to use ASYNC_INDEXING=true. Out of curiosity, where would this flag be put? Is it in the Docker container config file?
You must set it as an environment variable for the Weaviate process and then restart Weaviate.
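For anyone who starts the container from a script, here is a minimal sketch of passing that variable with Docker's -e option. Only ASYNC_INDEXING=true comes from this thread. The helper name build_docker_run_command, the placeholder image "my-weaviate-image:tag", and the port mapping are assumptions for illustration, and any other settings your deployment needs are left out.

```python
import shlex


def build_docker_run_command(image, async_indexing=True, host_port=8080):
    """Return the argv list for starting a Weaviate container.

    `image` is whatever Weaviate image and tag you normally run
    (hypothetical placeholder in the example below).
    """
    # Map the chosen host port to the container's port 8080.
    cmd = ["docker", "run", "--rm", "-p", f"{host_port}:8080"]
    if async_indexing:
        # The environment variable mentioned in this thread; it is
        # read at startup, so the container must be restarted.
        cmd += ["-e", "ASYNC_INDEXING=true"]
    cmd.append(image)
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(build_docker_run_command("my-weaviate-image:tag")))
```

The resulting list can be handed to subprocess.run once the image name is replaced with a real one. With Docker Compose, the equivalent is an entry under the service's environment section, followed by recreating the container.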