Hi,
I have a custom model that produces text vectors, and I need to index around a million documents with Weaviate on an Azure VM. How can I index these documents once and then reuse the index as many times as I want? Is there any documentation or demo available? I am currently testing Weaviate locally with Docker. With a small sample I can simply re-index every time I rerun the code, but how do I handle this with a million documents, using Weaviate and LlamaIndex? Is there a way to use Azure instances, save the whole index to a disk, and just point the indexer at that disk whenever it is needed?
Thanks in advance
Hi @aravindarajanS ! Welcome to our community!
You can:
- Back up your data using Azure, S3 or GCS, then restore it on your destination
- Use this migration tool to migrate your data from one cluster to another (or from one collection to another in the same cluster, for example)
The key difference is that method 1 is basically a copy and paste of your data and index (faster), while method 2 builds the index all over again: only the index is rebuilt, the vectors themselves are reused (slower).
Method 2 comes in handy whenever you want to change an immutable property or configuration in a collection.
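For method 1, here is a rough sketch of what creating and restoring a backup could look like through Weaviate's REST backup endpoints, using only the Python standard library. The URL, the backup ID and the collection name are placeholders I made up for illustration, and I am assuming the `backup-azure` module is enabled on your instance (the same calls should work with `filesystem`, `s3` or `gcs` as the backend). Treat it as a starting point rather than a tested recipe for your setup.

```python
import json
import urllib.request

# Assumptions (not from this thread): a Weaviate instance reachable at this URL,
# started with the backup-azure module enabled and its storage credentials set.
WEAVIATE_URL = "http://localhost:8080"
BACKEND = "azure"  # other backends: "filesystem", "s3", "gcs"


def build_backup_request(backup_id, collections=None,
                         base_url=WEAVIATE_URL, backend=BACKEND):
    """Return (url, payload) for creating a backup."""
    payload = {"id": backup_id}
    if collections:
        # Limit the backup to specific collections; omit to back up everything.
        payload["include"] = list(collections)
    return f"{base_url}/v1/backups/{backend}", payload


def build_restore_request(backup_id, base_url=WEAVIATE_URL, backend=BACKEND):
    """Return (url, payload) for restoring a previously created backup."""
    return f"{base_url}/v1/backups/{backend}/{backup_id}/restore", {}


def post_json(url, payload):
    """POST a JSON body and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Hypothetical names: "my-docs-backup-1" and "Document" are placeholders.
    url, payload = build_backup_request("my-docs-backup-1", ["Document"])
    print(post_json(url, payload))

    # Later, on the target instance (after the backup has finished):
    # url, payload = build_restore_request("my-docs-backup-1")
    # print(post_json(url, payload))
```

Backups run asynchronously, so you would normally poll `GET /v1/backups/{backend}/{backup_id}` until the status reports success before restoring. A restore also expects that the collection does not already exist on the target instance.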
Let me know if this helps
Thanks!