One-time indexing setup with Weaviate on Azure

aravindarajanS · January 26, 2024, 3:17pm

Hi,

I have a custom model that produces text vectors, and I need to index about a million documents with Weaviate on an Azure VM. How can I index these documents once and then reuse the index as many times as I want? Is there any documentation or demo available? I am currently testing Weaviate locally with Docker. With a small sample I can simply re-index every time I rerun the code, but how do I handle this with millions of documents, using Weaviate and LlamaIndex? Is there a solution that uses Azure instances, saves the whole index to disk, and lets me point the indexer at that disk whenever needed?

Thanks in advance

DudaNogueira · January 26, 2024, 6:03pm

Hi @aravindarajanS ! Welcome to our community!

You can:

1. Back up your data to Azure, S3, or GCS, then restore it at your destination.
2. Use this migration tool to migrate your data from one cluster to another (or from one collection to another in the same cluster, for example).

The key difference is that method 1 is basically a copy and paste of your data and index (faster), while method 2 indexes your data all over again: only the index is rebuilt, and the vectors are reused (slower).

Method 2 comes in handy whenever you want to change an immutable property or configuration in a collection.
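
For method 1, here is a minimal sketch of what the backup and restore calls could look like against Weaviate's REST backup endpoints (`/v1/backups/{backend}`), using only the Python standard library. It assumes the `backup-azure` module is enabled on the instance and configured with a storage container. The base URL, the backup id `docs-2024-01`, and the collection name `Document` are made up for illustration, so please check the backup docs for the exact settings of your version.

```python
import json
import urllib.request


def backup_request(base_url, backend, backup_id, collections=None):
    """Build the POST request that asks Weaviate to create a backup.

    backend is the storage backend name, e.g. "azure", "s3", "gcs".
    collections is an optional list of collection names to include;
    if omitted, no "include" filter is sent.
    """
    body = {"id": backup_id}
    if collections:
        body["include"] = list(collections)
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/backups/{backend}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def restore_request(base_url, backend, backup_id):
    """Build the POST request that restores an existing backup by id."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/backups/{backend}/{backup_id}/restore",
        data=json.dumps({}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send a prepared request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage (hypothetical URL, id, and collection name):
#   send(backup_request("http://localhost:8080", "azure", "docs-2024-01", ["Document"]))
#   ...later, on the instance that should serve the data:
#   send(restore_request("http://localhost:8080", "azure", "docs-2024-01"))
```

With this approach you run the expensive import once, create the backup, and restore it on any new instance instead of re-indexing. The official Python client wraps the same endpoints, so you may prefer it over raw HTTP.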

Let me know if this helps

Thanks!
