Need Programming Language Agnostic Migration Instructions

SomebodySysop · January 29, 2024, 8:07am

I need to move to a higher availability cluster. The instructions I received from support:

In order to take full advantage of multiple nodes, your class must be configured to have multiple shards or replicate the data in multiple nodes.

So, simply adding a new node will not be enough to have a multiple node cluster. This will be possible in the future with dynamic scaling feature.

With that said, the best solution is to create a new cluster (marking it as HA) and move your data there.

For that, you can create your class in your new cluster, specifying the sharding and replication config, and move your data over using this migration guide:

Migrate data | Weaviate - Vector Database

Not sure about most of this, but in looking at the migration documentation, I know for sure I need instructions on how to do this using API commands (e.g. cURL), as I do not use Python at all.

So, what I get so far is that I need to create a new cluster, and then a new class object within that cluster (ideally, a duplicate of my existing class). But from there, I need some instructions on how to do the migration itself.

I also would like to take advantage of the new OpenAI embedding models.

DudaNogueira · January 29, 2024, 1:42pm

Hi!

While you can do it with curl alone, you will need some data handling to process the responses before reindexing them on your new cluster.

What that migration script/guide does is basically get all objects from your old cluster, using the cursor API.

Then it re-ingests them into your new cluster using this API endpoint.
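For readers who want to see that loop without the Python client, here is a minimal sketch in TypeScript. It assumes the cursor API is the `after` parameter on `GET /v1/objects` and that the ingest endpoint is `POST /v1/batch/objects`; the thread does not name either explicitly. `HttpSend`, `ExportedObject`, `listUrl`, and `migrateClass` are hypothetical names made up for illustration. Each `send` call corresponds to a single HTTP request, so the same steps can be reproduced with curl plus a little JSON handling.

```typescript
// Sketch only: copy one class between clusters over plain REST.
// Assumed endpoints: GET /v1/objects (cursor via `after`) and POST /v1/batch/objects.
// HttpSend, ExportedObject, listUrl and migrateClass are hypothetical names.

interface ExportedObject {
  id: string;
  class: string;
  properties: Record<string, unknown>;
  vector?: number[];
}

// Any HTTP client can back this (it should also add your Authorization header).
type HttpSend = (method: "GET" | "POST", url: string, body?: unknown) => Promise<unknown>;

// Build the URL for one page of the cursor listing.
// The first page omits `after`; later pages pass the last id already seen.
function listUrl(baseUrl: string, className: string, limit: number, after?: string): string {
  const params = [
    `class=${encodeURIComponent(className)}`,
    `limit=${limit}`,
    "include=vector",
  ];
  if (after !== undefined) {
    params.push(`after=${encodeURIComponent(after)}`);
  }
  return `${baseUrl}/v1/objects?${params.join("&")}`;
}

// Page through the old cluster and re-post each page to the new one.
// Returns the number of objects sent (not the number the server accepted).
async function migrateClass(
  send: HttpSend,
  oldUrl: string,
  newUrl: string,
  className: string,
  pageSize = 100,
): Promise<number> {
  let after: string | undefined = undefined;
  let copied = 0;
  for (;;) {
    const page = (await send("GET", listUrl(oldUrl, className, pageSize, after))) as {
      objects?: ExportedObject[];
    };
    const objects = page.objects ?? [];
    if (objects.length === 0) {
      return copied; // an empty page means the cursor reached the end
    }
    // Keep only the fields needed to recreate each object.
    const trimmed = objects.map((o) => ({
      id: o.id,
      class: o.class,
      properties: o.properties,
      vector: o.vector,
    }));
    // The batch response lists per-object errors; a real script should inspect it.
    await send("POST", `${newUrl}/v1/batch/objects`, { objects: trimmed });
    copied += objects.length;
    after = objects[objects.length - 1].id;
  }
}
```

The `send` function is injected so that the paging logic stays independent of the HTTP client. A thin wrapper around whatever client you prefer is enough, as long as it sends JSON and returns the parsed response.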

Note that the Python v4 client uses a gRPC connection, which makes it much more performant than HTTP/REST. So if you have a lot of data, the Python client comes in handy.

Let me know if that helps.

SomebodySysop · January 29, 2024, 7:23pm

My other alternative is to simply create the new cluster, create a new class object, and re-embed my data directly to the new object. Correct?

DudaNogueira · January 29, 2024, 7:33pm

That’s the same alternative… isn’t it?

Whether you use the Python client or curl, you are basically moving data around.

As you want to use the new OpenAI embedding models, you will create the class first, with the proper configuration, and then copy the data from the old cluster.

However, you will not provide the vector when inserting those objects. This will trigger Weaviate to vectorize your data, now using the new model.
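As a rough illustration of those two steps, the sketch below builds a class definition for `POST /v1/schema` and strips the stored vector from an exported object before it is re-inserted. Several things here are assumptions, not details from this thread: the `text2vec-openai` vectorizer, the `model`, `shardingConfig.desiredCount`, and `replicationConfig.factor` fields, and the example values. Whether a given model name is accepted depends on your Weaviate version. `classDefinition` and `stripVector` are hypothetical helper names.

```typescript
// Sketch only: request bodies for creating the class and re-inserting objects
// so that the new cluster computes fresh vectors.
// classDefinition and stripVector are hypothetical helper names.

interface StoredObject {
  id: string;
  class: string;
  properties: Record<string, unknown>;
  vector?: number[];
}

// Body for POST /v1/schema on the new cluster.
// Field names follow the Weaviate class schema as assumed here; check them
// against the documentation for your version before use.
function classDefinition(className: string, model: string, shards: number, replicas: number) {
  return {
    class: className,
    vectorizer: "text2vec-openai",
    moduleConfig: { "text2vec-openai": { model: model } },
    shardingConfig: { desiredCount: shards },
    replicationConfig: { factor: replicas },
  };
}

// Drop the old vector so the insert carries no embedding; with a vectorizer
// configured on the class, Weaviate then vectorizes the object itself.
function stripVector(obj: StoredObject): { id: string; class: string; properties: Record<string, unknown> } {
  return { id: obj.id, class: obj.class, properties: obj.properties };
}
```

For example, `classDefinition("Article", "text-embedding-3-small", 3, 3)` yields a body asking for three shards and a replication factor of three, where the class name and counts are placeholders. The same JSON can be sent with curl using `-d`.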

If you have this data elsewhere, for example in a pipeline that extracts it and loads it into Weaviate, you could use that pipeline to reindex your data on your new cluster with the new OpenAI model.

Let me know if this clarifies it for you.

Also, feel free to ping me in our Slack if you need more clarification or want to share more details.

Thanks!

SomebodySysop · January 29, 2024, 7:52pm

I do not want to copy the data from the old cluster as it requires using Python, and I don’t want to be forced to use Python.

What I am saying is that I want to create a new cluster, create a new object, then create a NEW embedding with my source content. It seems easier to me to simply proceed as if I were starting from the beginning.
