Changing the vectorization model of a collection would trigger a re-vectorization of the entire collection, which Weaviate does not support.
Neither that nor adding a named vector to an existing collection is possible for now, because we have not implemented async vectorization.
The only way for now is to create a new collection with the new vectorizer configuration, and then copy the data from the old collection to the new one.
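A minimal sketch of that copy step, assuming the Python client v4 (`weaviate-client`) and assuming the new collection has already been created with the new vectorizer configuration. The helper name `copy_objects` and the collection names in the usage comment are hypothetical. Vectors are left behind on purpose, so the new collection's vectorizer should embed each object again on insert.

```python
from typing import Any


def copy_objects(source: Any, target: Any) -> int:
    """Copy all objects from `source` into `target` and return the count.

    Both arguments are assumed to be weaviate-client v4 collection handles.
    Only properties and UUIDs are copied; the old vectors are not carried over.
    """
    copied = 0
    # Batch the inserts instead of sending one request per object.
    with target.batch.dynamic() as batch:
        # iterator() walks the whole collection; vectors are not requested.
        for obj in source.iterator():
            batch.add_object(properties=obj.properties, uuid=obj.uuid)
            copied += 1
    return copied


# Hypothetical usage against a local instance (names are placeholders):
#   import weaviate
#   client = weaviate.connect_to_local()
#   try:
#       old = client.collections.get("OldCollection")
#       new = client.collections.get("NewCollection")
#       print(copy_objects(old, new))
#   finally:
#       client.close()
```

After a run like this it is probably worth checking `target.batch.failed_objects` and comparing object counts before deleting the old collection.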
What I thought I understood was that I needed to create a new schema, which would create a new collection, and then migrate the data from the old collection to the new.
So, in my case, I would create a new schema SolrCopy02 which would then create the SolrCopy02 collection. Then I would migrate the SolrCopy01 collection data to the SolrCopy02 collection.
What you wrote above doesn’t sound like that process. Or, I’m just not understanding the terminology. But, as gpt-3.5-turbo is a legacy model, I probably need to update soon.
When you configure the vectorizer, it means that your data is going to be vectorized with that model.
So, for example, let’s say you select a vectorizer that produces vectors with 300 dimensions.
Now you want to change to a different model with, let’s say, 1536 dimensions.
You will need to vectorize all your content again, because the vectors came from different models. Even if they had the same dimensionality, they came from different models.
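To make the point concrete, here is a small sketch in plain Python (no Weaviate involved; `cosine_similarity` is a hypothetical helper, and the vectors are made-up toy values). With different dimensionalities the comparison cannot even be computed. With equal dimensionalities it can be computed, but the number would be meaningless if the two vectors came from different models.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors of the same length."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("cannot compare a zero vector")
    return dot / norm


# Toy stand-ins for a stored 300-dimension vector and a 1536-dimension query vector.
old_model_vector = [0.1] * 300
new_model_query = [0.1] * 1536

try:
    cosine_similarity(old_model_vector, new_model_query)
except ValueError as err:
    print(f"search would fail: {err}")
```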
So in order to change the vectorizer, you need both to configure the new model and to vectorize all your content again.
So the vectorizer configuration of a collection is not mutable.
Since Weaviate 1.27, the generative configuration is mutable. This means that if you configured your collection to use, for example, Cohere as the generative module, you can change it to OpenAI.
Models such as gpt-3.5 and gpt-4 belong to the generative configuration, not to the vectorizer.
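As a rough sketch of what that change could look like with the Python client v4: the wrapper `switch_generative` is hypothetical, while `collection.config.update(generative_config=...)` and `Reconfigure.Generative` are, to my knowledge, the client's own names. Treat the exact arguments as an assumption and check them against your client version.

```python
from typing import Any


def switch_generative(collection: Any, generative_config: Any) -> None:
    """Replace the generative module settings of an existing collection.

    `collection` is assumed to be a weaviate-client v4 collection handle.
    The vectorizer and the stored vectors are not touched by this call.
    """
    collection.config.update(generative_config=generative_config)


# Hypothetical usage (collection name is a placeholder):
#   import weaviate
#   from weaviate.classes.config import Reconfigure
#   client = weaviate.connect_to_local()
#   try:
#       collection = client.collections.get("MyCollection")
#       switch_generative(collection, Reconfigure.Generative.openai())
#   finally:
#       client.close()
```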
From the explanation above, it sounds like all I need to do is update the schema and I’m done. But, you also state that since I’m changing the vectorizer, I need to re-vectorize the content – which makes sense. And if that’s the case, I’m really talking about migrating content from SolrCopy01 to SolrCopy02, as opposed to just modifying the schema of SolrCopy01.
From what you wrote, you do not actually want to change the vectorizer: both of your schemas set it to text2vec-openai. So the embedding model stays the same, and you can keep your collection with its precalculated embeddings.
The generative part is different from the vectorizer: it is used in RAG to produce a response to a user query at runtime, whereas embeddings are calculated during data ingestion. So it makes sense that Weaviate has no problem with the user changing that setting later on. Hope that clarifies things. If not, maybe check out the documentation to read more on the topic: Generative AI | Weaviate