How to create vector embedding for multiple fields

Rishi_Prakash · October 6, 2024, 11:48am

I want to store embeddings separately for two fields. For example, I am creating a schema like this:

skills = self.client.collections.create(
    name=self.collection_name,
    vectorizer_config=wvc.config.Configure.Vectorizer.text2vec_azure_openai(
        deployment_id="text-embedding-ada-002",
        resource_name="",
        vectorize_collection_name=True
    ),
    properties=[
        wvc.config.Property(
            name="skill_name",
            data_type=wvc.config.DataType.TEXT,
            vectorize_property_name=True
        ),
        wvc.config.Property(
            name="description",
            data_type=wvc.config.DataType.TEXT,
            vectorize_property_name=True
        ),
        wvc.config.Property(
            name="etldatetime",
            data_type=wvc.config.DataType.TEXT,
            skip_vectorization=True
        ),
    ]
)

Now I want the skill_name and description embeddings to be created separately.
Please share the query for this, and also how to search the collection in the same way. Thanks

DudaNogueira · October 6, 2024, 5:39pm

hi @Rishi_Prakash !!

This is a named vector (multi vector) use case.

When you define a single vectorizer, as you have done, Weaviate concatenates the text of all "vectorizable" properties and produces one vector per object.

Now, if you want to create vectors for specific properties (a single property or several), you should define different named vectors, as stated here:

With that said, this is how your collection should be created:

import weaviate.classes as wvc

# `client` is assumed to be an already-connected Weaviate client.
# Drop any earlier copy of the collection so the example can be re-run.
client.collections.delete("Test")

skills = client.collections.create(
    name="Test",
    # One named vector per field (or group of fields) to embed separately.
    vectorizer_config=[
        wvc.config.Configure.NamedVectors.text2vec_openai(
            name="skill_vector",
            vectorize_collection_name=True,
            # Remove "etldatetime" here if this vector should use skill_name only.
            source_properties=["skill_name", "etldatetime"],
        ),
        wvc.config.Configure.NamedVectors.text2vec_openai(
            name="description_vector",
            vectorize_collection_name=True,
            source_properties=["description"],
        ),
    ],
    properties=[
        wvc.config.Property(
            name="skill_name",
            data_type=wvc.config.DataType.TEXT,
            vectorize_property_name=True,
        ),
        wvc.config.Property(
            name="description",
            data_type=wvc.config.DataType.TEXT,
            vectorize_property_name=True,
        ),
        wvc.config.Property(
            name="etldatetime",
            data_type=wvc.config.DataType.TEXT,
            skip_vectorization=True,
        ),
    ],
)

# Insert one object; each named vector is computed from its source properties.
skills.data.insert({
    "skill_name": "this is a skill",
    "description": "This is a skill desc",
    "etldatetime": "some etldatetime",
})

Note that you will now get two vectors, as below:

o = skills.query.fetch_objects(include_vector=True).objects[0]
o.vector.keys()

Output:

dict_keys(['skill_vector', 'description_vector'])
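
The question also asked how to search. Here is a minimal sketch, assuming the Weaviate Python client v4 and the `skills` collection created above: `near_text` takes a `target_vector` argument that selects which named vector the query is compared against. The helper `search_field` and the `FIELD_TO_VECTOR` mapping are hypothetical names introduced for illustration; they are not part of the client.

```python
from typing import Any, List

# Hypothetical mapping from a field name to the named vector built from it.
# The vector names match the collection definition above.
FIELD_TO_VECTOR = {
    "skill_name": "skill_vector",
    "description": "description_vector",
}


def search_field(collection: Any, field: str, query: str, limit: int = 3) -> List[Any]:
    """Run a near_text search against the named vector that covers `field`."""
    if field not in FIELD_TO_VECTOR:
        raise ValueError(f"No named vector configured for field {field!r}")
    response = collection.query.near_text(
        query=query,
        target_vector=FIELD_TO_VECTOR[field],  # choose which named vector to search
        limit=limit,
    )
    return response.objects
```

For example, `search_field(skills, "skill_name", "a skill")` would search `skill_vector`, while passing `"description"` would search `description_vector`. With more than one named vector in a collection, the target has to be stated explicitly like this, since there is no single default vector to fall back on.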

Let me know if this helps!

Thanks!
