How to create vector embedding for multiple fields

I want to store embeddings separately for two fields. For example, I am creating a schema like this:

skills = self.client.collections.create(
    name=self.collection_name,
    vectorizer_config=wvc.config.Configure.Vectorizer.text2vec_azure_openai(
        deployment_id="text-embedding-ada-002",
        resource_name="",
        vectorize_collection_name=True
    ),
    properties=[
        wvc.config.Property(
            name="skill_name",
            data_type=wvc.config.DataType.TEXT,
            vectorize_property_name=True
        ),
        wvc.config.Property(
            name="description",
            data_type=wvc.config.DataType.TEXT,
            vectorize_property_name=True
        ),
        wvc.config.Property(
            name="etldatetime",
            data_type=wvc.config.DataType.TEXT,
            skip_vectorization=True
        ),
    ]
)

Now I want the skill_name and description embeddings to be created separately.
Could you share the query for that, and also show how to search the collection in the same way? Thanks.

Hi @Rishi_Prakash!

This is a named vector (multi vector) use case.

When you define a single vectorizer, as you have done, Weaviate concatenates all “vectorizable” properties and produces one vector per object.
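To make the difference concrete, here is a rough sketch in plain Python of what "one concatenated input" versus "one input per field" means. It is only an illustration: the function name build_vectorizer_input and the exact joining format are assumptions of mine, not Weaviate's actual internal logic.

```python
def build_vectorizer_input(obj, source_properties):
    """Join the chosen text properties into one string to embed.

    Illustrative only: the real text Weaviate sends to the embedding
    model may be assembled differently.
    """
    parts = []
    for prop in sorted(source_properties):
        value = obj.get(prop)
        if isinstance(value, str):
            parts.append(f"{prop} {value}")
    return " ".join(parts)


record = {"skill_name": "Python", "description": "General-purpose language"}

# Single vectorizer: both fields end up in one input, hence one vector.
combined = build_vectorizer_input(record, ["skill_name", "description"])

# Named vectors: each vector gets its own input, hence separate vectors.
per_field = {
    "skill_vector": build_vectorizer_input(record, ["skill_name"]),
    "description_vector": build_vectorizer_input(record, ["description"]),
}
```

With a single input, a query about the description also matches on the skill name text. With separate inputs, each field can be searched on its own.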

Now, if you want to create vectors for specific properties (a single property or several), you should define a separate named vector for each, as described in the Weaviate documentation on named vectors.

With that said, this is how your collection should be created:

import weaviate.classes as wvc

# `client` is assumed to be an already connected Weaviate client (Python client v4).
client.collections.delete("Test")  # start from a clean collection
skills = client.collections.create(
    name="Test",
    vectorizer_config=[
        # Named vector built from skill_name and etldatetime
        wvc.config.Configure.NamedVectors.text2vec_openai(
            name="skill_vector",
            vectorize_collection_name=True,
            source_properties=["skill_name", "etldatetime"]
        ),
        # Named vector built from description only
        wvc.config.Configure.NamedVectors.text2vec_openai(
            name="description_vector",
            vectorize_collection_name=True,
            source_properties=["description"]
        )
    ],
    properties=[
        wvc.config.Property(
            name="skill_name",
            data_type=wvc.config.DataType.TEXT,
            vectorize_property_name=True
        ),
        wvc.config.Property(
            name="description",
            data_type=wvc.config.DataType.TEXT,
            vectorize_property_name=True
        ),
        wvc.config.Property(
            name="etldatetime",
            data_type=wvc.config.DataType.TEXT,
            skip_vectorization=True
        ),
    ]
)
# Insert one test object; both named vectors are generated on insert.
skills.data.insert({
    "skill_name": "this is a skill",
    "description": "This is a skill desc",
    "etldatetime": "some etldatetime"
})

Note that each object now has two vectors, as shown below:

o = skills.query.fetch_objects(include_vector=True).objects[0]
o.vector.keys()

Output:

dict_keys(['skill_vector', 'description_vector'])
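As for searching: with named vectors, a query has to say which vector it should run against. Here is a minimal sketch, assuming the v4 Python client, where near_text accepts a target_vector argument. The helper name search_named_vector is just something I made up for illustration, and `collection` is expected to be a collection handle such as the `skills` object created above.

```python
def search_named_vector(collection, query, target_vector, limit=3):
    """Run a near-text search against one named vector.

    `collection` is assumed to be a Weaviate v4 collection handle.
    Returns the properties of the matching objects.
    """
    response = collection.query.near_text(
        query=query,
        target_vector=target_vector,  # e.g. "skill_vector" or "description_vector"
        limit=limit,
    )
    return [obj.properties for obj in response.objects]
```

For example, search_named_vector(skills, "a skill", "skill_vector") searches only the vector built for the skill name, while passing "description_vector" searches the description embedding instead.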

Let me know if this helps!

Thanks!
