How to create vector embedding for multiple fields

Hi @Rishi_Prakash!

This is a named vector (multi vector) use case.

When you define a single vectorizer, as you have done, Weaviate concatenates all “vectorizable” properties and produces one vector from the combined text.
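To make the concatenation concrete, here is a rough, simplified sketch in plain Python. It is not Weaviate's actual implementation: `build_vectorizer_input` is a hypothetical helper, it assumes text properties are sorted alphabetically, joined with spaces, and lowercased, and it ignores per-property skip flags. The optional `source_properties` argument mimics what a named vector does when it restricts the input to chosen properties.

```python
def build_vectorizer_input(collection_name, obj, source_properties=None,
                           vectorize_collection_name=True,
                           vectorize_property_name=False):
    """Approximate the text a text2vec module might embed for one object.

    Hypothetical helper for illustration only; real behavior may differ.
    """
    # Keep string-valued properties, optionally restricted to source_properties,
    # in alphabetical order (assumed default ordering).
    names = sorted(
        name for name, value in obj.items()
        if isinstance(value, str)
        and (source_properties is None or name in source_properties)
    )
    parts = [collection_name] if vectorize_collection_name else []
    for name in names:
        if vectorize_property_name:
            parts.append(name)
        parts.append(obj[name])
    # Join with spaces and lowercase the result.
    return " ".join(parts).lower()


obj = {
    "skill_name": "this is a skill",
    "description": "This is a skill desc",
    "etldatetime": "some etldatetime",
}

# Single vectorizer: every text property ends up in one input string.
single = build_vectorizer_input("Test", obj)
# Named-vector style: only the listed properties are used.
skill_only = build_vectorizer_input("Test", obj, ["skill_name", "etldatetime"])
print(single)
print(skill_only)
```

With a single vectorizer, a change to any text property changes the one vector; restricting the input per vector is what lets each vector represent a different slice of the object.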

Now, if you want to create vectors for specific properties (a single property or several), you should define separate named vectors, as stated here:

With that said, this is how your collection should be created:

import weaviate.classes as wvc

# Assumes `client` is an already-connected Weaviate client (v4 Python client).
client.collections.delete("Test")  # start from a clean collection
skills = client.collections.create(
    name="Test",
    vectorizer_config=[
        # Named vector built from skill_name and etldatetime
        wvc.config.Configure.NamedVectors.text2vec_openai(
            name="skill_vector",
            vectorize_collection_name=True,
            source_properties=["skill_name", "etldatetime"],
        ),
        # Named vector built from description only
        wvc.config.Configure.NamedVectors.text2vec_openai(
            name="description_vector",
            vectorize_collection_name=True,
            source_properties=["description"],
        ),
    ],
    properties=[
        wvc.config.Property(
            name="skill_name",
            data_type=wvc.config.DataType.TEXT,
            vectorize_property_name=True,
        ),
        wvc.config.Property(
            name="description",
            data_type=wvc.config.DataType.TEXT,
            vectorize_property_name=True,
        ),
        wvc.config.Property(
            name="etldatetime",
            data_type=wvc.config.DataType.TEXT,
            skip_vectorization=True,
        ),
    ],
)

# Insert one object; each named vector is generated from its source properties
skills.data.insert({
    "skill_name": "this is a skill",
    "description": "This is a skill desc",
    "etldatetime": "some etldatetime",
})

Note that you will now get two vectors, as shown below:

o = skills.query.fetch_objects(include_vector=True).objects[0]
o.vector.keys()

Output:

dict_keys(['skill_vector', 'description_vector'])
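Once an object carries several named vectors, a search has to say which one to compare against. The following is a minimal sketch, assuming the v4 Python client's `near_text` query with its `target_vector` argument; `search_named_vector` is a hypothetical wrapper name, not part of the client.

```python
def search_named_vector(collection, query, target_vector, limit=3):
    """Run a near_text search against one named vector.

    `collection` is expected to be a Weaviate collection handle such as
    `skills`; returns the properties of each matching object.
    """
    response = collection.query.near_text(
        query=query,
        target_vector=target_vector,  # e.g. "skill_vector" or "description_vector"
        limit=limit,
    )
    return [o.properties for o in response.objects]
```

For example, `search_named_vector(skills, "a skill", "description_vector")` would match the query only against the vector built from `description`.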

Let me know if this helps!

Thanks!
