[Docs] Image+text hybrid search on cross references

Hi Weaviate Community,

I am currently working with the multi2vec-clip module to perform hybrid text+image searches.

I have a class called Dog that includes the following properties:

  • breed
  • color
  • description
  • A one-to-many cross-reference to a collection called Image.

The Image class has an encodedImage property that stores the blob data of the image.

I have two questions:

  1. How can I perform a hybrid search that includes the breed, color, and description properties from the Dog class and the cross-referenced encodedImage property from the Image class?

  2. Can a search be performed directly on cross-referenced properties, or would it be better to store the encodedImage property in the Dog class itself?

Below is the code so far:


    var hybridSearch *graphql.HybridArgumentBuilder
    var nearImage *graphql.NearImageArgumentBuilder


    textQuery := searchQuery.Breed + " " + searchQuery.Color + " " + searchQuery.Description

    // Handle image input searches.
    if searchQuery.Image != nil && searchQuery.Image.EncodedImage != nil {
        // Base64-encode the raw image bytes
        encodedImage := base64.StdEncoding.EncodeToString(searchQuery.Image.EncodedImage)

        nearImage = db.Client.GraphQL().NearImageArgBuilder().
            WithImage(encodedImage).
            WithDistance(0.7)
    }


    hybridSearch = db.Client.GraphQL().HybridArgumentBuilder().
        WithQuery(textQuery).
        WithAlpha(0.5)

    // GraphQL fields; the cross-reference needs an inline fragment ("... on Image")
    fields := []graphql.Field{
        {Name: "breed"},
        {Name: "color"},
        {Name: "description"},
        {Name: "_additional", Fields: []graphql.Field{
            {Name: "score"}, // hybrid returns score, not certainty
        }},
        {Name: "images", Fields: []graphql.Field{
            {Name: "... on Image", Fields: []graphql.Field{
                {Name: "encodedImage"},
            }},
        }},
    }

    query := db.Client.GraphQL().Get().
        WithClassName("Dog"). // Search can only be done on Dog? How about its cross-referenced Image collection?
        WithHybrid(hybridSearch).
        WithFields(fields...)

    // Only attach nearImage when an image was actually provided
    if nearImage != nil {
        query = query.WithNearImage(nearImage)
    }

    result, err := query.Do(context.Background())

Hi @Sik819, welcome to the community.

Unfortunately, you can’t query two separate collections with a single query.
Cross-references let you pull referenced data from another collection, but the query itself (both the vector and the keyword search) runs only on the queried collection.

So: no, a search can’t be performed directly on cross-referenced properties, and yes, it would be better to store the image in the Dog class itself. You need to bring both the image and the other properties into the same class.
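To illustrate, here is a minimal sketch of such a flattened schema with the Python client (untested; the connection call and vectorizer config are assumptions, the property names come from your post):

import weaviate
from weaviate.classes.config import Configure, DataType, Property

client = weaviate.connect_to_local()

# One collection that holds the text properties and the image together,
# so a single query can search across all of them.
client.collections.create(
    name="Dog",
    vectorizer_config=Configure.Vectorizer.multi2vec_clip(
        text_fields=["breed", "color", "description"],
        image_fields=["image"],
    ),
    properties=[
        Property(name="breed", data_type=DataType.TEXT),
        Property(name="color", data_type=DataType.TEXT),
        Property(name="description", data_type=DataType.TEXT),
        Property(name="image", data_type=DataType.BLOB),
    ],
)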

Side note

By the way, CLIP lets you vectorize both text (e.g. breed, color) and images,
meaning your vector queries will work across the concepts in both the text and image properties.

However, please note that CLIP is not as strong at text search as embedding models that specialise in working with text.

It might still work for what you need it to be, but there might be limitations :wink:

Thanks, @sebawita!

I tried bringing the encodedImage property into the Dog class, but it seems arrays for base64 blob properties aren’t supported. Is there any alternative way to store multiple images within the same class?

Given the limitations with CLIP and the challenge of storing multiple images (if no workaround is available), would it make more sense to perform two separate searches—one for text and one for images—and then manually combine and analyze the results?

You are correct: you can’t provide an array of images.
However, you could use a trick and map the images to multiple properties (warning! I haven’t tried this, so it might not work :stuck_out_tongue: ). Like this:

from weaviate.classes.config import Configure, DataType, Property

client.collections.create(
    name="MyCollection",
    vectorizer_config=Configure.Vectorizer.multi2vec_clip(
        text_fields=["text"],
        image_fields=["image1", "image2", "image3"],
    ),
    properties=[
        Property(name="text", data_type=DataType.TEXT),
        Property(name="image1", data_type=DataType.BLOB),
        Property(name="image2", data_type=DataType.BLOB),
        Property(name="image3", data_type=DataType.BLOB),
    ],
)

Then, when you insert an object, you can move the base64 images from the array to each property.
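For example, an insert could look roughly like this (untested sketch; `dog_images` is assumed to be your list of base64-encoded strings):

my_collection = client.collections.get("MyCollection")

# Spread the base64 images from the array across the numbered blob properties.
my_collection.data.insert({
    "text": "golden retriever, light brown, friendly",
    "image1": dog_images[0],
    "image2": dog_images[1],
    "image3": dog_images[2],
})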

Note: with this approach, you will get one vector embedding that is an average of all the images.

If you need a separate vector embedding per image, then you need to use named vectors:

client.collections.create(
    name="MyCollection",
    vectorizer_config=[
        # One named vector per image, each paired with the text property.
        Configure.NamedVectors.multi2vec_clip(
            name="first_vector",
            text_fields=["text"],
            image_fields=["image1"],
        ),
        Configure.NamedVectors.multi2vec_clip(
            name="second_vector",
            text_fields=["text"],
            image_fields=["image2"],
        ),
        # ...
    ],
    properties=[
        Property(name="text", data_type=DataType.TEXT),
        Property(name="image1", data_type=DataType.BLOB),
        Property(name="image2", data_type=DataType.BLOB),
        Property(name="image3", data_type=DataType.BLOB),
    ],
)

Yes, that could work. Instead of two fully separate searches, though, you could use the named-vectors approach:
have one named vector for your images and another named vector for your text properties, then target each one as needed.
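A rough sketch of querying the two vector spaces separately (untested; the names "text_vector" and "image_vector" assume a schema where one named vector covers the text fields and another covers an image field, and `encoded_image` is assumed to be the base64 string of your query image):

dogs = client.collections.get("MyCollection")

# Keyword + vector (hybrid) search against the text vector space.
text_results = dogs.query.hybrid(
    query="brown labrador",
    target_vector="text_vector",
    alpha=0.5,
    limit=10,
)

# Pure vector search against the image vector space.
image_results = dogs.query.near_image(
    near_image=encoded_image,
    target_vector="image_vector",
    limit=10,
)

You can then combine or re-rank the two result sets in your application code, as you suggested.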