[Docs] Image+text hybrid search on cross references

Hi Weaviate Community,

I am currently working with the multi2vec-clip module to perform hybrid text+image searches.

I have a class called Dog that includes the following properties:

  • breed
  • color
  • description
  • A one-to-many cross-reference to a collection called Image.

The Image class has an encodedImage property that stores the blob data of the image.

I have two questions:

  1. How can I perform a hybrid search that includes the breed, color, and description properties from the Dog class and the cross-referenced encodedImage property from the Image class?

  2. Can a search be performed directly on cross-referenced properties, or would it be better to store the encodedImage property in the Dog class itself?

Below is the code so far:


    var hybridSearch *graphql.HybridArgumentBuilder
    var nearImage *graphql.NearImageArgumentBuilder


    textQuery := searchQuery.Breed + " " + searchQuery.Color + " " + searchQuery.Description

    // Handle image input searches.
    if searchQuery.Image != nil && searchQuery.Image.EncodedImage != nil {
        // Base64-encode the raw image bytes
        encodedImage := base64.StdEncoding.EncodeToString(searchQuery.Image.EncodedImage)

        nearImage = db.Client.GraphQL().NearImageArgBuilder().
            WithImage(encodedImage).
            WithDistance(0.7)
    }


    hybridSearch = db.Client.GraphQL().HybridArgumentBuilder().
        WithQuery(textQuery).
        WithAlpha(0.5)

    // GraphQL fields; the cross-reference needs an inline fragment ("... on Image")
    fields := []graphql.Field{
        {Name: "breed"},
        {Name: "color"},
        {Name: "description"},
        {Name: "_additional", Fields: []graphql.Field{
            {Name: "score"}, // hybrid returns score, not certainty
        }},
        {Name: "images", Fields: []graphql.Field{
            {Name: "... on Image", Fields: []graphql.Field{
                {Name: "encodedImage"},
            }},
        }},
    }

    query := db.Client.GraphQL().Get().
        WithClassName("Dog"). // Search can only be done on Dog? How about its cross-referenced Image collection?
        WithHybrid(hybridSearch).
        WithFields(fields...)

    // Only attach nearImage when an image was actually provided
    if nearImage != nil {
        query = query.WithNearImage(nearImage)
    }

    result, err := query.Do(context.Background())

Hi @Sik819, welcome to the community.

Unfortunately, you can’t query two separate collections with a single query.
Cross-references let you pull referenced data from another collection, but the query itself (both the vector and the keyword search) runs only on the queried collection.

So: no, a search can’t be performed directly on cross-referenced properties, and yes, it would be better to store the image in the Dog class itself. You need to bring both the image and the other properties into the same class.
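To illustrate, here is a minimal sketch of such a flattened schema with the Python client (untested; the connection call and vectorizer config are assumptions, the property names come from your post):

import weaviate
from weaviate.classes.config import Configure, DataType, Property

client = weaviate.connect_to_local()

# One collection that holds the text properties and the image together,
# so a single query can search across all of them.
client.collections.create(
    name="Dog",
    vectorizer_config=Configure.Vectorizer.multi2vec_clip(
        text_fields=["breed", "color", "description"],
        image_fields=["image"],
    ),
    properties=[
        Property(name="breed", data_type=DataType.TEXT),
        Property(name="color", data_type=DataType.TEXT),
        Property(name="description", data_type=DataType.TEXT),
        Property(name="image", data_type=DataType.BLOB),
    ],
)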

Side note

By the way, CLIP lets you vectorize both text (e.g. breed, color) and images,
meaning your vector queries will work across the concepts in both the text and image properties.

However, please note that CLIP is not as strong at text search as embedding models that specialise in working with text.

It might still work for what you need it to be, but there might be limitations :wink:

Thanks, @sebawita!

I tried bringing the encodedImage property into the Dog class, but it seems arrays for base64 blob properties aren’t supported. Is there any alternative way to store multiple images within the same class?

Given the limitations with CLIP and the challenge of storing multiple images (if no workaround is available), would it make more sense to perform two separate searches—one for text and one for images—and then manually combine and analyze the results?

You are correct: you can’t provide an array of images.
However, you could use a trick and map the images to multiple properties (warning! I haven’t tried this, so it might not work :stuck_out_tongue: ). Like this:

from weaviate.classes.config import Configure, DataType, Property

client.collections.create(
    name="MyCollection",
    vectorizer_config=Configure.Vectorizer.multi2vec_clip(
        text_fields=["text"],
        image_fields=["image1", "image2", "image3"],
    ),
    properties=[
        Property(name="text", data_type=DataType.TEXT),
        Property(name="image1", data_type=DataType.BLOB),
        Property(name="image2", data_type=DataType.BLOB),
        Property(name="image3", data_type=DataType.BLOB),
    ],
)

Then, when you insert an object, you can move the base64 images from the array to each property.
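For example, an insert could look roughly like this (untested sketch; `dog_images` is assumed to be your list of base64-encoded strings):

my_collection = client.collections.get("MyCollection")

# Spread the base64 images from the array across the numbered blob properties.
my_collection.data.insert({
    "text": "golden retriever, light brown, friendly",
    "image1": dog_images[0],
    "image2": dog_images[1],
    "image3": dog_images[2],
})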

Note: with this approach, you will get one vector embedding that is an average of all the images.

If you need a separate vector embedding per image, then you need to use named vectors:

client.collections.create(
    name="MyCollection",
    vectorizer_config=[
        # One named vector per image, each paired with the text property.
        Configure.NamedVectors.multi2vec_clip(
            name="first_vector",
            text_fields=["text"],
            image_fields=["image1"],
        ),
        Configure.NamedVectors.multi2vec_clip(
            name="second_vector",
            text_fields=["text"],
            image_fields=["image2"],
        ),
        # ...
    ],
    properties=[
        Property(name="text", data_type=DataType.TEXT),
        Property(name="image1", data_type=DataType.BLOB),
        Property(name="image2", data_type=DataType.BLOB),
        Property(name="image3", data_type=DataType.BLOB),
    ],
)

Yes, that could work. Instead of two fully separate searches, though, you could use the named-vectors approach:
have one named vector for your images and another named vector for your text properties, then target each one as needed.
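A rough sketch of querying the two vector spaces separately (untested; the names "text_vector" and "image_vector" assume a schema where one named vector covers the text fields and another covers an image field, and `encoded_image` is assumed to be the base64 string of your query image):

dogs = client.collections.get("MyCollection")

# Keyword + vector (hybrid) search against the text vector space.
text_results = dogs.query.hybrid(
    query="brown labrador",
    target_vector="text_vector",
    alpha=0.5,
    limit=10,
)

# Pure vector search against the image vector space.
image_results = dogs.query.near_image(
    near_image=encoded_image,
    target_vector="image_vector",
    limit=10,
)

You can then combine or re-rank the two result sets in your application code, as you suggested.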