[Question] Multimodal search

Hello,

I’m building a reverse image search using Weaviate. I use CLIP to embed a product’s title and image and store it into the database. I also have additional properties such as product price which aren’t vectorized but might be used for filtering later on. I was wondering if there was a way to do image and text vectorized search without having the vector or object id at hand. So far the only way I have managed to do it is by generating a vector for the image & title I’m using for search, and searching by vector using that. I also wonder what’s the best way to handle deduplication.

Thanks in advance.

Hi @vrano! Welcome to our community :hugs:

To avoid duplicate entries, the best approach is to use deterministic IDs: you generate each object's ID from some unique identifier you already have for it, so the same product always gets the same ID.
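
As a rough illustration of the idea, here is a minimal sketch using only Python's standard library. The function name `product_uuid` and the SKU value are hypothetical, and it assumes each product has some stable unique key. The Weaviate Python client also ships a `generate_uuid5` helper in `weaviate.util` for this purpose, which you would probably use in practice.

```python
import uuid


def product_uuid(unique_key: str) -> str:
    """Derive a stable UUID (version 5) from a product's unique key.

    The same key always yields the same UUID, so the same product
    ends up with the same object ID on every import.
    """
    # NAMESPACE_DNS is an arbitrary but fixed namespace; any constant works
    # as long as it never changes between imports.
    return str(uuid.uuid5(uuid.NAMESPACE_DNS, unique_key))


# Hypothetical SKU: both calls produce the identical ID.
first = product_uuid("sku-12345")
second = product_uuid("sku-12345")
print(first == second)
```

You would then pass this value as the object's UUID when inserting it, instead of letting a random one be generated.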

If you want to do media search, you can use near_image or near_media (video/audio). Provided the vectorizer is properly configured on that collection, Weaviate will vectorize the image used in the query and use that vector to perform the search.
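
A minimal sketch of what that could look like, assuming the v4 Python client, where `collection.query.near_image` accepts a base64-encoded image. The helper names `encode_image` and `search_by_image` are made up for this example, and `collection` is assumed to be something you obtained from your own client (for example via `client.collections.get(...)`).

```python
import base64
from pathlib import Path


def encode_image(path) -> str:
    """Read an image file and return its contents as a base64 string."""
    return base64.b64encode(Path(path).read_bytes()).decode("utf-8")


def search_by_image(collection, image_path, limit=5, filters=None):
    """Run a near_image query and return the properties of the matches.

    `collection` is assumed to be a Weaviate v4 collection whose
    vectorizer can embed images (e.g. multi2vec-clip).
    """
    response = collection.query.near_image(
        near_image=encode_image(image_path),  # Weaviate vectorizes this server-side
        limit=limit,
        filters=filters,  # optional, e.g. a filter on a non-vectorized property
    )
    return [obj.properties for obj in response.objects]
```

For the price filtering mentioned in the question, `filters` could be built with the client's filter classes, for example `Filter.by_property("price").less_than(100)` from `weaviate.classes.query`, assuming the property is named `price`.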

Here we have a recipe for multimodal search. It uses multi2vec-bind, but it should work the same way with CLIP:

Let me know if that helps!

Thanks!

Hi @DudaNogueira,

that answers all my questions. Thank you for the quick response!

Glad to hear that!

By the way, here are the docs for deterministic IDs:

If you have any questions, we are here to help!