Suggestion for an image caption similarity exercise

Dear experts,
I am just beginning to explore this fascinating world, so please bear with my naivety.
I have around 200.00 image captions from a newspaper and want to test if I can retrieve similar captions given one of them. The captions are in Italian. I do have an OpenAI key but would vastly prefer to find embeddings from free/open models.

As a first try, I successfully started a Weaviate container on Linux, but can you suggest which configuration I should aim for?

Thank you very much.

Hi @rjalex !

Do you have the images only, the text only, or both?

If mixing modalities (text + image) you can check this cool workshop we did last year:

Also, there is this nice project here:

Let me know if that helps :)

I will definitely take a look. For now my case starts with simple text similarity, but multimodality could be cool to explore later.
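
For the text-only case, here is a minimal sketch of what "retrieve similar captions given one of them" could look like outside of any database. It is not from this thread and not a Weaviate configuration: it assumes the open-source `sentence-transformers` package and the multilingual model `paraphrase-multilingual-MiniLM-L12-v2` (which lists Italian among its supported languages) as one possible free embedding source. The function names `cosine_similarity`, `most_similar`, and `embed_captions` are hypothetical helpers made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat it as dissimilar
    return dot / (norm_a * norm_b)


def most_similar(query_index, vectors, k=5):
    """Return up to k (index, score) pairs ranked by similarity to
    vectors[query_index], excluding the query caption itself."""
    query = vectors[query_index]
    scored = [
        (i, cosine_similarity(query, vec))
        for i, vec in enumerate(vectors)
        if i != query_index
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def embed_captions(captions, model_name="paraphrase-multilingual-MiniLM-L12-v2"):
    """Embed captions with an open multilingual model.

    Assumption: the sentence-transformers package is installed; the model
    name is one possible choice, not a recommendation from this thread.
    The import is done lazily so the ranking helpers above work without it.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    return model.encode(captions).tolist()
```

With a list of caption strings called `captions`, `most_similar(0, embed_captions(captions), k=3)` would return the three captions closest to the first one. This brute-force loop is only meant to show the idea on a small sample; for the full collection, a vector database such as Weaviate would take over the storage and nearest-neighbour search.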