Fine-tuning multi2vec-clip with Fashionpedia

Hello, I would like to know whether multi2vec-clip was trained on Fashionpedia. If not, is it possible to fine-tune the model so it can better distinguish between clothes? I was able to run Building Multimodal AI in TypeScript | Weaviate - Vector Database and it works for detecting several garments, but my understanding is that fine-tuning would make it more "powerful".
I also have a few other questions:
What GPU do I need for fine-tuning? Is an RTX 3060M enough?
Is fine-tuning a multimodal model the same as fine-tuning a text-only model like llama3-8B?
If anyone has experience with this, I would really appreciate some advice, or even a prompt I could send to ChatGPT to get an explanation and a guide to get started.
Thanks and greetings :smiley:

hi @Jjen_95 !!

Welcome to our community :hugs:

You can definitely train your own CLIP model and swap it in as the model used by this container image:
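To give a feel for what "training your own CLIP model" optimizes, here is a minimal sketch of the symmetric contrastive (InfoNCE) objective used in CLIP-style fine-tuning. This is an illustration only: real fine-tuning (e.g. with `open_clip` or Hugging Face `transformers`) would produce these embeddings from the image and text encoders on Fashionpedia image–caption pairs; here random vectors stand in for them, and all names are mine, not from any Weaviate code.

```python
import numpy as np

def clip_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric cross-entropy over image-text similarity logits.

    Matching image/caption pairs sit on the diagonal of the
    similarity matrix; the loss pulls those together and pushes
    mismatched pairs apart.
    """
    # L2-normalize so the dot product is cosine similarity
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = image_emb @ text_emb.T / temperature  # (batch, batch)
    labels = np.arange(len(logits))  # correct pair for row i is column i

    def xent(l):
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[labels, labels].mean()

    # average the image->text and text->image directions
    return (xent(logits) + xent(logits.T)) / 2

rng = np.random.default_rng(0)
img = rng.normal(size=(4, 512))  # stand-in image embeddings
txt = rng.normal(size=(4, 512))  # stand-in caption embeddings
print(clip_contrastive_loss(img, txt))
```

Conceptually, fine-tuning a multimodal model like CLIP differs from fine-tuning a text-only LLM like llama3-8B: CLIP training minimizes this contrastive loss over paired images and captions, while LLM fine-tuning minimizes next-token prediction loss over text alone.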

According to that repo's docs, this is the model used by default:
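For reference, the pattern Weaviate's docs describe for baking a different (e.g. fine-tuned) model into the inference container is a small custom Dockerfile. The model names below are placeholders, not recommendations; point them at your own fine-tuned sentence-transformers-compatible model:

```dockerfile
FROM semitechnologies/multi2vec-clip:custom
# Placeholder names - replace with your fine-tuned model
ENV CLIP_MODEL_NAME clip-ViT-B-32
ENV TEXT_MODEL_NAME clip-ViT-B-32
RUN ./download.py
```

You then build this image and reference it in your Weaviate setup in place of the stock multi2vec-clip image.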

Let me know if this helps!