Q about using multi2vec-clip and wanting ViT-L-14/laion2b_s32b_b82k

Context: we are in the “bring your own vector” scenario, where we populate the collections with embeddings we generate ourselves from OpenCLIP’s ViT-L-14.

Q:

  1. Does multi2vec-clip support ViT-L-14? The docs mention the image cr.weaviate.io/semitechnologies/multi2vec-clip:sentence-transformers-clip-ViT-B-32-multilingual-v1; where can I get a complete list of supported CLIP models? And if the one I want is not available, what are my options?
  2. If that particular ViT-L-14 OpenCLIP model is supported, can I configure Weaviate to use it only for embedding the query image/text, but not during collection import? My guess is yes, based on what I have read so far.

This will help us decide whether we should let Weaviate perform the query image/text embedding, versus our prototype where a separate server does the embeddings and we provide the query vector as part of the Weaviate query. It will also give us an idea of whether we can lower the operational cost: we expect the real-time cost of embedding users’ submitted images/text to far exceed the cost of embedding the imported data.
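
For reference, this is roughly what the prototype flow looks like. A minimal sketch, assuming the Python v4 client, a local Weaviate instance, and a collection named "Items"; all names here are illustrative only:

```python
# Sketch of the prototype flow: a separate process embeds the query with
# OpenCLIP ViT-L-14 and we pass the resulting vector to Weaviate ourselves.
# The collection name "Items" and connect_to_local() are assumptions for illustration.
import open_clip
import torch
import weaviate

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-L-14", pretrained="laion2b_s32b_b82k"
)
tokenizer = open_clip.get_tokenizer("ViT-L-14")

def embed_text(query: str) -> list[float]:
    """Embed a text query with the same model used to embed the imported data."""
    with torch.no_grad():
        vec = model.encode_text(tokenizer([query]))
        vec = vec / vec.norm(dim=-1, keepdim=True)  # unit-normalize, as usual for CLIP retrieval
    return vec[0].tolist()

client = weaviate.connect_to_local()
items = client.collections.get("Items")

# We compute the query vector ourselves and hand it to Weaviate as a nearVector search.
result = items.query.near_vector(near_vector=embed_text("a red bicycle"), limit=5)
for obj in result.objects:
    print(obj.properties)
client.close()
```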

hi @00.lope.naughts !!

Not sure I totally understood your question/scenario here, but some points:

  • you can bring your own vectors (i.e. provide them while importing) while also keeping a vectorizer configured. If you do not provide a vector for an object, Weaviate will vectorize the content for you. Of course, the vectors all need to come from the same model, otherwise query vectors and imported vectors will not be comparable. See the first sketch after this list.

  • If you can run a model, you can probably use it with Weaviate. If the current integrations do not support your CLIP model, or the model is proprietary, you can always host your own multi2vec-clip inference service and point the module to it. See the second sketch after this list.
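
Here is a minimal sketch of the first point with the Python v4 client: the collection keeps a multi2vec-clip vectorizer configured, objects imported with an explicit vector skip the module, and a near_text query (with no vector supplied) is embedded by the module. Collection and property names are made up, and the 768-dim placeholder stands in for a real ViT-L-14 embedding:

```python
# Sketch: keep a multi2vec-clip vectorizer configured while importing precomputed vectors.
import weaviate
from weaviate.classes.config import Configure, DataType, Property

client = weaviate.connect_to_local()

client.collections.create(
    "Items",  # illustrative name
    vectorizer_config=Configure.Vectorizer.multi2vec_clip(
        image_fields=["image"],
        text_fields=["description"],
    ),
    properties=[
        Property(name="description", data_type=DataType.TEXT),
        Property(name="image", data_type=DataType.BLOB),
    ],
)

items = client.collections.get("Items")

# Providing a vector at import time means the module is not called for this object.
items.data.insert(
    properties={"description": "a red bicycle"},
    vector=[0.0] * 768,  # placeholder; use the real ViT-L-14 embedding here
)

# No vector supplied here, so Weaviate uses the configured module to embed the query.
result = items.query.near_text(query="red bike", limit=5)
print(result.objects)
client.close()
```

The caveat from the first bullet applies: if the module runs a different model (e.g. ViT-B-32) than the one used for the imported vectors, query and object vectors live in different embedding spaces and the results will not be meaningful.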

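And a hedged sketch of the second point as a docker-compose snippet. The weaviate service settings (ENABLE_MODULES, DEFAULT_VECTORIZER_MODULE, CLIP_INFERENCE_API) are the standard way to enable the clip module and point it at an inference endpoint; the my-clip-inference image is hypothetical and stands for whatever container you build to serve ViT-L-14/laion2b_s32b_b82k behind the same HTTP interface as the stock multi2vec-clip inference containers:

```yaml
# Sketch only: point Weaviate's multi2vec-clip module at a self-hosted inference service.
services:
  weaviate:
    image: cr.weaviate.io/semitechnologies/weaviate:1.25.0
    ports:
      - "8080:8080"
    environment:
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: "true"
      ENABLE_MODULES: multi2vec-clip
      DEFAULT_VECTORIZER_MODULE: multi2vec-clip
      CLIP_INFERENCE_API: http://clip-inference:8080
  clip-inference:
    image: my-clip-inference:latest   # hypothetical image serving your OpenCLIP model
```
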
Let me know if this helps :slight_smile: