Text search and multiple embeddings

Hi,

I have an app that does multimodal searches using CLIP. The nearObject and hybrid searches work fine for that. In the future I’d also like to support text searches, which means having a separate vector for pure text embeddings. I tested this and the results were pretty bad (for example, a search for “apple laptop” returns apple juice even though I have a lot of MacBooks embedded in the DB). I’m also wondering whether a separate vector embedding can be added later on: I’d like to do the CLIP embeddings now and add the text embeddings at some later point. Is that possible?

Hi @vrano, I hope you’re having a great week! :hugs:

I completely understand your need to achieve good results both ways. Based on your message, I believe you could benefit from multi-vector capabilities.

Have you seen this feature in Weaviate?

Additionally, check out this guide for searching with multiple vectors:

Hi @Mohamed_Shahin,

Can we set other vectors after creating the schema? I’d like to set it up with one vector now and add other vectors later. That is, right now I’d like to embed image and title, and possibly create an embedding for the title only later down the line.

Could you also give me some guidance on the text search being very inaccurate? I embedded my product titles using text2vec-cohere, but when searching I got pretty bad results, as mentioned in my post. I made sure to specify the target vector when using the nearText search.

Hi @vrano,

Sure thing! I’m always happy to help! :hugs:

For your first point, yes, you can add properties and indexes to a collection after it’s created, but there’s an important aspect to keep in mind:

  • If you add a new property before importing data, there’s no impact on indexing.
  • However, if you add a property after importing data, the indexing of existing objects won’t be updated automatically. Pre-existing objects won’t be indexed with the new property, and queries might return unexpected results because the index only includes new objects that have the added property.
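To make the second bullet concrete, here is a minimal sketch of the behaviour using a toy in-memory index. It is not Weaviate code and not how Weaviate is implemented; `ToyCollection` and its methods are hypothetical names made up for illustration, and the product data is invented.

```python
from collections import defaultdict


class ToyCollection:
    """Toy in-memory collection with a per-property token index.

    Hypothetical illustration only; this is NOT Weaviate's implementation.
    """

    def __init__(self, indexed_properties):
        self.indexed_properties = set(indexed_properties)
        self.objects = {}
        # index[property][token] -> set of object ids
        self.index = defaultdict(lambda: defaultdict(set))

    def insert(self, obj_id, props):
        self.objects[obj_id] = dict(props)
        # Only properties known at insert time are indexed.
        for name, value in props.items():
            if name in self.indexed_properties:
                for token in str(value).lower().split():
                    self.index[name][token].add(obj_id)

    def add_property(self, name):
        # Existing objects are NOT revisited; only later inserts are indexed.
        self.indexed_properties.add(name)

    def search(self, prop, token):
        return sorted(self.index[prop].get(token.lower(), set()))


products = ToyCollection(indexed_properties=["title"])
products.insert("a", {"title": "Apple MacBook Pro", "brand": "Apple"})

products.add_property("brand")  # added after data was imported
products.insert("b", {"title": "Apple juice", "brand": "Apple"})

by_title = products.search("title", "apple")  # both objects are indexed by title
by_brand = products.search("brand", "apple")  # object "a" is missing here
```

In this toy model the `brand` search only finds the object inserted after the property was added, which is the kind of unexpected result described above.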

For the second part, please share the query you’re running, the result you’re seeing, and what you’re expecting. That way, I can take a closer look and provide some guidance.

I think at the moment you need to create all named vectors at the beginning.
If you supply the vectors yourself (i.e. don’t use the Weaviate vectorizer feature), you could define all the named vectors you might need in the future, provide only the CLIP ones for now, and add the other embeddings later.
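As a rough sketch of that workflow, the toy store below declares every vector name up front, fills only the CLIP slot at import time, and adds the text vectors later. It is plain Python rather than the Weaviate client; `ToyNamedVectorStore`, the vector names `clip` and `title_text`, and the tiny vectors are all assumptions made up for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyNamedVectorStore:
    """Hypothetical stand-in for a collection with named vectors."""

    def __init__(self, vector_names):
        # All vector names are fixed when the store is created.
        self.vector_names = tuple(vector_names)
        self.vectors = {}  # object id -> {vector name -> list of floats}

    def _check(self, name):
        if name not in self.vector_names:
            raise KeyError(f"unknown named vector: {name!r}")

    def insert(self, obj_id, vectors):
        # Any subset of the declared vectors may be supplied.
        for name in vectors:
            self._check(name)
        self.vectors[obj_id] = {k: list(v) for k, v in vectors.items()}

    def set_vector(self, obj_id, name, vector):
        # Fill in a declared-but-empty vector slot later on.
        self._check(name)
        self.vectors[obj_id][name] = list(vector)

    def near_vector(self, query, target_vector, limit=3):
        # Objects without the target vector cannot be matched.
        self._check(target_vector)
        scored = [
            (cosine(query, vecs[target_vector]), obj_id)
            for obj_id, vecs in self.vectors.items()
            if target_vector in vecs
        ]
        scored.sort(reverse=True)
        return [obj_id for _, obj_id in scored[:limit]]


store = ToyNamedVectorStore(["clip", "title_text"])

# Now: only the CLIP vectors are provided (made-up 2-d values).
store.insert("macbook", {"clip": [0.9, 0.1]})
store.insert("juice", {"clip": [0.1, 0.9]})
empty = store.near_vector([1.0, 0.0], "title_text")  # no text vectors yet

# Later: add the text embeddings for the same objects (made-up 3-d values).
store.set_vector("macbook", "title_text", [0.8, 0.2, 0.1])
store.set_vector("juice", "title_text", [0.1, 0.3, 0.9])
hits = store.near_vector([1.0, 0.0, 0.0], "title_text", limit=1)
```

The point of the design is that the name `title_text` exists from the start, so nothing about the store's definition has to change when the text embeddings finally arrive; a name that was never declared is rejected.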

We will add support for adding named vectors after the initial creation at some point, but I don’t think it is on the roadmap yet.