Hi fellow Weaviaters, I want to benchmark around 50,000 text objects with unique IDs, each embedded with around 10 different embedding models.
So I will be looping over my text list and manually building a vector_MODELNAME for each of the models I want to evaluate.
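To make that loop concrete, here is a rough sketch of what I have in mind. The model names and `toy_embedder` are placeholders I made up for illustration (the real benchmark would call the actual embedding models), and the `vector_<MODELNAME>` keys just follow the naming above.

```python
from typing import Callable, Dict, List

# An embedder turns a text into a vector.
Embedder = Callable[[str], List[float]]


def toy_embedder(dim: int) -> Embedder:
    """Hypothetical stand-in for a real embedding model (deterministic, not semantic)."""
    def embed(text: str) -> List[float]:
        # Fold character codes into `dim` buckets; only here so the sketch runs.
        vec = [0.0] * dim
        for i, ch in enumerate(text):
            vec[i % dim] += ord(ch)
        return vec
    return embed


# Hypothetical model names; the real benchmark would have around 10 of them.
EMBEDDERS: Dict[str, Embedder] = {
    "model_a": toy_embedder(4),
    "model_b": toy_embedder(8),
}


def build_vectors(texts: Dict[str, str]) -> Dict[str, Dict[str, List[float]]]:
    """Map each text ID to {"vector_<MODELNAME>": embedding} for every model."""
    return {
        text_id: {f"vector_{name}": embed(text) for name, embed in EMBEDDERS.items()}
        for text_id, text in texts.items()
    }


vectors = build_vectors({"id-1": "hello world", "id-2": "named vectors"})
```

So every text ID ends up with one vector per model, and what I am unsure about is how best to get that structure into Weaviate.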
How would you efficiently store this in Weaviate for this benchmarking case?
I can think of having one collection for each vectorizer.
I am aware of named vectors, but since I am not able to use any of the vectorizer modules such as text2vec-transformers, I am not quite sure how to define and use named vectors with application-provided vectors/embeddings.
I will then build a number of text queries, each with a list of IDs of the expected matching phrases as its ground truth.
Finally, I will go through my list of queries, vectorize each query with each vectorizer, retrieve the nearest phrases using the corresponding vectors, and then compare the results to the ground truth.
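Roughly, the evaluation loop I am picturing looks like the sketch below. I have not settled on a metric, so recall@k is just an assumption here, and the `retrievers` are hypothetical callables standing in for "vectorize the query with model X and search the matching vectors in Weaviate".

```python
from typing import Callable, Dict, List, Set

# A retriever takes a query text and k, and returns the top-k phrase IDs.
Retriever = Callable[[str, int], List[str]]


def recall_at_k(retrieved: List[str], expected: Set[str]) -> float:
    """Fraction of the expected IDs that appear in the retrieved list."""
    if not expected:
        return 0.0
    return len(set(retrieved) & expected) / len(expected)


def evaluate(
    retrievers: Dict[str, Retriever],
    ground_truth: Dict[str, Set[str]],
    k: int = 10,
) -> Dict[str, float]:
    """Mean recall@k per model over all queries in the ground truth."""
    scores: Dict[str, float] = {}
    for name, retrieve in retrievers.items():
        per_query = [
            recall_at_k(retrieve(query, k), expected)
            for query, expected in ground_truth.items()
        ]
        scores[name] = sum(per_query) / len(per_query) if per_query else 0.0
    return scores


# Made-up queries, IDs, and canned results, only to show the shape of the data.
ground_truth = {"query one": {"id-1", "id-2"}, "query two": {"id-3"}}
retrievers = {
    "model_a": lambda query, k: ["id-1", "id-2", "id-3"][:k],
    "model_b": lambda query, k: ["id-3", "id-9"][:k],
}
scores = evaluate(retrievers, ground_truth, k=3)
```

In the real run each retriever would wrap the actual query embedding plus the Weaviate search for that model, so the storage layout mainly determines how simple those wrappers can be.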
I would love it if you could suggest the best approach for this task.
Thank you