Collection size comparison: different vector embedding methods

For fuzzy name search, I created a collection in Weaviate.
• Used NLP to generate vector embeddings, but the accuracy was not satisfactory.
• Used FastText to generate vectors; accuracy was a lot better, but the number of elements was too high.
• Wrote a custom vector embedding myself (it suits only this type of use: nouns, person names, places, skill sets, etc.).
• With the custom embedding, the vector size is limited and the results are accurate.
• Need assistance comparing the size of the collections, the indexes, and the performance.
• With NLP, inserting 100K records took nearly 10 minutes; with the custom embedding it took only 80 seconds.
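
As a rough baseline for the comparison, the timings above work out to about 167 records per second for NLP (taking "nearly 10 minutes" as 600 seconds) and 1,250 records per second for the custom embedding. The sketch below computes that, plus a back-of-the-envelope estimate of raw vector storage. It assumes vectors are stored as 4-byte floats and ignores index and object overhead. The dimension counts (300 and 32) are hypothetical placeholders, not measured values from these collections.

```python
def insert_throughput(records: int, seconds: float) -> float:
    """Return the average number of records inserted per second."""
    return records / seconds


def raw_vector_bytes(records: int, dimensions: int, bytes_per_value: int = 4) -> int:
    """Estimate raw vector storage only (no index or object overhead).

    Assumes each vector element is stored as a 4-byte float by default.
    """
    return records * dimensions * bytes_per_value


if __name__ == "__main__":
    records = 100_000

    # Timings from the post: ~10 minutes (taken as 600 s) vs 80 s.
    nlp_rate = insert_throughput(records, 600)
    custom_rate = insert_throughput(records, 80)
    print(f"NLP embedding:    {nlp_rate:.0f} records/s")
    print(f"Custom embedding: {custom_rate:.0f} records/s")
    print(f"Speed-up:         {custom_rate / nlp_rate:.1f}x")

    # Hypothetical dimension counts, for illustration only.
    for label, dims in [("large vectors (assumed 300 dims)", 300),
                        ("small vectors (assumed 32 dims)", 32)]:
        mib = raw_vector_bytes(records, dims) / (1024 ** 2)
        print(f"{label}: ~{mib:.1f} MiB of raw vector data")
```

Replacing the placeholder dimensions with the real vector lengths of each collection gives a lower bound on the storage difference; actual disk and memory use will be higher once the vector index and stored properties are included.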

How can we compare two collections in Weaviate that have the same schema but different vector embeddings, in terms of size and performance? Please share your understanding.


Bala Ganesan.

Hi @Balachandar_Ganesan!

This is a complex topic :slight_smile:

Have you seen this repo?