When creating a collection, what happens if I don't specify vector_index_config? Is the default distance_metric "Cosine"?
Is it necessary for me to specify vector_index_config if I am using text2vec_azure_openai and letting Weaviate create the embeddings itself with the credentials I give it? Won't it use the cosine distance metric by default when I query the collection?
I got the snippet below from the docs:
from weaviate.classes.config import Configure, VectorDistances, VectorFilterStrategy

# `client` is an already-connected Weaviate client
client.collections.create(
    "Article",
    # Additional configuration not shown
    vector_index_config=Configure.VectorIndex.hnsw(
        quantizer=Configure.VectorIndex.Quantizer.bq(),
        ef_construction=300,
        distance_metric=VectorDistances.COSINE,
        filter_strategy=VectorFilterStrategy.SWEEPING,  # or ACORN (available from Weaviate v1.27.0)
    ),
)
My next question is: are the embeddings created by Weaviate normalized?
Thanks.
Hi @Rishi_Prakash!
If you do not provide vector_index_config, Weaviate applies the default values.
For example, when you create a collection like this:
collection = client.collections.create("DefaultCollection")
print(collection.config.get().vector_index_config.to_dict())
{'cleanupIntervalSeconds': 300, 'distanceMetric': 'cosine', 'dynamicEfMin': 100, 'dynamicEfMax': 500, 'dynamicEfFactor': 8, 'ef': -1, 'efConstruction': 128, 'filterStrategy': 'sweeping', 'flatSearchCutoff': 40000, 'maxConnections': 32, 'skip': False, 'vectorCacheMaxObjects': 1000000000000}
The created embeddings are stored as is.
The vector_index_config settings change how distances are calculated, how the search is performed, and how the index is built.
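On the normalization question, it may help to see that cosine distance depends only on the angle between two vectors, so it gives the same result whether or not the vectors are normalized. Here is a minimal sketch in plain Python. The helper names cosine_distance and normalize are made up for illustration and are not part of the Weaviate client:

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

a = [3.0, 4.0]
b = [4.0, 3.0]

raw = cosine_distance(a, b)                          # vectors as given
unit = cosine_distance(normalize(a), normalize(b))   # unit-length versions

print(math.isclose(raw, unit))  # True
```

This property is specific to cosine; with a metric such as dot product or L2, vector length does affect the result.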
Now, if you compress those vectors (for example, by configuring a quantizer), Weaviate keeps both the original and the compressed vectors on disk, but only the compressed vectors in memory.
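To give a rough idea of what compression means for the bq (binary quantization) quantizer in the snippet from the docs, here is a simplified sketch in plain Python. It only illustrates the general idea of keeping one bit per dimension and comparing bit vectors; it is not Weaviate's actual implementation, and the function names and sample values are made up for illustration:

```python
def binary_quantize(vector):
    """Keep one bit per dimension: 1 if the value is positive, else 0."""
    return [1 if x > 0 else 0 for x in vector]

def hamming_distance(bits_a, bits_b):
    """Count the positions where two bit vectors differ."""
    return sum(x != y for x, y in zip(bits_a, bits_b))

original = [0.12, -0.54, 0.33, -0.01]   # made-up embedding
query = [0.10, 0.40, 0.25, -0.20]       # made-up query vector

compressed = binary_quantize(original)
print(compressed)                                            # [1, 0, 1, 0]
print(hamming_distance(compressed, binary_quantize(query)))  # 1
```

The compressed form is much smaller than the original floats, which is why it is cheap to hold in memory while the original vectors stay on disk.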
Let me know if this helps!
Thanks!