Suppose I have a product schema. Will similarity search work better if I store the product attributes (productprice, productdescription, etc.) as separate columns in the schema, or if I keep just two columns: productName, and productDetails holding a summary of all product attributes, e.g. prodPrice-3,prodDesc-test,prodType-new?
Will similarity search work the same way in both cases and give the same output?
Also, is there a way to get unique objects from a similarity search? E.g. the product schema has productName and productDetails columns, and multiple objects in it share the same productName, but I want only unique productNames in the similarity search output.
Can someone please assist with my query?
Thanks in advance.
Hi @Sriparna (I’ve moved this to support from general)
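Similarity searches are based on the vector of each object, which is in turn derived from the text properties you provide. So the two layouts will generally produce different vectors, and the search results may not be identical.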
In general terms, the best way to think about it is that the model assesses how similar the ‘meanings’ of the texts are. This might be useful as a general guide: Vector Embeddings Explained | Weaviate - vector database.
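To make the point about vectors concrete, here is a minimal sketch in plain Python. It assumes a toy bag-of-words "embedding" (the hypothetical `toy_embed` helper), which is only a stand-in for a real embedding model and is not what Weaviate uses. The product text and the query are made up for illustration. The only thing it is meant to show is that the two storage layouts feed different text to the model, so the vectors and similarity scores differ.

```python
import math
import re
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts.

    A real model captures meaning; this only counts tokens.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Layout A (hypothetical): attribute values kept as separate text properties,
# so only the values themselves end up in the text that is embedded.
layout_a = "Red running shoes lightweight mesh upper new"

# Layout B (hypothetical): productName plus one key-value summary string.
layout_b = "Red running shoes prodPrice-3,prodDesc-lightweight mesh upper,prodType-new"

query = toy_embed("lightweight shoes")
score_a = cosine(query, toy_embed(layout_a))
score_b = cosine(query, toy_embed(layout_b))

print(f"layout A score: {score_a:.3f}")
print(f"layout B score: {score_b:.3f}")
```

In this toy setup the extra key tokens in the summary string change the vector and therefore the score. A real model will behave differently in detail, so it is worth testing both layouts against your own data and queries.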
As for getting unique objects, you can group objects by a property, so you could potentially use that to get only one object per product name, for example.
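As a rough illustration of what grouping by a property gives you, here is a client-side sketch in plain Python. It is not Weaviate's group-by feature itself: the `unique_by_property` helper, the shape of the hit dictionaries, and the `distance` field are all assumptions made for this example. It keeps only the closest hit for each distinct productName.

```python
from typing import Any, Dict, List


def unique_by_property(hits: List[Dict[str, Any]], prop: str) -> List[Dict[str, Any]]:
    """Keep the first hit seen for each distinct value of `prop`.

    Assumes `hits` is already sorted by ascending distance (best match first),
    so the first hit per value is also the closest one.
    """
    seen = set()
    unique = []
    for hit in hits:
        key = hit[prop]
        if key not in seen:
            seen.add(key)
            unique.append(hit)
    return unique


# Hypothetical search results, sorted by distance (smaller = more similar).
hits = [
    {"productName": "Running shoes", "productDetails": "red, mesh upper", "distance": 0.12},
    {"productName": "Running shoes", "productDetails": "blue, mesh upper", "distance": 0.15},
    {"productName": "Trail boots", "productDetails": "waterproof", "distance": 0.31},
    {"productName": "Running shoes", "productDetails": "black, leather", "distance": 0.40},
]

unique_hits = unique_by_property(hits, "productName")
for hit in unique_hits:
    print(hit["productName"], hit["distance"])
```

Doing the grouping inside the database is usually preferable, since the result limit then applies to groups rather than to raw objects; the sketch above only shows the intended outcome.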