Description
In the docs, there’s a note that says vectors are normalized for cosine similarity, and then we use dot product.
Wouldn’t that mean that the cosine distance ends up being -1 <= distance <= 1? Or do we do 1 - dot(a,b) in the code?
I’m asking because we found negative distances when using cosine and returning them in the metadata.
Hi @Guillermo_Ripa!
That’s an interesting question.
I will need to ask internally for more context on this.
I’ll get back with more info. Thanks!
The cosine distance should never be negative. It is defined as 1 - dot(a,b) in the code.
The tests also show the expected distance measures. For example, opposing vectors lead to a cosine distance of 2.
Could you give us an example where there was a negative cosine distance calculated?
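To make the definition above concrete, here is a minimal Python sketch. It is not the actual Weaviate code; the helper names `normalize` and `cosine_distance` are made up for illustration, and it only assumes what is stated in this thread: vectors are normalized first, and the distance is 1 - dot(a,b).

```python
import math

def normalize(v):
    # Scale the vector to unit length (hypothetical helper, not library code).
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_distance(a, b):
    # Distance as described in this thread: 1 - dot(a, b) on normalized vectors.
    a_n, b_n = normalize(a), normalize(b)
    return 1.0 - sum(x * y for x, y in zip(a_n, b_n))

same = cosine_distance([1.0, 0.0], [2.0, 0.0])        # same direction
orthogonal = cosine_distance([1.0, 0.0], [0.0, 3.0])  # right angle
opposite = cosine_distance([1.0, 0.0], [-1.0, 0.0])   # opposing vectors

print(same, orthogonal, opposite)
```

Since the dot product of two unit vectors lies between -1 and 1, the distance mathematically lies between 0 and 2, with opposing vectors at 2.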
Thanks @DudaNogueira and @andrewisplinghoff for the code snippets! That’s really helpful.
The distance was -1e-5, so I brushed it off as a floating point error. But it got me looking into the docs.
The code snippet and the unit test put my doubts to rest. Thank you!
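For anyone who runs into the same tiny negative value: a small sketch of how rounding can produce it, and one way to handle it on the client side. The dot product value below is hypothetical (chosen to match the -1e-5 mentioned above), and `clamp_distance` is an illustrative helper, not something the database is known to do.

```python
def clamp_distance(d, lo=0.0, hi=2.0):
    # Force a cosine distance back into its mathematical range [0, 2].
    return max(lo, min(hi, d))

# Hypothetical: rounding during normalization or accumulation leaves the
# dot product of two near-identical unit vectors slightly above 1.
dot = 1.00001
raw = 1.0 - dot          # slightly negative, roughly -1e-5
clamped = clamp_distance(raw)

print(raw, clamped)
```

Clamping is only cosmetic here. A value this close to zero already means the two vectors are effectively identical.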
Great!! Thanks for jumping in, @andrewisplinghoff!
That was the code that our team pointed me to.
Thanks!!