Why is the distance score not equal to 0 (searching exactly the same words)?

I created a simple sample collection and added data to it.

When I query it with semantic search using exactly the same words, why is the distance score (‘A’) in the result not equal to 0?

Am I missing something? Is this the correct behavior?

Thank You.

Hi! @Khorppun_Sontipanya !!

Welcome to our community!!

This is the correct behavior.

When your query (“A”) gets vectorized, it will not produce exactly the same vector as the one stored for the object, so some difference in the distance calculation is expected.

Take this example:

import weaviate
from weaviate.util import generate_uuid5

client = weaviate.connect_to_local()

# Start from a clean state, then get a handle to the collection.
# Note: this relies on auto-schema to create "Test" on the first insert.
client.collections.delete("Test")
collection = client.collections.get("Test")
# Deterministic UUIDs so the objects can be looked up again later.
collection.data.insert({"text": "example a"}, uuid=generate_uuid5("example a"))
collection.data.insert({"text": "example b"}, uuid=generate_uuid5("example b"))
collection.data.insert({"text": "example c"}, uuid=generate_uuid5("example c"))

Now if I search with nearText:

from weaviate.classes.query import MetadataQuery
for obj in collection.query.near_text("example a", return_metadata=MetadataQuery(distance=True)).objects:
    print(obj.properties)
    print(obj.metadata.distance)

I will not get a distance of 0 for "example a".

If you search using nearObject or nearVector instead, you can get it:

from weaviate.classes.query import MetadataQuery
for obj in collection.query.near_object(near_object=generate_uuid5("example a"), return_metadata=MetadataQuery(distance=True)).objects:
    print(obj.properties)
    print(obj.metadata.distance)

This will output:

{'text': 'example a'}
0.0
{'text': 'example b'}
0.05546557903289795
{'text': 'example c'}
0.08070778846740723
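
To make the difference concrete, here is a minimal sketch in plain Python, assuming the collection uses cosine distance (the Weaviate default). The vectors are made-up numbers, not real embeddings, and cosine_distance is just a helper written for this illustration.

```python
import math

def cosine_distance(a, b):
    """Cosine distance: 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

# Made-up vector standing in for what is stored for the object.
stored = [0.12, 0.80, 0.35]

# near_object reuses the stored vector, so both sides are identical.
same_vector = list(stored)

# near_text embeds the query again; assume the result is slightly different.
query_vector = [0.15, 0.78, 0.33]

print(cosine_distance(stored, same_vector))   # zero, up to float rounding
print(cosine_distance(stored, query_vector))  # small but non-zero
```

The first value is 0 (up to floating point rounding) because the two vectors are identical, which is the near_object case. The second is small but non-zero, which is the near_text case when the query ends up as a slightly different vector.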

Let me know if this helps!

Thanks!

Thank you very much for the answer. But when I follow your example, why is there no output?

IIRC Weaviate also adds the collection name to the text that gets vectorized. Try it again with:

Configure.Vectorizer.text2vec_openai(
    ...., vectorize_collection_name=False
)
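
As a rough illustration of why that setting matters, the sketch below mimics how the text sent to the vectorizer could be assembled. build_text_to_vectorize is a hypothetical helper written for this post, not Weaviate's actual code, and the exact formatting Weaviate uses may differ.

```python
def build_text_to_vectorize(collection_name, properties, vectorize_collection_name=True):
    """Hypothetical helper: approximate the text that gets embedded for an object."""
    parts = []
    if vectorize_collection_name:
        # Assumption: the collection name is lowercased and prepended.
        parts.append(collection_name.lower())
    # Assumption: property values are joined in alphabetical order of property name.
    for name in sorted(properties):
        parts.append(str(properties[name]).lower())
    return " ".join(parts)

query = "example a"
with_name = build_text_to_vectorize("Test", {"text": "example a"})
without_name = build_text_to_vectorize(
    "Test", {"text": "example a"}, vectorize_collection_name=False
)

print(repr(with_name))        # 'test example a'
print(with_name == query)     # False: the stored text differs from the query
print(without_name == query)  # True: same text goes to the vectorizer
```

If the stored object is embedded from "test example a" while the query is embedded from "example a", the two vectors differ and the distance cannot be 0. With the collection name left out, both sides send the same text to the vectorizer.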