Hello all,
I'm using client v4, and I have added some documents as I usually do, with no trouble.
This time around I added a 'release_date' property to my documents (I'm using Langchain, so this is the Document metadata, which I believe becomes the object property here).
I need to execute a similarity search using near_text, but I also want to rank or sort my results by the release_date property so I get the latest results first.
It seems the sorting API can only be used with fetch_objects, and not with near_text or hybrid. So it's not possible to search my db and sort the results in the same query.
Doing something like:
collection.query.near_text(
    query="test query",
    limit=25,
    return_metadata=MetadataQuery(distance=True),
    sort=Sort.by_property(name='release_date', ascending=False)
)
This results in an error:
_NearTextQueryAsync.near_text() got an unexpected argument 'sort'
Is it not possible to search with sorting?
Thanks
Good morning @joe-barhouch,
Welcome to our community—it’s great to have you here!
Sorting isn't directly available with vector searches, but you can use the rerank feature to "sort" the results, in a sense. I'm also looking into improving the documentation to explain how sorting works, so that it doesn't cause confusion.
It’s important to note that rerank works after the initial search, meaning it doesn’t sort the entire dataset but instead reorders the search results based on relevance. However, it should still provide a reasonable approximation for sorting by attributes.
Regards,
Mohamed Shahin
Weaviate Support
Rerank might be overkill for sorting by date.
Maybe you could sort the result objects in Python, like this:
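from weaviate.classes.query import MetadataQuery

# `collection` is the same collection object as in the question
result = collection.query.near_text(
    query="test query",
    limit=25,
    return_metadata=MetadataQuery(distance=True),
)

# Reorder only the objects returned by the search, newest release_date first
sorted_results = sorted(result.objects, key=lambda x: x.properties["release_date"], reverse=True)

for item in sorted_results:
    print(item.properties)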
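If some objects are missing release_date, or the value comes back as an ISO string rather than a datetime (I'm assuming that can happen when the metadata is written through Langchain), the plain lambda above would raise an error. Here is a minimal sketch of a more defensive sort key. The helper names `release_date_key` and `sort_newest_first` are made up for illustration, and the `SimpleNamespace` objects only stand in for `result.objects` so the snippet runs without a Weaviate instance:

```python
from datetime import datetime, timezone
from types import SimpleNamespace


def release_date_key(obj):
    """Return a timezone-aware datetime to sort on (hypothetical helper)."""
    value = obj.properties.get("release_date")
    if value is None:
        # Objects without a date get the earliest possible value,
        # so they end up last when sorting newest first.
        return datetime.min.replace(tzinfo=timezone.utc)
    if isinstance(value, str):
        # Assumption: string dates are ISO 8601, possibly with a trailing "Z".
        value = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if value.tzinfo is None:
        # Treat naive datetimes as UTC so all keys are comparable.
        value = value.replace(tzinfo=timezone.utc)
    return value


def sort_newest_first(objects):
    """Reorder already-retrieved search results by release_date, newest first."""
    return sorted(objects, key=release_date_key, reverse=True)


# Stand-ins for result.objects (illustrative data only)
fake_results = [
    SimpleNamespace(properties={"title": "a", "release_date": "2023-05-01T00:00:00Z"}),
    SimpleNamespace(properties={"title": "b", "release_date": datetime(2024, 1, 15, tzinfo=timezone.utc)}),
    SimpleNamespace(properties={"title": "c"}),
]

ordered = [o.properties["title"] for o in sort_newest_first(fake_results)]
print(ordered)
```

With real results you would call `sort_newest_first(result.objects)`. Keep in mind this only reorders the objects that the search returned (25 here), so a larger `limit` may be needed if recent documents should not be cut off by the distance ranking.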
Thank you @Mohamed_Shahin and @sebawita
I agree with you both. I ended up sorting the results after the search, using something similar to what was shown above.
Thank you for considering updating the documentation. I think it's important to see more relevant examples of what the additional operators do and how they work. The docs are good for now, but it's sometimes difficult to understand exactly what's going on.
Additionally, the Ask Documentation bot you have is helpful, but it mixes details from client v3 and client v4, which produces completely wrong answers. I tried to use it to get some form of reranking working, but the implementation it gave was wrong (that's not specific to this task, though).