Search with group_by: only distance is returned in the metadata

Description

I tried to add the group_by feature to my search code (hybrid, similarity, and keyword), but I got some weird results:

  1. I set return_metadata=MetadataQuery(distance=True, score=True, explain_score=True).
  2. I ran three searches (hybrid, similarity, and keyword) against the same collection with the same query.
  3. Hybrid search results: no score and no explain_score. Only distance is returned.
  4. Similarity search results: no score and no explain_score (expected, since similarity search does not produce them). The distance values seem reasonable.
  5. Keyword search results: no score and no explain_score. All distance values are 0 (that is fine, because keyword search does not produce a distance).
  6. When I remove the group_by parameter, all responses display the metadata correctly.

It seems the group_by feature has a built-in rule: return only the distance, and if there is no distance, return 0.

Here is the relevant part of my code and the results:

....
        # assumes: from weaviate.classes.query import GroupBy, MetadataQuery
        # group_by setting
        groupby_setting = GroupBy(prop="file_id", 
                                  number_of_groups=5, 
                                  objects_per_group=1)

        # hybrid search
        alpha = config.get_setting('search')['revrieval_alpha']
        hybrid_response = self.db_instance.child_collection.query.hybrid(
            query=query_content,
            query_properties=["content"],
            target_vector="content_vector",
            return_metadata=MetadataQuery(distance=True, score=True, explain_score=True),
            limit=config.get_setting("search")["revrieval_num"],
            alpha=alpha,
            group_by=groupby_setting,
            filters=condition_filter)

        # keyword search
        keyword_rsponse = self.db_instance.child_collection.query.bm25(
            query=query_content,
            query_properties=["content"],
            # target_vector="content_vector",
            return_metadata=MetadataQuery(distance=True, score=True, explain_score=True),
            limit=config.get_setting("search")["revrieval_num"],
            # alpha=1-alpha,
            group_by=groupby_setting,
            filters=condition_filter)
        
        # similarity search
        similar_response = self.db_instance.child_collection.query.near_text(
            query=query_content,
            # query_properties=["content"],
            target_vector="content_vector",
            return_metadata=MetadataQuery(distance=True, score=True, explain_score=True),
            limit=config.get_setting("search")["revrieval_num"],
            # alpha=alpha,
            group_by=groupby_setting,
            filters=condition_filter)

        print('hybrid search')
        for key, value in hybrid_response.groups.items():
            print(key)
            print(value)
            print('#################')

        print('keywords search')
        for key, value in keyword_rsponse.groups.items():
            print(key)
            print(value)
            print('#################')

        print('similarity search')
        for key, value in similar_response.groups.items():
            print(key)
            print(value)
            print('#################')

Printed output:

hybrid search
6666
Group(name='6666', min_distance=0.5924214124679565, max_distance=0.5924214124679565, number_of_objects=1, objects=[GroupByObject(uuid=_WeaviateUUIDInt('7a455a5f-8287-428c-b3b6-fade1e4395e2'), metadata=GroupByMetadataReturn(distance=0.5924214124679565), properties={'chunk_id': 28, 'parent_uuid': 'f874a9c4-0f81-489e-bd26-2940f3dea768', 'file_id': '6666', 'user_id': 'tangliuzhao', 'chunk_page_number': -1, 'chunk_type': 'text', 'content': '文件的文件名 “他是个伟大的人。'}, references=None, vector={}, collection='Knowledge_child_collection', belongs_to_group='6666')], rerank_score=0.0)
#################
keywords search
6666
Group(name='6666', min_distance=0.0, max_distance=0.0, number_of_objects=1, objects=[GroupByObject(uuid=_WeaviateUUIDInt('7ee72cda-f72b-43b2-a156-1c4477910fa8'), metadata=GroupByMetadataReturn(distance=0.0), properties={'chunk_type': 'text', 'parent_uuid': 'f6b7e7a8-1c3a-4034-913c-58dff065e138', 'file_id': '6666', 'user_id': 'tangliuzhao', 'chunk_page_number': 0, 'chunk_id': 4, 'content': '文件的文件名 - Slide Page: 1 Huang Nan: A versatile talent with multiple fields of development'}, references=None, vector={}, collection='Knowledge_child_collection', belongs_to_group='6666')], rerank_score=0.0)
#################
similarity search
6666
Group(name='6666', min_distance=0.5924214124679565, max_distance=0.5924214124679565, number_of_objects=1, objects=[GroupByObject(uuid=_WeaviateUUIDInt('7a455a5f-8287-428c-b3b6-fade1e4395e2'), metadata=GroupByMetadataReturn(distance=0.5924214124679565), properties={'chunk_id': 28, 'parent_uuid': 'f874a9c4-0f81-489e-bd26-2940f3dea768', 'file_id': '6666', 'user_id': 'tangliuzhao', 'chunk_page_number': -1, 'chunk_type': 'text', 'content': '文件的文件名 “他是个伟大的人。'}, references=None, vector={}, collection='Knowledge_child_collection', belongs_to_group='6666')], rerank_score=0.0)
#################
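
The output above shows metadata=GroupByMetadataReturn(distance=...) for every grouped object, with no score or explain_score attribute at all. A small helper like the following could make that comparison explicit between the grouped and ungrouped responses. This is only a sketch: populated_fields and the two Fake* classes are hypothetical stand-ins that mimic the shapes seen in the printout, not the client's own classes.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, Dict, Optional


def populated_fields(metadata: Any) -> Dict[str, Any]:
    """Return every public metadata attribute whose value is not None."""
    if is_dataclass(metadata):
        names = [f.name for f in fields(metadata)]
    else:
        names = [n for n in vars(metadata) if not n.startswith("_")]
    return {n: getattr(metadata, n) for n in names if getattr(metadata, n) is not None}


# Hypothetical stand-in for the metadata shape printed for grouped results.
@dataclass
class FakeGroupedMetadata:
    distance: Optional[float] = None


# Hypothetical stand-in for the metadata shape of an ungrouped result.
@dataclass
class FakeUngroupedMetadata:
    distance: Optional[float] = None
    score: Optional[float] = None
    explain_score: Optional[str] = None


grouped = populated_fields(FakeGroupedMetadata(distance=0.59))
ungrouped = populated_fields(FakeUngroupedMetadata(score=0.7, explain_score="..."))
print(grouped)
print(ungrouped)
```

Calling populated_fields on obj.metadata for each returned object, once with group_by and once without, would show which fields disappear when grouping is enabled.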

Now I would like to know how to get a group_by result based on score for hybrid and keyword search, not just based on distance.
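
One possible stopgap, since the metadata is returned correctly without group_by (point 6), is to run the query ungrouped and do the grouping on the client by score. The sketch below is an assumption-laden illustration, not a Weaviate feature: Hit, group_by_score, and score_of are hypothetical names, and the sample scores are made up for the demo.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable, List, Optional


@dataclass
class Hit:
    """Hypothetical stand-in for one ungrouped search result."""
    properties: Dict[str, Any]
    score: Optional[float] = None


def group_by_score(
    hits: Iterable[Any],
    prop: str,
    number_of_groups: int,
    objects_per_group: int,
    score_of: Callable[[Any], Optional[float]] = lambda h: h.score,
) -> Dict[Any, List[Any]]:
    """Group hits by a property, ranking groups and members by descending score."""
    def key(hit: Any) -> float:
        value = score_of(hit)
        return value if value is not None else 0.0  # treat a missing score as 0

    buckets: Dict[Any, List[Any]] = defaultdict(list)
    for hit in hits:
        buckets[hit.properties.get(prop)].append(hit)
    for members in buckets.values():
        members.sort(key=key, reverse=True)
    # Rank groups by their best member, then trim to the requested sizes.
    ranked = sorted(buckets.items(), key=lambda kv: key(kv[1][0]), reverse=True)
    return {name: members[:objects_per_group] for name, members in ranked[:number_of_groups]}


# Made-up sample data for illustration only.
hits = [
    Hit({"file_id": "6666", "chunk_id": 28}, score=0.82),
    Hit({"file_id": "7777", "chunk_id": 3}, score=0.91),
    Hit({"file_id": "6666", "chunk_id": 4}, score=0.40),
]
groups = group_by_score(hits, prop="file_id", number_of_groups=5, objects_per_group=1)
for name, members in groups.items():
    print(name, [m.properties["chunk_id"] for m in members])
```

With the v4 Python client, the ungrouped response exposes response.objects, where each object has .properties and .metadata.score, so passing score_of=lambda o: o.metadata.score should work there. The limit would have to be large enough to cover several files, because grouping happens only over the objects that were actually returned.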

Server Setup Information

  • Weaviate Server Version: 1.31.2
  • Deployment Method: docker
  • Multi Node? Number of Running Nodes: 1 node
  • Client Language and Version: Python client 4.15.2
  • Multitenancy?: No

Any additional Information