A document can no longer be found with BM25 some time after it was imported into Weaviate

Hi, I found this issue while using Weaviate. Searching works normally right after importing, but a few hours later I found that the document could no longer be found using BM25. I'm sure I haven't done anything else with Weaviate. I can still find the document using a filter, and searching by embedding also finds it.

I don't know where the problem is. Can someone help me? Thank you very much.

The definition of the search field is:

Property.builder().name("doc_name_analyzed").dataType(Arrays.asList(DataType.TEXT)).tokenization(Tokenization.WHITESPACE).build(),

The document content:

{"gmt_create":"2023-09-04T07:19:34Z","modify_user_name":"yifeng","tenant_id":"7502","create_user_id":"8752","modify_status":3.0,"create_user_name":"yifeng","doc_name_analyzed":"19file _ partag 的 副本","knowledge_type":3.0,"doc_start_date":"1979-12-31T16:00:00Z","doc_type":"doc","gmt_modified":"2023-09-11T03:11:57Z","doc_id":"424","uuid":"doc-424","url":"doc2bot/docmind/7502/19file_partag的副本_1693811991175.txt","doc_end_date":"1979-12-31T16:00:00Z","doc_name":"19file_partag的副本.txt","doc_status":20.0,"process_status":0.0,"_additional":{"id":"65b79a7a-2708-4eb7-8adc-f84a96c127d6"},"cat_code":"231001001629","cat_path":["-1","231001001629"],"modify_user_id":"8752"}
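
For reference, with whitespace tokenization the value of `doc_name_analyzed` above should be indexed as separate tokens, so a BM25 query for 副本 would be expected to match. A minimal sketch in plain Java of what that split looks like (it does not call Weaviate; the class and method names are made up for illustration, and splitting on runs of whitespace is only an approximation of the tokenizer):

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the Weaviate client: it approximates
// whitespace tokenization by splitting on runs of whitespace and keeping
// each token exactly as written.
public class WhitespaceTokenizationSketch {

    static List<String> tokenize(String text) {
        return Arrays.asList(text.trim().split("\\s+"));
    }

    public static void main(String[] args) {
        // Value of doc_name_analyzed from the document above.
        List<String> tokens = tokenize("19file _ partag 的 副本");
        System.out.println(tokens);
        // The BM25 query term should be one of the indexed tokens.
        System.out.println(tokens.contains("副本"));
    }
}
```

If the query term appears in the token list, the document itself is tokenized as expected and the missing result points to something on the index side rather than to the data.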

This is my test code:

    @Test
    public void testBm25Search() {

        Result<GraphQLResponse> run = client.graphQL()
                .get()
                .withClassName("Bge768_sandbox_doc2bot")
                .withFields(
                        Field.builder().name("doc_id").build()

                )
                .withBm25(Bm25Argument
                        .builder()
                        .properties(new String[]{"doc_name_analyzed"})
                        .query("副本")
                        .build()
                )
                .withWhere(
                        WhereArgument.builder().filter(
                                WhereFilter.builder()
                                        .operator(Operator.And)
                                        .operands(new WhereFilter[]{
                                                WhereFilter.builder()
                                                        .path("tenant_id")
                                                        .operator(Operator.Equal)
                                                        .valueText("7502").build(),
                                                WhereFilter.builder()
                                                        .path("doc_type")
                                                        .operator(Operator.Equal)
                                                        .valueText("doc").build()
                                        })
                                        .build()
                        ).build()
                )
                .withLimit(2000)
                .run();
        System.out.println(JSON.toJSONString(run));
    }

Hi! Maybe you have found this bug?


Hi @codehelen - the latest update on this bug is in this GitHub issue: BM25 returns no results in some situations (Original title: BM25/Tokenizer not working properly), Issue #3517, weaviate/weaviate.

The team has narrowed it down to a bug that is present in 1.21.0 and up (including the current version, 1.21.4), and is working towards a fix.

I am sorry about this, but I understand the core team is working on a fix and will keep everybody updated.


Hi @codehelen - I wanted to let you know that we've released 1.21.5, which fixes this bug.

Release notes: Release v1.21.5 - BM25 search no results in some situations Fix and text2vec-huggingface response parsing Fix (weaviate/weaviate on GitHub)

Etienne, our CTO, has also written a summary of the cause and fix in this pull request: Fix issue where BM25 would sometimes return no results after a compaction, Pull Request #3592 by etiennedi (weaviate/weaviate on GitHub)
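
As a rough way to check whether a given server version falls in the range mentioned in this thread (1.21.0 up to, but not including, 1.21.5), something like the sketch below could be used. The class and method names are hypothetical, it assumes a plain `major.minor.patch` version string with an optional leading `v`, and it only encodes the versions stated here, not an official compatibility list:

```java
// Hypothetical helper, not part of the Weaviate client.
public class AffectedVersionSketch {

    // Parse "1.21.4" or "v1.21.4" into {major, minor, patch}; missing parts count as 0.
    static int[] parse(String version) {
        String[] parts = version.trim().replaceFirst("^v", "").split("\\.");
        int[] nums = new int[3];
        for (int i = 0; i < 3; i++) {
            nums[i] = i < parts.length ? Integer.parseInt(parts[i]) : 0;
        }
        return nums;
    }

    // Negative, zero, or positive, like Comparator.compare.
    static int compare(String a, String b) {
        int[] x = parse(a);
        int[] y = parse(b);
        for (int i = 0; i < 3; i++) {
            if (x[i] != y[i]) {
                return Integer.compare(x[i], y[i]);
            }
        }
        return 0;
    }

    // Assumption taken from this thread: bug present from 1.21.0, fixed in 1.21.5.
    static boolean isAffected(String version) {
        return compare(version, "1.21.0") >= 0 && compare(version, "1.21.5") < 0;
    }

    public static void main(String[] args) {
        for (String v : new String[]{"1.20.5", "1.21.0", "1.21.4", "1.21.5"}) {
            System.out.println(v + " affected: " + isAffected(v));
        }
    }
}
```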

We are sorry about this, and thank you for raising it.
