Query filter does not return correct result

Hi guys, I have a query whose results do not match my “where_filter”: I filter on code = ‘ON.OQ’, but the output returns code = ‘MSFT.OQ’. I have tried other filter values, e.g. code = ‘ON.ON.OR.ON.DQ.DQ’, and it returns code = ‘DQ.N’. Is there a problem with certain keywords when filtering? Please suggest any possible cause of this issue and how I can fix it. Many thanks!

Query:

where_filter = {
    "operator": "And",
    "operands": [
        {
            "path": ["link"],
            "operator": "Equal",
            "valueText": "audit_files/ON.OQ/ON.OQ_FY2022_2022 Sustainability Report.pdf"
        },
        {
            "path": ["code"],
            "operator": "Equal",
            "valueText": "ON.OQ"
        }
    ]
}
(
    client.query
    .get(index_name, ["source", "page_number", "link", "bounding_regions",
                      "doc_type", "structure_type", "code", "fy"])
    .with_limit(10000)
    .with_additional("id")
    .with_where(where_filter)
    .do()
)

Return:

{'data': {'Get': {'ESG_Unstructured': [{'_additional': {'id': 'b06cf4a4-584e-41c5-af81-3f13d2394978'},
     'bounding_regions': '[[0.5031854111416221, 0.3593986031565095], [0.5031854111416221, 0.6948810545716563], [0.6791396722560976, 0.6948810545716563], [0.6791396722560976, 0.3593986031565095]]',
     'code': 'MSFT.OQ',
     'doc_type': 'main',
     'fy': '2022',
     'link': 'audit_files/MSFT.OQ/MSFT.OQ_FY2022_2022 Environmental Sustainability Report.pdf',
     'page_number': '[14]',
     'source': 'MSFT.OQ-3-table-3',
     'structure_type': 'table'},
    {'_additional': {'id': 'c319ac18-b6b0-42af-a71b-c22e5a3df7b5'},
     'bounding_regions': '[[0.08759188309506034, 0.16975828195664666], [0.08759188309506034, 0.6612287455747702], [0.5172488247316656, 0.6612287455747702], [0.5172488247316656, 0.16975828195664666]]',
     'code': 'MSFT.OQ',
     'doc_type': 'main',
     'fy': '2022',
     'link': 'audit_files/MSFT.OQ/MSFT.OQ_FY2022_2022 Environmental Sustainability Report.pdf',
     'page_number': '[2]',
     'source': 'MSFT.OQ-3-table-1',
     'structure_type': 'table'},

Query:


where_filter = {
    "operator": "And",
    "operands": [
        {
            "path": ["code"],
            "operator": "Equal",
            "valueText": "ON.ON.OR.ON.DQ.DQ"
        }
    ]
}

Return:

{'data': {'Get': {'ESG_Unstructured': [{'_additional': {'id': '55c6b68c-e426-45cf-b409-b2c07599b028'},
     'bounding_regions': '[[0.41823595409093484, 0.3888034437292914], [0.41823595409093484, 0.6233875654660112], [0.6601949966227706, 0.6233875654660112], [0.6601949966227706, 0.3888034437292914]]',
     'code': 'DQ.N',
     'doc_type': 'main',
     'fy': '2022',
     'link': 'audit_files/DQ.N/DQ.N_FY2022_2022 ENVIRONMENTAL, SOCIALAND GOVERNANCE REPORT.pdf',
     'page_number': '[30]',
     'source': 'DQ.N-3242-table-9',
     'structure_type': 'table'},

Hi @eleeag ! Welcome to our community! :hugs:

What version of the server are you using? Can you reproduce this on the latest server and client versions?

Thanks!

Hi @DudaNogueira , I am using Weaviate version 1.22.7 and have not tried it on the latest version. Have there been filtering issues in older versions?

Hi @eleeag !

There have been a lot of changes between versions, so we usually advise trying to replicate the issue on the latest versions of both server and client.

Also, 1.23+ has better handling of indexing errors, so that may help to avoid this kind of issue.
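
To check whether a given server is already on 1.23 or later, a small version comparison like the one below could be used. This is only a sketch: `parse_version` and `is_at_least` are hypothetical helper names, and the commented-out `client.get_meta()["version"]` lookup assumes the v3 Python client is in use.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn a version string such as '1.22.7' into (1, 22, 7) for comparison."""
    # Drop a leading 'v' and any pre-release suffix such as '-rc.0'.
    core = version.lstrip("v").split("-")[0]
    parts = [int(part) for part in core.split(".")]
    # Pad short versions such as '1.23' to three components.
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def is_at_least(version: str, minimum: str) -> bool:
    """Return True if `version` is the same as or newer than `minimum`."""
    return parse_version(version) >= parse_version(minimum)


# Assumption: with the v3 Python client the server version can be read with
#   server_version = client.get_meta()["version"]
server_version = "1.22.7"  # the version reported in this thread
print(is_at_least(server_version, "1.23.0"))  # False
```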

Let me know if this helps :slight_smile:
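
One possible cause worth ruling out, independent of the version: if `code` and `link` are text properties using `word` tokenization, the values are split on non-alphanumeric characters and lowercased before matching, and stopwords may be dropped. The sketch below is a plain-Python approximation of that behaviour, not Weaviate's actual implementation. `ASSUMED_STOPWORDS`, `word_tokens` and `equal_matches` are hypothetical names, and the stopword list is an assumed subset. Under those assumptions it reproduces both results from this thread.

```python
import re

# Assumption: a small subset of an English stopword list; the server's real list may differ.
ASSUMED_STOPWORDS = {"a", "an", "and", "in", "on", "or", "the", "to"}


def word_tokens(value: str) -> set[str]:
    """Approximate `word` tokenization: lowercase, split on non-alphanumerics, drop stopwords."""
    tokens = re.split(r"[^0-9a-z]+", value.lower())
    return {t for t in tokens if t and t not in ASSUMED_STOPWORDS}


def equal_matches(filter_value: str, stored_value: str) -> bool:
    """Approximate an Equal filter: every filter token must occur in the stored value."""
    return word_tokens(filter_value) <= word_tokens(stored_value)


print(word_tokens("ON.OQ"))                        # {'oq'}
print(equal_matches("ON.OQ", "MSFT.OQ"))           # True
print(equal_matches("ON.ON.OR.ON.DQ.DQ", "DQ.N"))  # True
print(equal_matches("ON.OQ", "DQ.N"))              # False
```

If this turns out to be the cause, it would be worth checking the tokenization setting of `code` and `link` in the schema. The `field` tokenization treats the whole value as a single token, so an Equal filter on ‘ON.OQ’ would then only match that exact value.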