Filtering date range does not work (combining dates in where clause)

We just upgraded from Weaviate 1.15 to the latest version (unsure whether that is related), and some queries no longer work.

We are conducting a search in weaviate, which used to work and still works locally in our dev environment:

const request = client.graphql
    .get()
    .withClassName("TextChunk")
    .withFields(
      `
      document_id,
      lines,
      document {
        ... on Document {
          upload_time
        }
      }`
    )
    .withBm25({ query: text.toLowerCase(), properties: ["lines"] })
    .withWhere({
      operator: "And",
      operands: [
        {
          path: ["document", "Document", "upload_time"],
          operator: "GreaterThanEqual",
          valueDate: startDate.toISOString(),
        },
        {
          path: ["document", "Document", "upload_time"],
          operator: "LessThanEqual",
          valueDate: endDate.toISOString(),
        },
      ],
    })

The query works when removing both or one of the operands. That also means that both operands individually still work. Only the combination using “and” does not work anymore and returns 0 results. We are sure there are results for the given dates and search terms in the db.

What we tried:
Nothing we tried pointed to a plausible cause, since each operator still works on its own.
However, we noticed that we had some duplicate Docs, so some TextChunks were referencing two Docs. We have now deleted all duplicate Docs. The query still returns no results.
We also noticed some TextChunks that referenced Docs that no longer exist, so upload_time in this query was null. We created the missing Docs and added the upload time. Again, no effect.

We do not get any errors so at this point we cannot do any further debugging.
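
For anyone wanting to repeat the duplicate and dangling reference check described above, here is a minimal sketch. It assumes the TextChunk objects have already been fetched with the `document { ... on Document { upload_time } }` field, and that the reference comes back as an array (or null when nothing is linked). The helper name `auditReferences` and the sample data are hypothetical, made up for illustration.

```javascript
// Hypothetical helper: groups TextChunk results by how many Document
// references each one resolved to. Assumes `chunk.document` is an array
// of referenced objects, or null/undefined when nothing is linked.
function auditReferences(chunks) {
  const report = { none: [], single: [], multiple: [] };
  for (const chunk of chunks) {
    const refs = Array.isArray(chunk.document) ? chunk.document : [];
    if (refs.length === 0) report.none.push(chunk.document_id);
    else if (refs.length === 1) report.single.push(chunk.document_id);
    else report.multiple.push(chunk.document_id);
  }
  return report;
}

// Made-up sample data, not taken from the real database.
const sampleChunks = [
  { document_id: "a", document: [{ upload_time: "2023-01-10T00:00:00Z" }] },
  { document_id: "b", document: null },
  {
    document_id: "c",
    document: [
      { upload_time: "2023-01-10T00:00:00Z" },
      { upload_time: "2023-02-01T00:00:00Z" },
    ],
  },
];

console.log(auditReferences(sampleChunks));
```

Anything listed under `none` or `multiple` would be a chunk worth inspecting before looking further at the filter itself.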

Hi @klt ! Sorry for the late response here :frowning:

I would try reindexing this in a new cluster or a new class, and check whether that solves it.

The fact that it works in your development environment may indicate that something is off with the indexing.

You can reindex your data from a new cluster (or even from one class to a new one) using this as a guide:

Let me know if that helps :slight_smile:

Will try! Thanks for the suggestion

Hi @DudaNogueira,

so we tried duplicating the Document collection, because that is the collection that stores the date. However after doing that, we noticed that the query with AND does work on the Document collection itself, only not when resolving the Reference from TextChunk to Document.
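
To make the comparison concrete, here is a rough sketch of the two filters being compared: the same date range, once directly on Document and once through the TextChunk reference. The helper `dateRangeFilter` and the example dates are hypothetical; the operators and paths are the ones from the original query.

```javascript
// Hypothetical helper: builds an inclusive date-range `where` filter for a
// given property path, mirroring the filter from the original query.
function dateRangeFilter(path, startDate, endDate) {
  return {
    operator: "And",
    operands: [
      { path, operator: "GreaterThanEqual", valueDate: startDate.toISOString() },
      { path, operator: "LessThanEqual", valueDate: endDate.toISOString() },
    ],
  };
}

// Example dates, chosen only for illustration.
const start = new Date("2023-01-01T00:00:00Z");
const end = new Date("2023-01-31T23:59:59Z");

// Range applied directly on the Document collection (reported as working).
const onDocument = dateRangeFilter(["upload_time"], start, end);

// Same range resolved through the TextChunk -> Document reference
// (reported as returning 0 results).
const viaReference = dateRangeFilter(
  ["document", "Document", "upload_time"],
  start,
  end
);

console.log(JSON.stringify(onDocument, null, 2));
console.log(JSON.stringify(viaReference, null, 2));
```

Either object can be passed to `.withWhere(...)`, with `.withClassName("Document")` for the first and `.withClassName("TextChunk")` for the second, so that the only difference between the two runs is the path.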

So obviously rebuilding the index for the Document collection does not solve the issue.

Do you have any idea what we could do to resolve this? Delete and create the references again? Or would there be a way to rebuild all indexes? We are a bit clueless here.

Thank you!

Oh.

The fact that it was giving different results on two different servers with the same data could point to an index issue.

I believe this is similar to this thread:

Your filter basically says:

"fetch me objects from TextChunk that have references with upload_time >= X while simultaneously having references with upload_time <= Y"
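
The difference between that reading and a "same reference must be in range" reading can be illustrated with a small in-memory sketch. This only models the semantics as described in this post; it is not Weaviate's implementation, and the function names and sample dates are made up.

```javascript
// Reading A (as described above): each operand is checked against the
// references independently, so different references may satisfy
// different operands.
function matchesPerOperand(refs, start, end) {
  const anyAfterStart = refs.some((r) => new Date(r.upload_time) >= start);
  const anyBeforeEnd = refs.some((r) => new Date(r.upload_time) <= end);
  return anyAfterStart && anyBeforeEnd;
}

// Reading B: one and the same reference must fall inside the whole range.
function matchesSameReference(refs, start, end) {
  return refs.some((r) => {
    const t = new Date(r.upload_time);
    return t >= start && t <= end;
  });
}

// Illustrative range: January 2023.
const rangeStart = new Date("2023-01-01T00:00:00Z");
const rangeEnd = new Date("2023-01-31T23:59:59Z");

// Two references, one before and one after the range.
const twoRefsOutside = [
  { upload_time: "2022-12-01T00:00:00Z" },
  { upload_time: "2023-03-01T00:00:00Z" },
];

// A single reference inside the range.
const oneRefInside = [{ upload_time: "2023-01-15T00:00:00Z" }];

console.log(matchesPerOperand(twoRefsOutside, rangeStart, rangeEnd));
console.log(matchesSameReference(twoRefsOutside, rangeStart, rangeEnd));
console.log(matchesPerOperand(oneRefInside, rangeStart, rangeEnd));
console.log(matchesSameReference(oneRefInside, rangeStart, rangeEnd));
```

With several references the two readings can disagree, but with exactly one reference per object they always give the same answer.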

Let me know if this clarifies.

Thanks!

Hey @DudaNogueira, jumping in here as another one of the devs working on the same project as @klt.

I believe we have a different issue from the one you are referencing, because we made sure that each TextChunk is only ever linked to a single Document. So a TextChunk that has a reference with upload_time >= X while simultaneously having a reference with upload_time <= Y should match exactly when X <= upload_time <= Y holds for that single linked Document. We also made sure that we have objects in our database that do match this query, but we still do not get any results.

Hey, any new ideas on this? We have not been able to make any progress on this issue for a month now…

Hi!

Sorry for the delay here. I have escalated this to our engineers.

Thanks!
