Replica search GET query returns different results


I have Weaviate 1.20.1 running with 2 replicas and 2 shards for a particular class with ~4,000,000 documents. A dense vector search against the weaviate-0 replica skips results that have a higher certainty. I am testing each replica by port-forwarding from the pod.
For example, I have two documents, DocA and DocB, and a query, Query.

certainty of Query & DocA = 0.881561815738678
certainty of Query & DocB = 0.7634845674037933

When I query weaviate-1 for the top result, it correctly returns DocA for the Query. But when I query weaviate-0, it returns DocB as the closest, even though when I examine DocA in weaviate-0 (by adding a where filter) the certainty is 0.881561815738678.

Let me know if more details are needed.
Thank you in advance!

Hi @darpan - that’s odd. I wonder if the query is being run with ConsistencyLevel.ONE somehow.

Could you please try running the same query with different consistency levels (ALL/QUORUM/ONE)?

In Python, for example - it would be set like:
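
A minimal sketch of one way to do this, using only the Python standard library rather than the Weaviate client. It assumes the GraphQL `Get` function accepts a `consistencyLevel` argument (ONE, QUORUM or ALL) on a replicated class. The `Article` class name comes from the reproduction steps in this thread; the `build_get_query` and `run_query` helpers and their parameters are illustrative names, not part of any Weaviate API.

```python
import json
import urllib.request

VALID_LEVELS = ("ONE", "QUORUM", "ALL")


def build_get_query(class_name, concept, consistency_level, limit=12):
    """Build a GraphQL Get query that pins the read consistency level."""
    if consistency_level not in VALID_LEVELS:
        raise ValueError(f"unknown consistency level: {consistency_level}")
    # json.dumps yields a correctly quoted and escaped string literal.
    concept_literal = json.dumps(concept)
    return (
        "{ Get { "
        f"{class_name}(consistencyLevel: {consistency_level}, "
        f"nearText: {{concepts: [{concept_literal}]}}, limit: {limit}) "
        "{ _additional { id certainty } } } }"
    )


def run_query(graphql_url, query):
    """POST the query to a /v1/graphql endpoint and return the parsed JSON."""
    request = urllib.request.Request(
        graphql_url,
        data=json.dumps({"query": query}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `run_query` once per level in `VALID_LEVELS` against each replica's `/v1/graphql` endpoint should show whether the consistency level changes what each node returns.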


Thanks @jphwang :slight_smile: for your quick reply.
Apologies for the delayed response. With all three consistency levels, the results from each container do not change.

To reproduce the issue, run the steps below. Let me know if you come across anything or have any issues setting this up:
Related to Slack

  1. Create a new directory
  2. Grab the .gz from this dataset (gfissore/arxiv-abstracts-2021 at main) and add it to the new directory
  3. Unzip the contents of “” (located in this Google Drive) in the new directory
  4. Run “docker-compose up -d” to create a 2-instance Weaviate cluster
  5. Run “python” to create the Article schema
  6. Run “python” to index all documents. This should take a little while. To speed it up, you can bump up the replicas for t2v-transformers if you have more resources.
  7. Run the GQL query below against both http://localhost:6001/v1/graphql and http://localhost:6002/v1/graphql to see the difference
    Get {
            nearText: {concepts: ["A pilgrimage to gravity on GPUs"]
            limit: 12
        ) {
            _additional {

Hi @darpan, I wrote a reply, but I see you’re getting assistance from Parker, and he would know much better than I do :slight_smile:

Thanks @jphwang :slight_smile:

I will update here when it’s resolved.


@darpan would you be willing to upgrade to v1.21.2? We have since included some changes that improve the resiliency of replicated search.

I did set up a cluster to try to reproduce your issue following the steps above, but everything seemed to work as expected for me.

Maybe the upgrade will clear up your issue.

Thanks @parkerduckworth for taking a look at this. I know this is a difficult one.

For now, I have had to scale the production instance down to a single node to avoid complaints. When I upgrade and recreate the schema in the next few weeks, I will try the latest version and see if the issue still occurs.

I will update here when I find something.

Hi @jphwang, could you please help me with a question?
Let’s say I have created a schema and ingested data into a Weaviate database. When I search with a query, it returns a different number of recommendations every time. They are correct, but I would like the recommendations to stay fixed. Is that possible?

Hi @spark and welcome!

Would you mind clarifying your query a bit further? Would you have a code example to share, and perhaps let us know in what way the results vary?


Let’s say I query using “recommend me a Louis Vuitton handbag for women”. I know that the data I ingested contains these products. But I want the recommendations to be fixed, not stochastic. If I get two specific Louis Vuitton handbags the first time I retrieve, I want those same two to be recommended every time I search with the same query. Is there any way to do that?

Hmm. The search itself in Weaviate should be deterministic.

But are you using a generate query? If you are, those use large language models under the hood, which are mostly not completely deterministic.

But some model providers let you tune the sampling. So you could make them less stochastic using model parameters such as temperature or top_p.
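
To illustrate why a lower temperature reduces variation, here is a toy sketch of temperature sampling over a made-up list of logits. It is not any provider’s actual implementation; the `sample_token` name and the logit values are purely illustrative.

```python
import math
import random


def sample_token(logits, temperature, rng):
    """Pick an index from logits; temperature 0 is treated as greedy argmax."""
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    if temperature == 0:
        # Greedy decoding: always the highest-scoring index, so no randomness.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(value - peak) for value in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


rng = random.Random(0)
logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
greedy = {sample_token(logits, 0, rng) for _ in range(200)}
sampled = {sample_token(logits, 1.0, rng) for _ in range(200)}
print("distinct picks at temperature 0:", len(greedy))
print("distinct picks at temperature 1:", len(sampled))
```

At temperature 0 the same index is picked on every call, while at higher temperatures repeated calls can pick different indices.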

Does that help?


Thanks @jphwang, but one thing really confuses me: every time, the same query retrieves different products. One or two might be the same, but the rest differ each time. This happens before any OpenAI API call. I understand that setting temperature to 0, lowering top_p, and fixing a seed makes the responses deterministic, but this is occurring at the Weaviate hybrid search level itself. Do you have any advice for this?

@spark - I think that’s unusual. Can you share examples of queries, and results?

If you can include the score metadata in the result as well, that might be helpful to us. (Hybrid search | Weaviate - Vector Database)
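
As a rough sketch of what that could look like: the helper below builds a hybrid query that requests `score` and `explainScore` under `_additional`, and a second helper checks whether repeated runs returned the same ids in the same order. The `Product` class name, the helper names and the `alpha` default are assumptions for illustration only.

```python
import json


def build_hybrid_query(class_name, query_text, limit=5, alpha=0.5):
    """Build a GraphQL hybrid query that also returns score metadata."""
    return (
        "{ Get { "
        f"{class_name}(hybrid: {{query: {json.dumps(query_text)}, "
        f"alpha: {alpha}}}, limit: {limit}) "
        "{ _additional { id score explainScore } } } }"
    )


def runs_are_identical(runs):
    """True if every run returned the same ids in the same order."""
    return all(run == runs[0] for run in runs[1:])


query = build_hybrid_query(
    "Product",  # hypothetical class name
    "recommend me a Louis Vuitton handbag for women",
)
print(query)
```

Sending the same query several times and passing the resulting id lists to `runs_are_identical` would show whether the variation comes from the search step or from something applied after it.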

Thanks @jphwang, we finally found the reason for the stochasticity of the product recommendations. From what I’ve observed, Weaviate’s hybrid search is fully deterministic. The variation is entirely due to the non-determinism of the generative LLM that is applied on top of the hybrid search.

Thanks for coming back and clarifying that for us :slight_smile: . I’m glad that you got to the bottom of the issue!
