I have Weaviate 1.20.1 running with 2 replicas and 2 shards for a particular class with ~4,000,000 documents. A dense vector search against the weaviate-0 replica skips results that have a higher certainty. I am testing each replica separately by port-forwarding to its pod.
For example, I have two documents, DocA and DocB, and a Query.
certainty of Query & DocA = 0.881561815738678
certainty of Query & DocB = 0.7634845674037933
When I query weaviate-1 for the top result, it correctly returns DocA for the Query. But when I query weaviate-0, it returns DocB as the closest, even though when I examine DocA in weaviate-0 (by adding a where filter) its certainty is 0.881561815738678.
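The comparison described above can be sketched as follows. This is only an illustration, not part of Weaviate's API: `missed_hits` is a hypothetical helper, and each result list is assumed to have been fetched separately from one replica (for example through a port-forward) and reduced to `(doc_id, certainty)` pairs. It flags documents that one replica returned with a certainty higher than the other replica's weakest returned hit, but that the other replica left out.

```python
from typing import List, Tuple

Hit = Tuple[str, float]  # (document id, certainty)


def missed_hits(reference: List[Hit], candidate: List[Hit]) -> List[Hit]:
    """Return hits from `reference` that `candidate` should have returned.

    A hit counts as missed when it is absent from `candidate` even though
    its certainty is higher than the weakest hit `candidate` did return.
    """
    if not candidate:
        return list(reference)
    returned_ids = {doc_id for doc_id, _ in candidate}
    weakest = min(certainty for _, certainty in candidate)
    return [
        (doc_id, certainty)
        for doc_id, certainty in reference
        if doc_id not in returned_ids and certainty > weakest
    ]


# Top-1 results as reported in the post (hypothetical variable names).
weaviate_1_top = [("DocA", 0.881561815738678)]
weaviate_0_top = [("DocB", 0.7634845674037933)]

skipped = missed_hits(weaviate_1_top, weaviate_0_top)
print(skipped)
```

With the values from the post, DocA is reported as skipped by weaviate-0, while running the check in the other direction reports nothing.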
Let me know if more details are needed.
Thank you in advance!
Run “docker-compose up -d” to create a 2-instance Weaviate setup.
Run “python create_schema.py” to create the Article schema.
Run “python index.py” to index all documents. This takes a little while. To speed it up, you can increase the replicas for t2v-transformers if you have more resources.
Thanks @parkerduckworth for taking a look at this, I know this is a difficult one.
I have had to scale production down to a single node for now to avoid complaints. When I upgrade and recreate the schema in the next few weeks, I will try the latest version and see if the issue still occurs.
Hi @jphwang, could you please clarify my query?
Let’s say I have created a schema and ingested data into a Weaviate database. When I search with a query, it returns a different number of recommendations every time. They are correct, but I want those recommendations to be fixed. Is that possible?
For example, say I query with “recommend me a Louis Vuitton handbag for women”. I know that the data I ingested does contain these products, but I want the recommendations to be fixed, not stochastic. If I get two specific Louis Vuitton handbags the first time, I want those same two to be recommended every time I search with the same query. Is there any way to do that?
Thanks @jphwang, but one thing I’m really confused about is that the same query retrieves different products every time. One or two might be the same, but the rest are different each time. This happens before any OpenAI API call. I understand that setting temperature to 0, setting top_p to a much lower value, and fixing a seed makes the responses deterministic, but this is occurring at the Weaviate hybrid search level itself. Do you have any advice for this?
Thanks @jphwang, we finally found the reason for the stochasticity of the product recommendations. From what I’ve observed, Weaviate’s hybrid search is fully deterministic. The variation comes entirely from the non-determinism of the generative LLM that is applied on top of hybrid search.
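For anyone who wants to separate the two layers the same way, here is a minimal sketch of a repeatability check. It assumes a hypothetical `search` callable that wraps only the retrieval step (no generative call) and returns the ordered list of result IDs; `fake_stable_search` is a stand-in used purely so the example runs without a Weaviate instance.

```python
from typing import Callable, List


def is_deterministic(
    search: Callable[[str], List[str]], query: str, runs: int = 5
) -> bool:
    """Run the same query `runs` times and check that the ordered IDs match."""
    baseline = search(query)
    return all(search(query) == baseline for _ in range(runs - 1))


def fake_stable_search(query: str) -> List[str]:
    # Stand-in for a real retrieval call; always returns the same ordered IDs.
    return ["bag-1", "bag-2"]


print(is_deterministic(fake_stable_search, "Louis Vuitton handbag for women"))
```

If a check like this passes on the retrieval step alone but the final recommendations still vary, the variation is introduced after retrieval, which matches what was found here.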