My schema has a Document and a Passage class. A Document has a name, a date, and a url. A Passage has a text, a vector embedding, and a reference to one Document.
I want to retrieve the top k passages along with some properties from their documents, while ensuring the top k contains at most one passage per document. Something like:
client.query
    .get(
        'Passage',
        ['text', {'document': ['date', 'name']}]
    )
    .with_near_vector({'vector': vector})
    .groupBy({path: 'document', objectsPerGroup: 1})
    .with_limit(k)
    .do()
This query does not work. Is it possible to do something like this?
Hi @jpiabrantes
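The group-by syntax is a bit tricky, for sure. In the Python client the method is with_group_by, and the grouped results come back under _additional. The query below works (I just tried it on our demo instance), and you should be able to edit it to suit your purpose.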
Does that work?
import os

import weaviate

# Connect to the public demo instance
client = weaviate.Client(
    url="https://edu-demo.weaviate.network",
    auth_client_secret=weaviate.AuthApiKey(api_key="learn-weaviate"),
    additional_headers={
        "X-OpenAI-Api-Key": os.environ["OPENAI_APIKEY"],
    },
)

response = (
    client.query
    .get("JeopardyQuestion", ["question", "answer"])
    .with_near_text({"concepts": ["space travel"]})
    # Group results by the hasCategory reference, one object per group
    .with_group_by(
        groups=5,
        properties=["hasCategory"],
        objects_per_group=1,
    )
    .with_limit(10)
    # The grouped results are returned under _additional { group { ... } }
    .with_additional(
        """
        group {
            id
            count
            groupedBy { value path }
            maxDistance
            minDistance
            hits {
                question
                hasCategory {
                    ... on JeopardyCategory {
                        _additional {
                            id
                        }
                    }
                }
                _additional {
                    id
                    distance
                }
            }
        }
        """
    )
    .do()
)
print(response)
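Adapted to the schema from the question, the same pattern might look roughly like the sketch below. It rests on several assumptions: the reference property on Passage is named document and points at the Document class, client is a v3 Python client like the one above, and vector and k are supplied by the caller. Selecting name and date on the reference inside hits is also an assumption, since the working example above only selects _additional { id } there. The helper names top_passage_per_document and first_hits, and the default of searching 2 * k passages before grouping (mirroring the limit of 10 for 5 groups above), are illustrative and not part of the Weaviate client.

```python
# Fields requested for each group; selecting name/date on the reference
# inside hits is an assumption, not something the thread confirms.
GROUP_FIELDS = """
group {
    groupedBy { value path }
    hits {
        text
        document {
            ... on Document {
                name
                date
            }
        }
        _additional {
            id
            distance
        }
    }
}
"""


def top_passage_per_document(client, vector, k, search_limit=None):
    """Near-vector search grouped by the `document` reference (assumed name)."""
    if search_limit is None:
        # Assumption: same 2:1 limit-to-groups ratio as the demo query.
        search_limit = 2 * k
    return (
        client.query
        .get("Passage", ["text"])
        .with_near_vector({"vector": vector})
        .with_group_by(
            properties=["document"],
            groups=k,             # at most k distinct documents
            objects_per_group=1,  # keep one passage per document
        )
        .with_limit(search_limit)
        .with_additional(GROUP_FIELDS)
        .do()
    )


def first_hits(response, class_name="Passage"):
    """Flatten a grouped response into a list with the first hit of each group."""
    results = response["data"]["Get"][class_name]
    return [
        obj["_additional"]["group"]["hits"][0]
        for obj in results
        if obj["_additional"]["group"]["hits"]  # skip empty groups
    ]
```

Calling first_hits(top_passage_per_document(client, vector, k)) would then give one dictionary per document, each holding the passage text and whatever document fields the query returned.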