Basically, we wanted something similar with PostgreSQL where one can combine group by with having clause. For example, I want to find players with duplicate player_code
SELECT player_code
FROM public.player
GROUP BY player_code
HAVING COUNT(player_code) > 1;
How can I achieve such using Weaviate Aggregate? I only came up with this but could not find a way where to insert the WHERE part to compare the meta count greater than 1?
echo '{
"query": "{
Aggregate {
Player(groupBy: [\"player_code\"]) {
meta {
count
}
groupedBy {
value
path
}
}
}
}"
}' | curl \
-X POST \
-H 'Content-Type: application/json' \
-d @- \
http://{{HOST}}/v1/graphql
Is this even possible or not? Thanks!
Hey @junbetterway, I don’t believe there is a way to do this directly inside of Weaviate. However, you can achieve this by parsing the results client side quite nicely, here is an example in python where I am grouping podcasts by speaker and then only keeping the results where the speaker occurs more than 10 times. Hopefully this is useful to you – replacing PodClip
with Player
and speaker
with player_code
import weaviate
client = weaviate.Client("http://localhost:8080")
aggregate_demo = """
{
Aggregate {
PodClip (
groupBy: ["speaker"]
){
groupedBy {
path
value
}
meta {
count
}
}
}
}
"""
results = client.query.raw(aggregate_demo)["data"]["Aggregate"]["PodClip"]
parsed_results = []
for res in results:
print(res.keys())
if res["meta"]["count"] > 10:
parsed_results.append(res)
print(parsed_results)
1 Like
Thanks @CShorten for this - I will try doing it via Java client if there is no native way to do it then.