Cross-reference queries

My schema has a Document class and a Passage class. A Document has a name, date, and url; a Passage has a text, a vector embedding, and a reference to one Document.

I want to retrieve the top k passages along with some properties from their documents, while ensuring the results contain at most one passage per document. Something like:

client.query
    .get(
        'Passage',
        ['text', {'document': ['date', 'name']}]
    )
    .with_near_vector({'vector': vector})
    .groupBy({path: 'document', objectsPerGroup: 1})
    .with_limit(k)
    .do()

This query does not work. Is it possible to do something like this?

Hi @jpiabrantes

The group-by syntax is admittedly a bit tricky. The query below works (I just tried it on our demo instance), and you should be able to adapt it to your schema.

Does that work for you?

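import os

import weaviate

# Connect to the public demo instance.
client = weaviate.Client(
    url="https://edu-demo.weaviate.network",
    auth_client_secret=weaviate.AuthApiKey(api_key="learn-weaviate"),
    additional_headers={
        # OpenAI key forwarded to the instance (read from the environment).
        "X-OpenAI-Api-Key": os.environ["OPENAI_APIKEY"],
    },
)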

response = (
    client.query
    .get("JeopardyQuestion", ["question", "answer"])
    .with_near_text({"concepts": ["space travel"]})
    .with_group_by(
        groups=5,
        properties=["hasCategory"],
        objects_per_group=1,
    )
    .with_limit(10)
    .with_additional(
        """
        group {
          id
          count
          groupedBy { value path }
          maxDistance
          minDistance
          hits{
            question
            hasCategory {
              ... on JeopardyCategory {
                _additional {
                  id
                }
              }
            }
            _additional {
              id
              distance
            }
          }
        }
        """
    )
    .do()
)

print(response)
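
In case it helps when adapting this: the grouped results come back nested under `_additional.group.hits` for each returned object, so you will probably want to flatten them. Below is a minimal sketch of that step. It assumes the response is the usual `data` / `Get` / class-name dictionary and that each hit carries `_additional.distance`, as requested in the query above. The helper name `best_hit_per_group` and the sample data are made up for illustration, not part of the client.

```python
from typing import Any, Dict, List


def best_hit_per_group(
    response: Dict[str, Any], class_name: str, k: int
) -> List[Dict[str, Any]]:
    """Return up to k hits, one per group, ordered by ascending distance.

    Hypothetical helper: assumes each result object exposes
    _additional.group.hits, and each hit exposes _additional.distance.
    """
    results = response.get("data", {}).get("Get", {}).get(class_name) or []
    best = []
    for item in results:
        group = (item.get("_additional") or {}).get("group") or {}
        hits = group.get("hits") or []
        if not hits:
            continue  # skip empty groups
        # Keep only the closest hit of this group.
        best.append(min(hits, key=lambda h: h["_additional"]["distance"]))
    best.sort(key=lambda h: h["_additional"]["distance"])
    return best[:k]


# Made-up response with two groups, only to show the assumed shape.
sample = {
    "data": {
        "Get": {
            "JeopardyQuestion": [
                {"_additional": {"group": {"id": 0, "hits": [
                    {"question": "Q1",
                     "_additional": {"id": "a", "distance": 0.21}},
                ]}}},
                {"_additional": {"group": {"id": 1, "hits": [
                    {"question": "Q2",
                     "_additional": {"id": "b", "distance": 0.18}},
                ]}}},
            ]
        }
    }
}

top = best_hit_per_group(sample, "JeopardyQuestion", k=2)
for hit in top:
    print(hit["question"], hit["_additional"]["distance"])
```

With the real query you would pass `response` instead of `sample`, and your own class name (for example "Passage") instead of "JeopardyQuestion".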