Using Natural Language to Query Weaviate (self querying)

There was a great question on the Slack channel that I thought other users might benefit from, so I’ve moved it here as well. The question is as follows:

“Say in weaviate, I have class Products which has various products like watch, mobile, smart watch etc
Will weaviate be able to answer queries like
a) Cheapest smart watch
b) costliest smart watch
c) Smart watch between 10 $ and 20 $
if yes, please advise me on how can i use weaviate to achieve this”

The way to achieve this with Weaviate is to use the self-querying functionality via LangChain. It uses an LLM to translate natural language queries like the ones above into equivalent GraphQL queries that can be run in Weaviate. You can see a tutorial on how to set this up here:

https://python.langchain.com/en/latest/modules/indexes/retrievers/examples/weaviate_self_query.html
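
To make the translation step a bit more concrete, here is a minimal sketch of the idea in plain Python. It does not use LangChain at all: `Comparison`, `to_where_filter` and `and_filters` are hypothetical stand-ins made up for illustration, and the property names `category` and `price` are assumptions about your schema. The output follows the shape of a Weaviate `where` filter.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical stand-in for the structured filter an LLM would produce.
@dataclass
class Comparison:
    comparator: str  # one of "eq", "gt", "gte", "lt", "lte"
    attribute: str
    value: Union[str, bool, int, float]


# Weaviate `where` operator for each comparator.
OPERATORS = {
    "eq": "Equal",
    "gt": "GreaterThan",
    "gte": "GreaterThanEqual",
    "lt": "LessThan",
    "lte": "LessThanEqual",
}


def to_where_filter(c: Comparison) -> dict:
    """Turn one comparison into a Weaviate-style `where` filter dict."""
    # bool must be checked before int, since bool is a subclass of int.
    if isinstance(c.value, bool):
        value_key = "valueBoolean"
    elif isinstance(c.value, int):
        value_key = "valueInt"
    elif isinstance(c.value, float):
        value_key = "valueNumber"
    else:
        value_key = "valueText"
    return {
        "path": [c.attribute],
        "operator": OPERATORS[c.comparator],
        value_key: c.value,
    }


def and_filters(*filters: dict) -> dict:
    """Combine several filters so that all of them must match."""
    return {"operator": "And", "operands": list(filters)}


# "Smart watch between 10 $ and 20 $" as a combined filter.
price_range = and_filters(
    to_where_filter(Comparison("eq", "category", "smartwatch")),
    to_where_filter(Comparison("gte", "price", 10.0)),
    to_where_filter(Comparison("lte", "price", 20.0)),
)
```

In the real setup the LLM produces the comparisons and LangChain does the translation for you; the sketch only shows what kind of filter ends up being sent to Weaviate.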

I just posted a question related to this. This is what I am prototyping. LangChain converts the natural language string “Has Greta Gerwig directed any movies about women” into a query like

query='women' filter=Comparison(comparator=<Comparator.EQ: 'eq'>, attribute='director', value='Greta Gerwig') limit=None

What should the Weaviate schema look like for the data structure in this LangChain example?
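
As a possible starting point, a class with one property per metadata field should be enough for a filter like that to have something to match on. The sketch below is an assumption, not the schema from the LangChain example: the class name `Movie` and the `description` and `year` properties are made up for illustration, and only `director` comes from the filter shown above.

```python
# Hypothetical class definition; only "director" is taken from the filter above.
movie_class = {
    "class": "Movie",
    "properties": [
        {"name": "description", "dataType": ["text"]},  # text searched for 'women'
        {"name": "director", "dataType": ["text"]},     # attribute used in the filter
        {"name": "year", "dataType": ["int"]},
    ],
}

# Weaviate-style `where` filter corresponding to
# Comparison(comparator=EQ, attribute='director', value='Greta Gerwig').
director_filter = {
    "path": ["director"],
    "operator": "Equal",
    "valueText": "Greta Gerwig",
}

# Every property the filter touches should exist in the class definition.
property_names = {p["name"] for p in movie_class["properties"]}
filter_is_valid = director_filter["path"][0] in property_names
```

With the v3 Python client, a dict like `movie_class` could be passed to `client.schema.create_class(...)`; other client versions may use a different call.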

  • Weaviate stores your products in a class named “Products” containing items like watches, mobiles, and smartwatches.
  • LangChain handles the natural language side. It acts as a translator, taking questions like “cheapest smartwatch” and converting them into GraphQL queries that Weaviate can interpret.

Here’s a helpful resource to set this up: the Weaviate LangChain tutorial (search for “Weaviate tutorial on how to set up LangChain”).

This tutorial will guide you through:

  • Configuring Weaviate with LangChain.
  • Defining the schema for your “Products” class, likely including a property for price.
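
For the schema step, a rough sketch of what a “Products” class might look like is below. Only the class name and the price property come from the discussion above; the `name` and `category` properties and the sample objects are assumptions made up for illustration.

```python
# Hypothetical "Products" class definition with a numeric price property.
products_class = {
    "class": "Products",
    "properties": [
        {"name": "name", "dataType": ["text"]},
        {"name": "category", "dataType": ["text"]},  # e.g. "watch", "mobile", "smartwatch"
        {"name": "price", "dataType": ["number"]},   # numeric, so it can be range-filtered and sorted
    ],
}

# Made-up sample objects that fit the class above.
sample_products = [
    {"name": "Basic Band", "category": "smartwatch", "price": 15.0},
    {"name": "Pro Watch", "category": "smartwatch", "price": 250.0},
    {"name": "Budget Phone", "category": "mobile", "price": 90.0},
]

# Quick sanity check: every sample object uses exactly the declared properties.
declared = {p["name"] for p in products_class["properties"]}
samples_match_schema = all(set(obj) == declared for obj in sample_products)
```

Storing the price as a number rather than as text like “$15” matters here, because comparisons and ordering on the price only make sense on a numeric property.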

With LangChain set up, you can then use Weaviate to answer queries like:

  • a) Cheapest smartwatch: This translates to a GraphQL query that restricts “Products” to smartwatches (e.g., by an exact match or by containing the term), sorts them by the price property in ascending order, and returns the first result.
  • b) Costliest smartwatch: Similar to the cheapest query, but sorts in descending order by price.
  • c) Smartwatch between $10 and $20: This uses a filter on the price property to find items within a specific range.
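
To give an idea of what those three target queries might look like, here is a hand-written sketch that builds them as GraphQL strings in plain Python. These are not output from LangChain, and whether LangChain generates a sort clause for the cheapest and costliest cases is something to check in your own setup. `build_query` is a hypothetical helper, and the `category`, `name` and `price` properties are assumptions about the schema.

```python
from typing import Optional

# Filter fragment that restricts results to smartwatches (assumed "category" property).
CATEGORY = '{path: ["category"], operator: Equal, valueText: "smartwatch"}'


def build_query(where: str, order: Optional[str] = None, limit: Optional[int] = None) -> str:
    """Build a Weaviate-style GraphQL Get query against the Products class."""
    args = ["where: " + where]
    if order is not None:
        # order is "asc" or "desc"; sorting is always on the price property here.
        args.append('sort: [{path: ["price"], order: ' + order + "}]")
    if limit is not None:
        args.append("limit: " + str(limit))
    return "{ Get { Products(" + ", ".join(args) + ") { name price } } }"


# a) Cheapest smartwatch: ascending price, first result only.
cheapest = build_query(CATEGORY, order="asc", limit=1)

# b) Costliest smartwatch: descending price, first result only.
costliest = build_query(CATEGORY, order="desc", limit=1)

# c) Smartwatch between $10 and $20: category filter combined with a price range.
range_where = (
    "{operator: And, operands: ["
    + CATEGORY
    + ', {path: ["price"], operator: GreaterThanEqual, valueNumber: 10}'
    + ', {path: ["price"], operator: LessThanEqual, valueNumber: 20}'
    + "]}"
)
in_range = build_query(range_where)

print(cheapest)
```

Pasting the printed strings into the Weaviate GraphQL console is an easy way to check them against real data before wiring up the natural language layer.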

Does this mean that once Weaviate is set up with LangChain, we can run natural language queries out of the box? Are there any links or articles for reference?