RAG for text-2-SQL

Hey Weaviate community. I came across some questions in different GitHub repos about RAG for text-2-SQL (or text-2-Code in general). I have the following questions:

  1. Is RAG used in text-2-SQL? If yes, how?
  2. What is the context that we want to fetch in those use cases?
  3. How do we evaluate a RAG system in a text-2-SQL use case?

My idea for point 3: ask an LLM to generate sample data from the retrieved context, run both the ground-truth SQL and the generated SQL on that data, and compare the results.
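A minimal sketch of that comparison step, assuming the sample data has already been generated and that SQLite is an acceptable stand-in for the target database. The function names, the `users` table, and the example queries are all hypothetical, made up for illustration. Rows are compared as a multiset, so row order is ignored.

```python
import sqlite3
from collections import Counter


def run_query(conn, sql):
    """Run one statement and return its rows as a multiset (order-insensitive)."""
    return Counter(conn.execute(sql).fetchall())


def execution_match(schema_sql, sample_rows_sql, gold_sql, generated_sql):
    """True if both queries return the same rows on the same sample data."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        conn.executescript(schema_sql)
        conn.executescript(sample_rows_sql)
        gold = run_query(conn, gold_sql)
        try:
            predicted = run_query(conn, generated_sql)
        except sqlite3.Error:
            # Generated SQL that fails to execute counts as a mismatch.
            return False
        return gold == predicted
    finally:
        conn.close()


# Hypothetical example data (assumption: in practice an LLM would generate it).
schema = "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);"
rows = "INSERT INTO users VALUES (1, 'Ann', 31), (2, 'Bob', 17), (3, 'Cy', 45);"
gold = "SELECT name FROM users WHERE age >= 18"
gen_ok = "SELECT name FROM users WHERE NOT age < 18"  # different text, same rows
gen_bad = "SELECT name FROM users WHERE age > 31"     # misses one row

print(execution_match(schema, rows, gold, gen_ok))
print(execution_match(schema, rows, gold, gen_bad))
```

One caveat with this approach: two queries can agree on a small sample and still differ in general, so the quality of the generated data matters as much as the comparison itself.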

I would like to hear in-depth views on this. Any resources, suggestions, or materials are also welcome.

Note: Also posted in Slack. Posted here at the suggestion of @Marion_Nehring on the Weaviate Slack.

Hi!

I have seen something along those lines here:

So you have a natural language question that is translated into a SQL statement; it then retrieves the data and answers with it.

Another thing to look for is this:

Instead of DBs, it can learn how to consume APIs. This opens up a lot of cool things.

Anyway, just my 2 cents :)


The questions you ask are not really answerable given all the unknowns. But, with reference to your idea:

All of this is technically possible: the user submits a query, a cosine similarity search is run on your vector store, context documents are returned, and the original prompt together with those documents is submitted to the LLM.
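To make that flow concrete, here is a rough sketch in plain Python. It is not Weaviate's API: in a real setup the vector store runs the similarity search and an embedding model produces the vectors. The toy three-dimensional vectors, the `store` contents, and the function names are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vec, store, k=2):
    """Return the texts of the k stored items most similar to the query."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vec, item["vector"]),
        reverse=True,
    )
    return [item["text"] for item in ranked[:k]]


def build_prompt(question, context_docs):
    """Combine the retrieved context with the original question."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Context:\n{context}\n\nQuestion: {question}\nSQL:"


# Hypothetical store: schema snippets with hand-made toy embeddings.
store = [
    {"text": "CREATE TABLE users (id, name, age)", "vector": [1.0, 0.0, 0.0]},
    {"text": "CREATE TABLE orders (id, user_id, total)", "vector": [0.0, 1.0, 0.0]},
    {"text": "CREATE TABLE products (id, title, price)", "vector": [0.0, 0.0, 1.0]},
]

question = "How many adult users do we have?"
query_vec = [0.9, 0.1, 0.0]  # assumption: would come from an embedding model
docs = retrieve(query_vec, store, k=2)
print(build_prompt(question, docs))  # this string is what goes to the LLM
```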

This is where it all comes down to your prompts. What exactly are you instructing the LLM to do? You will need to give it precise instructions not only on how to generate the sample data, but also on how to submit it as SQL to the second part of the system.

I don’t know if you intend to use function/action calls or simply pass the initial results as another prompt to be processed separately. In any event, the LLM that has to create the SQL will need to know:

  1. Your existing database schema in detail.
  2. How to submit data – how should it be formatted?
  3. What to do and what NOT to do.
  4. What to do when it doesn’t know what to do.
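As a rough illustration, the four points above could be folded into a single system prompt like this. The function name, the section headings, and every example instruction are hypothetical; the actual wording would depend on your model and database.

```python
def build_system_prompt(schema_ddl, output_format, do_rules, dont_rules, fallback):
    """Assemble a system prompt covering schema, format, rules, and fallback."""
    sections = [
        "You translate natural language questions into SQL.",
        "Database schema:\n" + schema_ddl,                          # point 1
        "Output format:\n" + output_format,                         # point 2
        "Do:\n" + "\n".join(f"- {rule}" for rule in do_rules),      # point 3
        "Do NOT:\n" + "\n".join(f"- {rule}" for rule in dont_rules),
        "If you cannot answer:\n" + fallback,                       # point 4
    ]
    return "\n\n".join(sections)


# Hypothetical values, for illustration only.
prompt = build_system_prompt(
    schema_ddl="CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);",
    output_format="Return a single SQLite SELECT statement and nothing else.",
    do_rules=["Use only tables and columns from the schema."],
    dont_rules=["Never emit INSERT, UPDATE, DELETE, or DROP."],
    fallback="Reply with exactly: CANNOT_ANSWER",
)
print(prompt)
```

Giving the model a fixed fallback string makes it easy for the calling code to detect "I don't know" and skip execution instead of running a guessed query.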

Here are some notes I have made on the general subject:
