LLM management

How can SQL queries be enhanced by the potential of LLMs?

Managing Your Health Data Through Natural Language with AI generated SQL queries

Developer - Matías Molinolo


Mar 15, 2024




We live in a world where data is everything. Even more important than the data itself, though, is making it accessible in a way that is easy and helpful for people who are not that familiar with technology and may not know how to use data visualization or querying tools. This is why, at Puppeteer, we looked for a way to let them interact with their data in a way that feels totally natural to them: using natural language.

We believe that giving people ownership of their health data is extremely valuable, and that making it easier for anyone to access their own data will help them become more proactive about their health.

A use case for natural language queries

Let's say you have a tracker that measures your everyday activity and you want a report on how many steps you've taken, calories burned and steps climbed in the last week. This sounds like a pretty complex query - and it is for someone who is not familiar with a query language like SQL. However, by harnessing the power of AI, we can simply ask what we want to know about our data, and the model will generate the query for us, which we can then run to get the correct data from our sources.
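To make this concrete, here is a rough sketch of the kind of SQL such a question might translate to. The `daily_activity` table, its columns and the sample values are all made up for illustration, and SQLite simply stands in for whatever database is actually used.

```python
import sqlite3
from datetime import date, timedelta

# Hypothetical schema: one row per day of tracker data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE daily_activity "
    "(day TEXT, steps INTEGER, calories INTEGER, steps_climbed INTEGER)"
)

today = date(2024, 3, 15)  # fixed date so the example is reproducible
sample_rows = [
    ((today - timedelta(days=i)).isoformat(), 1000 * (i + 1), 200 + i, i)
    for i in range(10)
]
conn.executemany("INSERT INTO daily_activity VALUES (?, ?, ?, ?)", sample_rows)

# "How many steps, calories burned and steps climbed in the last week?"
WEEKLY_REPORT = """
SELECT SUM(steps), SUM(calories), SUM(steps_climbed)
FROM daily_activity
WHERE day > ? AND day <= ?
"""
week_ago = (today - timedelta(days=7)).isoformat()
report = conn.execute(WEEKLY_REPORT, (week_ago, today.isoformat())).fetchone()
print(report)
```

Dates are stored as ISO strings here so that plain string comparison orders them correctly; a real schema may well use a proper date type.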

After we get the data, the possibilities are endless: we can display it to the user in a plot, or have a Large Language Model (LLM) interpret it and give insights in a conversational manner, even if the user is not familiar with the data. The expressive power of LLMs enables us to provide these insights to the user promptly (pun definitely intended).

This use case is a form of Retrieval Augmented Generation, or RAG. Here, instead of leveraging text data and a vector database (which we also do for other use cases, by the way), we expand and enhance the model's context with the user's own data, retrieved from a SQL database, which reduces hallucinations and allows us to give the user factual information and insights about their data.
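As an illustration of what "enhancing the model's context" can look like, the sketch below pastes retrieved rows into the prompt that would be sent to the LLM. The function name, the prompt wording and the sample rows are assumptions made for this example, not our production prompt.

```python
def build_grounded_prompt(question, columns, rows):
    """Embed retrieved rows in the prompt so the model answers from real data."""
    header = " | ".join(columns)
    lines = [" | ".join(str(value) for value in row) for row in rows]
    table = "\n".join([header, *lines])
    return (
        "Answer the question using only the data below.\n"
        "If the data is insufficient, say so.\n\n"
        f"Data:\n{table}\n\n"
        f"Question: {question}"
    )


prompt = build_grounded_prompt(
    "How active was I this week?",
    ["day", "steps"],
    [("2024-03-14", 8200), ("2024-03-15", 10450)],
)
print(prompt)
```

Telling the model to rely only on the supplied rows is what keeps the answer anchored to the user's data rather than to whatever the model might guess.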

Behind the scenes: Using a SQL Query Chain

This part is a bit more technical, so feel free to skip it or just skim over it if you're not that interested in the technical bits of SQL query generation.

The biggest issue we face is going from natural language to a language with a fixed structure like SQL, where (most) queries go like SELECT ... FROM ... WHERE. But enough with the linguistics and formal language theory. Thankfully, we can leverage LangChain's SQL Query Chain and LLMs to do the hard work for us.

Once we link the chain to the database we want to access, we can give the LLM context about the tables present in the database, along with a prompt that tells the model how columns should be parsed if necessary, how many rows we want to fetch, whether the results should be ordered by one or more columns, etc.
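The sketch below shows the general idea of handing the model the table definitions and a row limit. It deliberately uses only the Python standard library as a stand-in, so it is not LangChain's API; the template text, the helper names and the `daily_activity` table are assumptions for illustration.

```python
import sqlite3

# Hypothetical prompt template with slots for the schema, row limit and question.
PROMPT_TEMPLATE = """You are a SQLite expert. Given a question, write one
syntactically correct SQLite query that answers it.
Return at most {top_k} rows unless the question asks for a specific number.
Only use the tables and columns listed below.

{table_info}

Question: {question}
SQL query:"""


def get_table_info(conn):
    """Collect the CREATE TABLE statements SQLite stores for each table."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return "\n\n".join(sql for (sql,) in rows if sql)


def build_prompt(conn, question, top_k=5):
    """Fill the template with the live schema, the row limit and the question."""
    return PROMPT_TEMPLATE.format(
        top_k=top_k, table_info=get_table_info(conn), question=question
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_activity (day TEXT, steps INTEGER, calories INTEGER)")
prompt = build_prompt(conn, "How many steps did I take yesterday?", top_k=7)
print(prompt)
```

Reading the schema from the database at prompt time, rather than hard-coding it, means the model always sees the tables as they currently exist.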

Also, to improve accuracy when generating queries, we use few-shot prompting to give the model examples of correctly formed queries. To fetch the most appropriate examples, we store the example queries in a vector index and select the ones most similar to the user's question.
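Here is a toy version of that example selection. To keep it self-contained, word counts and cosine similarity stand in for a real embedding model and vector index, and the example questions, queries and table names are invented for this sketch.

```python
import math
import re
from collections import Counter

# Hypothetical question/query pairs; a real store would hold many more.
EXAMPLES = [
    {"question": "How many steps did I take last week?",
     "query": "SELECT SUM(steps) FROM daily_activity WHERE day >= date('now', '-7 days')"},
    {"question": "What was my average heart rate yesterday?",
     "query": "SELECT AVG(bpm) FROM heart_rate WHERE day = date('now', '-1 day')"},
    {"question": "How many hours did I sleep on Monday?",
     "query": "SELECT hours FROM sleep WHERE strftime('%w', day) = '1' ORDER BY day DESC LIMIT 1"},
]


def embed(text):
    """Toy embedding: word counts. A real system would use a learned embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_examples(question, examples, k=2):
    """Return the k examples whose questions are most similar to the input."""
    q = embed(question)
    ranked = sorted(
        examples, key=lambda ex: cosine(q, embed(ex["question"])), reverse=True
    )
    return ranked[:k]


best = select_examples("How many steps did I walk this week?", EXAMPLES, k=1)
print(best[0]["query"])
```

The selected pairs would then be placed in the prompt ahead of the user's question, so the model sees queries that resemble the one it is about to write.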

Once the SQL query is generated, we perform some basic checks: that the query is correctly formed, and that it is not trying to delete our database, insert fake data into it or modify real records.
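A check along these lines could look like the sketch below. It is a deliberately conservative, assumed implementation (the keyword list and function name are ours for this example): it accepts a single SELECT statement and rejects anything that mentions a write or schema-changing keyword.

```python
import re

# Keywords that should never appear in a read-only report query.
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|create|truncate|replace|attach|pragma)\b",
    re.IGNORECASE,
)


def is_safe_select(sql):
    """Conservative check: accept only a single read-only SELECT statement."""
    statement = sql.strip().rstrip(";").strip()
    if not statement:
        return False
    if ";" in statement:  # more than one statement in the string
        return False
    if not re.match(r"(?i)^(select|with)\b", statement):
        return False
    if FORBIDDEN.search(statement):
        return False
    return True


print(is_safe_select("SELECT steps FROM daily_activity WHERE day = '2024-03-15';"))
print(is_safe_select("SELECT 1; DELETE FROM daily_activity"))
```

Keyword matching like this can reject legitimate queries (for instance, one that contains the word "delete" inside a string literal), so it is best treated as a first line of defense alongside database-level permissions rather than as the only safeguard.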

If all checks pass, then we execute the query and get the data back - easy, right?
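As a complement to the checks, the query can also be executed in a way that cannot write at all. The sketch below assumes SQLite and uses its `query_only` pragma; the helper name, the row cap and the sample table are assumptions for illustration, and other databases would typically rely on a read-only user or role instead.

```python
import sqlite3


def run_read_only(conn, sql, max_rows=100):
    """Execute a query with writes disabled and a cap on returned rows."""
    conn.execute("PRAGMA query_only = ON")  # SQLite rejects writes while this is on
    try:
        cursor = conn.execute(sql)
        columns = [col[0] for col in cursor.description]
        return columns, cursor.fetchmany(max_rows)
    finally:
        conn.execute("PRAGMA query_only = OFF")


conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
conn.execute("CREATE TABLE daily_activity (day TEXT, steps INTEGER)")
conn.execute("INSERT INTO daily_activity VALUES ('2024-03-15', 10450)")

columns, rows = run_read_only(conn, "SELECT day, steps FROM daily_activity")
print(columns, rows)

try:
    run_read_only(conn, "DELETE FROM daily_activity")
    write_blocked = False
except sqlite3.Error:
    write_blocked = True
print(write_blocked)
```

Even if a harmful statement slipped past the earlier checks, the database itself would refuse to run it.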

Dealing with privacy and security

Throughout this process of generating data queries from natural language, privacy and security need to be ever present, especially since we're dealing with personal health information.

Rest assured, we treat data with extreme care - our databases are encrypted at rest and data is transmitted through secure channels. This protects your data from being read by unauthorized parties, both while it is stored and while it is in transit.

Leveraging health data with Puppeteer

Building SQL queries and giving users access to their data through natural language is only one of the many features we offer at Puppeteer. We are revolutionizing the industry by bridging the gap between healthcare and generative AI, offering a platform where healthcare companies can construct their own AI agents with unparalleled human-like capabilities.

We make customizable conversational bots which help users become much more engaged in their health. Gone are the days of lengthy paper-based forms or trying to understand if your health metrics are good - just have a quick chat and get your answer pronto!

If you're interested in Puppeteer and its features, don't hesitate to book a demo with one of our founders here!

© 2024 Puppeteer. All rights reserved.
