Blog

Mastra Text-to-SQL: from natural language to database queries

The Mastra Text-to-SQL demo is an interactive web app that generates and executes SQL queries based on natural language prompts.

As it’s written, it specializes in answering questions about various global city populations on a sample database. You could easily repurpose the example for different use cases, however.

Here’s a look at how it works:

Agent

Text-to-SQL is based off a single agent: a SQL expert for a city population database.

The agent is pre-defined to already understand the database schema, query guidelines key SQL formatting tips, and what its basic workflow is (analyze the user's question about city data, generate an appropriate SQL query, execute the query, and return results).

import { openai } from "@ai-sdk/openai";
import { Agent } from "@mastra/core/agent";
import * as tools from "../tools/population-info";
import { LanguageModelV1 } from "@ai-sdk/provider";

export const sqlAgent = new Agent({
  name: "SQL Agent",
  instructions: ``You are a SQL (PostgreSQL) expert for a city population database. Generate and execute queries that answer user questions about city data.

    DATABASE SCHEMA:
    cities (
      id SERIAL PRIMARY KEY,
      popularity INTEGER,
      geoname_id INTEGER,
      name_en VARCHAR(255),
      country_code VARCHAR(10),
      population BIGINT,
      latitude DECIMAL(10, 6),
      longitude DECIMAL(10, 6),
      country VARCHAR(255),
      region VARCHAR(255),
      continent VARCHAR(255), /* Africa, Asia, Europe, North America, Oceania, South America, Antarctica */
      code2 VARCHAR(10),
      code VARCHAR(10),
      province VARCHAR(255)
    );

    QUERY GUIDELINES:
    - Only retrieval queries are allowed
    - For string comparisons, use: LOWER(field) ILIKE LOWER(term)
    - Use "United Kingdom" for UK and "United States" for USA
    - This dataset contains only current information, not historical data
    - Always return at least two columns for visualization purposes
    - If a user asks for a single column, include a count of that column
    - Format rates as decimals (e.g., 0.1 for 10%)

    Key SQL formatting tips:
    - Start main clauses (SELECT, FROM, WHERE, etc.) on new lines
    - Indent subqueries and complex conditions
    - Align related items (like column lists) for readability
    - Put each JOIN on a new line
    - Use consistent capitalization for SQL keywords

    WORKFLOW:
    1. Analyze the user\'s question about city data
    2. Generate an appropriate SQL query
    3. Execute the query using the Execute SQL Query tool
    4. Return results in markdown format with these sections:

       ### SQL Query
       \`\`\`sql
       [The executed SQL query with proper formatting and line breaks for readability]
       \`\`\`

       ### Explanation
       [Clear explanation of what the query does]

       ### Results
       [Query results in table format]
    ``,
  model: openai("gpt-4o") as LanguageModelV1,
  tools: {
    populationInfo: tools.populationInfo,
  },
);

Tools

Once the agent generates an appropriate SQL query, it uses a tool to execute it.

The populationInfo tool is a PostgreSQL database query executor specifically designed for the cities database:


export const populationInfo = createTool({
  id: "Execute SQL Query",
  inputSchema: z.object({
    query: z
      .string()
      .describe("SQL query to execute against the cities database"),
  }),
  description: `Executes a SQL query against the cities database and returns the results`,
  execute: async ({ context: { query } }) => {
    try {
      const trimmedQuery = query.trim().toLowerCase();
      if (!trimmedQuery.startsWith("select")) {
        throw new Error("Only SELECT queries are allowed for security reasons");
      }

      return await executeQuery(query);
    } catch (error) {
      throw new Error(
        `Failed to execute SQL query: ${error instanceof Error ? error.message : String(error)}`
      );
    }
  },
});

The tool only allows SELECT queries to prevent malicious database modifications. This is enforced through a simple check of the query's starting text.

The tool also implements basic error handling at multiple levels. Find the complete code and follow along here.

And be sure to tell us if you build any new text-to-SQL applications yourself.

Happy building 🚀

Share

Stay up to date