
How the sql-queries Skill Generates SQL from Natural Language in phuryn/pm-skills

July 1, 2026 phuryn/pm-skills ↗

The sql-queries skill transforms plain-English requests into optimized, dialect-specific SQL through a four-phase, LLM-driven workflow: it reads the database schema, processes the natural-language intent, generates executable SQL for dialects such as BigQuery, PostgreSQL, and MySQL, and then explains the result.

The sql-queries skill in the phuryn/pm-skills repository is a declarative component that bridges natural language and structured database queries. Unlike traditional SQL generators built on rigid templates, it relies on Claude to construct queries dynamically from uploaded database schemas and an explicitly stated dialect.

The Four-Phase SQL Generation Architecture

According to the skill definition in [pm-data-analytics/skills/sql-queries/SKILL.md](https://github.com/phuryn/pm-skills/blob/main/pm-data-analytics/skills/sql-queries/SKILL.md), the generation process follows a fixed workflow split into four distinct phases. The sequence of steps is fixed; the SQL produced within each step is LLM output and can vary between runs.

Phase 1: Schema Understanding

The skill begins by parsing uploaded database documentation to build an internal structural model. When you provide a schema file—whether an SQL dump, diagram description, or technical documentation—the skill extracts table names, column data types, primary keys, foreign key relationships, and index information.

This parsing logic occurs in the "Read schema" step (lines 13-17 of SKILL.md), creating a structured context that the LLM references during query construction. Without this foundational step, the generator cannot accurately resolve table relationships or validate column existence.
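As a rough illustration of the structural model Phase 1 describes, the sketch below loads a small DDL script into an in-memory SQLite database and reads back tables, column types, primary keys, and foreign keys. This is not how the skill works internally (the skill has the LLM read the documentation directly); `TableModel`, `build_schema_model`, and the sample tables are hypothetical names chosen for the example.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class TableModel:
    """Hypothetical container for what Phase 1 extracts about one table."""
    name: str
    columns: dict[str, str] = field(default_factory=dict)  # column -> declared type
    primary_key: list[str] = field(default_factory=list)
    # Each foreign key is (column, referenced_table, referenced_column).
    foreign_keys: list[tuple[str, str, str]] = field(default_factory=list)


def build_schema_model(ddl: str) -> dict[str, TableModel]:
    """Load DDL into an in-memory SQLite database and introspect it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(ddl)
    model: dict[str, TableModel] = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (name,) in tables:
        table = TableModel(name=name)
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        for _cid, col, col_type, _notnull, _default, pk in conn.execute(
            f'PRAGMA table_info("{name}")'
        ):
            table.columns[col] = col_type
            if pk:
                table.primary_key.append(col)
        # PRAGMA foreign_key_list rows: (id, seq, table, from, to, ...)
        for row in conn.execute(f'PRAGMA foreign_key_list("{name}")'):
            table.foreign_keys.append((row[3], row[2], row[4]))
        model[name] = table
    conn.close()
    return model


# Sample schema, invented for this example.
SAMPLE_DDL = """
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    email TEXT NOT NULL,
    created_at TIMESTAMP
);
CREATE TABLE sessions (
    session_id INTEGER PRIMARY KEY,
    user_id INTEGER REFERENCES users(id),
    started_at TIMESTAMP
);
"""

schema = build_schema_model(SAMPLE_DDL)
print(schema["sessions"].foreign_keys)
```

With a model like this in hand, a generator can confirm that `sessions.user_id` joins to `users.id` before it writes a JOIN clause, which is the kind of check the paragraph above attributes to the schema step.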

Phase 2: Request Processing

Next, the skill analyzes your natural language prompt to identify specific data requirements including filters, aggregations, joins, and sorting preferences. The implementation explicitly requires users to state their target SQL dialect—BigQuery, PostgreSQL, MySQL, Snowflake, or others—during this phase.

The "Process Your Request" step (lines 18-22) drives this interaction, using Claude to dissect ambiguous language into structured query components. If critical details are missing—such as date ranges, specific columns, or join conditions—the skill will request clarification before proceeding to generation.
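The clarification behavior can be pictured as a check over a structured request. The following is a minimal sketch under assumed names (`QueryRequest` and `clarifying_questions` do not appear in the repository); in the skill itself this reasoning is carried out by the LLM following the SKILL.md instructions, not by code.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class QueryRequest:
    """Hypothetical structured form of a natural-language request."""
    metric: str
    dialect: Optional[str] = None      # e.g. "bigquery", "postgresql", "mysql"
    date_range: Optional[str] = None   # e.g. "last 30 days"
    filters: list[str] = field(default_factory=list)
    group_by: list[str] = field(default_factory=list)


def clarifying_questions(request: QueryRequest) -> list[str]:
    """Return the questions to ask before any SQL is generated."""
    questions = []
    if request.dialect is None:
        questions.append("Which SQL dialect should the query target?")
    if request.date_range is None:
        questions.append("What date range should the query cover?")
    return questions


# A request that names a metric and a grouping but omits dialect and dates.
request = QueryRequest(metric="average order value", group_by=["customer_id"])
for question in clarifying_questions(request):
    print(question)
```

An empty list means the request is specific enough to move on to generation.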

Phase 3: Optimized Query Generation

This is the core transformation step where the LLM synthesizes the schema model and processed intent into executable SQL. The "Generate Optimized Query" step (lines 23-27) produces dialect-specific syntax that respects the unique constraints and features of your chosen database.

For BigQuery, the skill uses SAFE_CAST, STRUCT, and ARRAY constructs where appropriate and recommends PARTITION BY or CLUSTER BY clauses for large tables. PostgreSQL generation combines ANSI-standard syntax with the PostgreSQL-specific ::type cast shorthand and suggests CREATE INDEX statements for performance optimization. MySQL output prefers LIMIT/OFFSET pagination, respects VARCHAR/TEXT nuances, and can suggest running EXPLAIN for query analysis.
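To make the cast differences concrete, here is a small sketch that renders an integer cast in each of the three dialects. The `CAST_TEMPLATES` table and `cast_to_integer` helper are illustrative assumptions; the skill contains no such lookup and leaves the choice of syntax to the LLM.

```python
# Illustrative mapping only: the skill has no such table.
CAST_TEMPLATES = {
    "bigquery": "SAFE_CAST({expr} AS INT64)",  # yields NULL on a failed conversion
    "postgresql": "{expr}::integer",           # PostgreSQL-specific cast shorthand
    "mysql": "CAST({expr} AS SIGNED)",         # MySQL's signed-integer cast target
}


def cast_to_integer(expr: str, dialect: str) -> str:
    """Render an integer cast of `expr` in the given dialect."""
    try:
        template = CAST_TEMPLATES[dialect.lower()]
    except KeyError:
        raise ValueError(f"Unsupported dialect: {dialect}") from None
    return template.format(expr=expr)


for name in ("bigquery", "postgresql", "mysql"):
    print(f"{name}: {cast_to_integer('o.quantity', name)}")
```

The three forms are not behaviorally identical: SAFE_CAST returns NULL when a value cannot be converted, whereas the PostgreSQL cast raises an error, so a faithful translation may need extra handling around the cast.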

Phase 4: Explanation and Validation

The final phase transforms raw SQL into actionable documentation. The "Explain and Test" stage (lines 29-33) returns human-readable explanations of query logic alongside the executable code. The skill suggests validation methods including sample SELECT statements for testing subsets and provides test data generation scripts when requested.
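One simple way to test a generated query on a subset is to wrap it in an outer SELECT with a row cap. The helper below is a sketch of that idea, not something taken from the skill; `sample_query` is a hypothetical name, and the wrapper relies only on derived tables and LIMIT, which BigQuery, PostgreSQL, and MySQL all accept.

```python
def sample_query(sql: str, limit: int = 10) -> str:
    """Wrap a query in a derived table and cap the number of rows returned."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    # Drop the trailing semicolon so the query can be nested.
    inner = sql.strip().rstrip(";").rstrip()
    return f"SELECT *\nFROM (\n{inner}\n) AS sample\nLIMIT {limit};"


print(sample_query("SELECT id, email FROM users WHERE created_at IS NOT NULL;", 5))
```

An ORDER BY inside the wrapped query is not guaranteed to carry through to the outer result, so this wrapper suits spot checks of shape and values more than checks of ordering.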

Dialect-Specific Implementation Details

The sql-queries skill contains no hard-coded SQL templates. Instead, it relies on Claude's knowledge base combined with the schema extracted in Phase 1 to enforce dialect-specific rules dynamically.

BigQuery: Emphasizes TIMESTAMP functions, INTERVAL syntax, and partitioned table optimizations
PostgreSQL: Utilizes INTERVAL with explicit time units, ::type casting, and standard JOIN syntax
MySQL: Prioritizes LIMIT clauses, specific TEXT type handling, and procedural hints

Because the dialect rules live in the prompt and the model's knowledge rather than in templates, the skill can target other dialects such as Snowflake or SQL Server without code modifications.
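The same "signed up in the last N days" filter looks different in each dialect, which is the kind of variation the LLM has to resolve. The sketch below writes those variants out as a registry to show how small the per-dialect difference is; `RECENT_DAYS` and `signup_filter` are hypothetical, and in the skill this knowledge lives in the model rather than in a lookup table.

```python
from typing import Callable

# Hypothetical registry: each entry renders "now minus N days" in one dialect.
RECENT_DAYS: dict[str, Callable[[int], str]] = {
    "bigquery": lambda n: f"TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL {n} DAY)",
    "postgresql": lambda n: f"NOW() - INTERVAL '{n} days'",
    "mysql": lambda n: f"NOW() - INTERVAL {n} DAY",
}

# Supporting another dialect is one more entry, not a new code path.
RECENT_DAYS["snowflake"] = lambda n: f"DATEADD(day, -{n}, CURRENT_TIMESTAMP())"


def signup_filter(column: str, days: int, dialect: str) -> str:
    """Build a WHERE-clause fragment selecting rows from the last `days` days."""
    if not isinstance(days, int) or days < 0:
        raise ValueError("days must be a non-negative integer")
    if dialect not in RECENT_DAYS:
        raise ValueError(f"Unsupported dialect: {dialect}")
    return f"{column} >= {RECENT_DAYS[dialect](days)}"


for name in RECENT_DAYS:
    print(f"{name}: {signup_filter('u.created_at', 30, name)}")
```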

Practical Code Examples

BigQuery: User Activity Analysis

When provided with a database schema and the request: "Generate a query to find users who signed up in the last 30 days and had at least 5 active sessions", the skill produces:

-- BigQuery version
SELECT u.id, u.email
FROM `myproject.dataset.users` AS u
JOIN `myproject.dataset.sessions` AS s
  ON u.id = s.user_id
WHERE u.created_at >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
  AND s.session_id IS NOT NULL
GROUP BY u.id, u.email
HAVING COUNT(s.session_id) >= 5;

PostgreSQL: Quarterly Aggregation

For the request: "Give me the average order value per customer for the last quarter, using PostgreSQL syntax", the output includes:

-- PostgreSQL version
SELECT c.customer_id,
       AVG(o.total_amount) AS avg_order_value
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
-- Rolling three-month window ending today, not the previous calendar quarter
WHERE o.order_date >= (CURRENT_DATE - INTERVAL '3 months')
GROUP BY c.customer_id
ORDER BY avg_order_value DESC;
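"Last quarter" is ambiguous: the query above filters on a rolling three-month window, but the phrase can also mean the previous calendar quarter (in PostgreSQL, bounds derived from `date_trunc('quarter', CURRENT_DATE)`). The sketch below computes those calendar bounds so the two readings can be compared; `previous_quarter_bounds` is a hypothetical helper, not part of the skill.

```python
from datetime import date


def previous_quarter_bounds(today: date) -> tuple[date, date]:
    """Return [start, end) of the calendar quarter before the one containing `today`."""
    # First month of the quarter containing `today`: 1, 4, 7, or 10.
    current_q_start_month = 3 * ((today.month - 1) // 3) + 1
    end = date(today.year, current_q_start_month, 1)  # exclusive upper bound
    if current_q_start_month == 1:
        # Previous quarter is Q4 of the prior year.
        start = date(today.year - 1, 10, 1)
    else:
        start = date(today.year, current_q_start_month - 3, 1)
    return start, end


start, end = previous_quarter_bounds(date(2026, 7, 1))
print(start, end)  # 2026-04-01 2026-07-01
```

This is the sort of detail the request-processing phase is meant to settle: stating which reading you want in the prompt removes the guesswork.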

MySQL: Inventory Management

The prompt: "Create a MySQL query that lists all products with inventory below 10 units" generates:

-- MySQL version
SELECT p.product_id, p.name, i.quantity
FROM products p
JOIN inventory i ON p.product_id = i.product_id
WHERE i.quantity < 10
ORDER BY i.quantity ASC;
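Because this query uses only portable SQL, it can be smoke-tested locally before it is run against a real MySQL server. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL; the table definitions and rows are invented test data, since the article does not show the underlying schema.

```python
import sqlite3

# The generated query, unchanged.
QUERY = """
SELECT p.product_id, p.name, i.quantity
FROM products p
JOIN inventory i ON p.product_id = i.product_id
WHERE i.quantity < 10
ORDER BY i.quantity ASC;
"""

conn = sqlite3.connect(":memory:")
# Assumed schema and sample rows, made up for this test.
conn.executescript("""
CREATE TABLE products (product_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE inventory (
    product_id INTEGER REFERENCES products(product_id),
    quantity INTEGER NOT NULL
);
INSERT INTO products VALUES (1, 'Widget'), (2, 'Gadget'), (3, 'Gizmo');
INSERT INTO inventory VALUES (1, 3), (2, 25), (3, 9);
""")
low_stock = conn.execute(QUERY).fetchall()
conn.close()

print(low_stock)  # [(1, 'Widget', 3), (3, 'Gizmo', 9)]
```

A check like this confirms the join and the threshold logic only; MySQL-specific behavior such as collation or type coercion still needs to be verified on MySQL itself.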

Key Source Files and Repository Structure

The implementation spans several files within the repository:

[pm-data-analytics/skills/sql-queries/SKILL.md](https://github.com/phuryn/pm-skills/blob/main/pm-data-analytics/skills/sql-queries/SKILL.md): Defines the four-phase workflow, dialect support matrix, and system prompts for the LLM
[pm-data-analytics/README.md](https://github.com/phuryn/pm-skills/blob/main/pm-data-analytics/README.md): Lists the sql-queries skill among other data-analytics utilities available in the package
[README.md](https://github.com/phuryn/pm-skills/blob/main/README.md): Provides top-level documentation mentioning BigQuery, PostgreSQL, and MySQL support
pm-data-analytics/commands/write-query.md: Demonstrates CLI invocation patterns for executing the skill

Summary

The sql-queries skill uses a four-phase architecture (Schema Understanding, Request Processing, Query Generation, and Explanation) to convert natural language into SQL
It supports BigQuery, PostgreSQL, MySQL, and Snowflake through dynamic LLM generation rather than static templates
Schema parsing occurs in lines 13-17 of SKILL.md, while query generation logic resides in lines 23-27
Each dialect receives optimized syntax including BigQuery PARTITION BY suggestions, PostgreSQL ::type casts, and MySQL LIMIT clauses
The skill is implemented in the phuryn/pm-skills repository and invoked via commands documented in write-query.md

Frequently Asked Questions

What database schemas does the sql-queries skill support?

The skill accepts SQL dump files, diagram descriptions, and technical documentation to extract table structures, column types, and relationships. As implemented in the "Read schema" step (lines 13-17 of SKILL.md), it parses these inputs to create an internal model that informs the LLM's query construction, ensuring generated SQL references actual tables and columns from your specific database.

How does the skill handle different SQL dialects?

Rather than using hard-coded templates, the skill relies on Claude to apply dialect-specific constraints dynamically. When you specify BigQuery, PostgreSQL, or MySQL in your request, the LLM applies the appropriate syntax rules (such as BigQuery's TIMESTAMP_SUB, PostgreSQL's INTERVAL syntax, or MySQL's LIMIT clauses) based on its training knowledge combined with the parsed schema context.

Can I use the sql-queries skill without providing a database schema?

While the skill can generate generic SQL without a schema, the output quality and accuracy improve significantly when you provide database documentation. The schema parsing phase (Phase 1) enables the LLM to validate table relationships, identify correct column names, and suggest appropriate indexes or partitioning strategies specific to your data structure.

Where is the sql-queries skill defined in the repository?

The core logic and workflow definition reside in [pm-data-analytics/skills/sql-queries/SKILL.md](https://github.com/phuryn/pm-skills/blob/main/pm-data-analytics/skills/sql-queries/SKILL.md), which outlines the four-phase generation process. Repository-level documentation appears in [pm-data-analytics/README.md](https://github.com/phuryn/pm-skills/blob/main/pm-data-analytics/README.md) and the root [README.md](https://github.com/phuryn/pm-skills/blob/main/README.md), while usage examples are located in pm-data-analytics/commands/write-query.md.
