How the sql-queries Skill Generates SQL from Natural Language in phuryn/pm-skills
The sql-queries skill transforms plain-English requests into optimized, dialect-specific SQL queries through a four-phase LLM-driven workflow that parses database schemas, processes natural language intent, and generates executable code for BigQuery, PostgreSQL, or MySQL.
The sql-queries skill in the phuryn/pm-skills repository is a declarative component designed to bridge the gap between natural language and structured database queries. Unlike traditional SQL generators that rely on rigid templates, this skill leverages Claude's function-calling capabilities to dynamically construct queries based on uploaded database schemas and explicit dialect requirements.
The Four-Phase SQL Generation Architecture
According to the skill definition in [pm-data-analytics/skills/sql-queries/SKILL.md](https://github.com/phuryn/pm-skills/blob/main/pm-data-analytics/skills/sql-queries/SKILL.md), the generation process follows a deterministic workflow split into four distinct phases.
Phase 1: Schema Understanding
The skill begins by parsing uploaded database documentation to build an internal structural model. When you provide a schema file—whether an SQL dump, diagram description, or technical documentation—the skill extracts table names, column data types, primary keys, foreign key relationships, and index information.
This parsing logic occurs in the "Read schema" step (lines 13-17 of SKILL.md), creating a structured context that the LLM references during query construction. Without this foundational step, the generator cannot accurately resolve table relationships or validate column existence.
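To make the idea of a structural model concrete, here is a minimal Python sketch. It is not how the skill works internally (the skill is an LLM instruction file, not a parser). It loads a hypothetical two-table schema into an in-memory SQLite database and reads the structure back. The table names, the `extract_schema_model` function, and the shape of the resulting dictionary are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical two-table schema, standing in for an uploaded SQL dump.
SCHEMA_SQL = """
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    email TEXT NOT NULL
);
CREATE TABLE sessions (
    session_id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id)
);
"""


def extract_schema_model(schema_sql: str) -> dict:
    """Load DDL into in-memory SQLite and read back tables, columns, and keys."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_sql)
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        model = {}
        for table in tables:
            # table_info rows: (cid, name, type, notnull, dflt_value, pk)
            columns = conn.execute(f"PRAGMA table_info('{table}')").fetchall()
            # foreign_key_list rows: (id, seq, table, from, to, ...)
            fks = conn.execute(f"PRAGMA foreign_key_list('{table}')").fetchall()
            model[table] = {
                "columns": {col[1]: col[2] for col in columns},
                "primary_key": [col[1] for col in columns if col[5] > 0],
                "foreign_keys": [
                    {"column": fk[3], "ref_table": fk[2], "ref_column": fk[4]}
                    for fk in fks
                ],
            }
        return model
    finally:
        conn.close()


schema_model = extract_schema_model(SCHEMA_SQL)
```

A dictionary like `schema_model` captures the facts the paragraph above lists (tables, column types, primary keys, foreign keys) in a form that can be checked before a query is written.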
Phase 2: Request Processing
Next, the skill analyzes your natural language prompt to identify specific data requirements including filters, aggregations, joins, and sorting preferences. The implementation explicitly requires users to state their target SQL dialect—BigQuery, PostgreSQL, MySQL, Snowflake, or others—during this phase.
The "Process Your Request" step (lines 18-22) drives this interaction, using Claude to dissect ambiguous language into structured query components. If critical details are missing—such as date ranges, specific columns, or join conditions—the skill will request clarification before proceeding to generation.
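The clarification behavior described above can be pictured as a check over a structured request. The Python sketch below is purely illustrative: `QueryRequest`, its fields, and `clarifying_questions` are hypothetical names, and the actual skill handles this conversationally through the LLM, not through code like this.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class QueryRequest:
    """Hypothetical container for the details a request needs to pin down."""
    goal: str
    dialect: Optional[str] = None
    tables: List[str] = field(default_factory=list)
    date_range: Optional[str] = None


def clarifying_questions(request: QueryRequest) -> List[str]:
    """Return one question per missing detail, in a fixed order."""
    questions = []
    if not request.dialect:
        questions.append("Which SQL dialect should the query target?")
    if not request.tables:
        questions.append("Which tables hold the data you need?")
    if request.date_range is None:
        questions.append("What date range should the query cover?")
    return questions


request = QueryRequest(
    goal="average order value per customer",
    tables=["customers", "orders"],
)
questions = clarifying_questions(request)
```

Here `questions` contains the dialect and date-range prompts, since those two fields were left unset.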
Phase 3: Optimized Query Generation
This is the core transformation step where the LLM synthesizes the schema model and processed intent into executable SQL. The "Generate Optimized Query" step (lines 23-27) produces dialect-specific syntax that respects the unique constraints and features of your chosen database.
For BigQuery, the skill uses SAFE_CAST, STRUCT, and ARRAY constructs where they apply and recommends PARTITION BY or CLUSTER BY for large tables. PostgreSQL generation combines standard SQL joins with PostgreSQL-specific ::type casts (an extension, not part of the ANSI standard) and suggests CREATE INDEX statements for performance optimization. MySQL output prefers LIMIT/OFFSET pagination, respects VARCHAR/TEXT nuances, and can suggest running EXPLAIN for query analysis.
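To show how much the same filter differs between dialects, the sketch below renders a "rows from the last N days" condition for each of the three. It is only an illustration of the syntax differences; the skill itself produces these through the LLM and not from a lookup table, and `recent_rows_filter` is a hypothetical helper.

```python
def recent_rows_filter(column: str, days: int, dialect: str) -> str:
    """Render a 'last N days' filter in the date syntax of the given dialect."""
    templates = {
        "bigquery": "{col} >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL {n} DAY)",
        "postgresql": "{col} >= NOW() - INTERVAL '{n} days'",
        "mysql": "{col} >= NOW() - INTERVAL {n} DAY",
    }
    key = dialect.lower()
    if key not in templates:
        raise ValueError(f"No example syntax for dialect: {dialect}")
    if not isinstance(days, int) or days < 0:
        raise ValueError("days must be a non-negative integer")
    return templates[key].format(col=column, n=days)


bigquery_filter = recent_rows_filter("u.created_at", 30, "BigQuery")
postgres_filter = recent_rows_filter("u.created_at", 30, "PostgreSQL")
mysql_filter = recent_rows_filter("u.created_at", 30, "MySQL")
```

The BigQuery form matches the TIMESTAMP_SUB pattern used in the user-activity example later in this article, while the PostgreSQL form quotes the whole interval and the MySQL form does not.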
Phase 4: Explanation and Validation
The final phase transforms raw SQL into actionable documentation. The "Explain and Test" stage (lines 29-33) returns human-readable explanations of query logic alongside the executable code. The skill suggests validation methods including sample SELECT statements for testing subsets and provides test data generation scripts when requested.
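One way to act on the testing advice is to run a generated query against a handful of hand-written rows before pointing it at production data. The sketch below does this with Python's built-in sqlite3 module. The sample tables, rows, and the `run_on_sample` helper are hypothetical, and SQLite only works as a stand-in when the query sticks to portable SQL.

```python
import sqlite3

# Hypothetical sample data: three products, two of them low on stock.
SAMPLE_DATA_SQL = """
CREATE TABLE products (product_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE inventory (
    product_id INTEGER REFERENCES products(product_id),
    quantity INTEGER NOT NULL
);
INSERT INTO products VALUES (1, 'Widget'), (2, 'Gadget'), (3, 'Gizmo');
INSERT INTO inventory VALUES (1, 3), (2, 25), (3, 9);
"""

# A portable query of the kind the skill might return.
LOW_STOCK_QUERY = """
SELECT p.product_id, p.name, i.quantity
FROM products p
JOIN inventory i ON p.product_id = i.product_id
WHERE i.quantity < 10
ORDER BY i.quantity ASC
"""


def run_on_sample(setup_sql: str, query: str) -> list:
    """Build a throwaway in-memory database and return the query's rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(setup_sql)
        return conn.execute(query).fetchall()
    finally:
        conn.close()


rows = run_on_sample(SAMPLE_DATA_SQL, LOW_STOCK_QUERY)
```

With the sample rows above, only Widget and Gizmo fall below the threshold, so a result containing Gadget would signal a wrong filter or join.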
Dialect-Specific Implementation Details
The sql-queries skill contains no hard-coded SQL templates. Instead, it relies on Claude's knowledge base combined with the schema extracted in Phase 1 to enforce dialect-specific rules dynamically.
- BigQuery: Emphasizes TIMESTAMP functions, INTERVAL syntax, and partitioned table optimizations
- PostgreSQL: Utilizes INTERVAL with explicit time units, ::type casting, and standard JOIN syntax
- MySQL: Prioritizes LIMIT clauses, specific TEXT type handling, and procedural hints
This architecture makes the skill immediately adaptable to additional dialects like Snowflake or SQL Server without requiring code modifications.
Practical Code Examples
BigQuery: User Activity Analysis
When provided with a database schema and the request: "Generate a query to find users who signed up in the last 30 days and had at least 5 active sessions", the skill produces:
-- BigQuery version
SELECT u.id, u.email
FROM `myproject.dataset.users` AS u
JOIN `myproject.dataset.sessions` AS s
ON u.id = s.user_id
WHERE u.created_at >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
AND s.session_id IS NOT NULL
GROUP BY u.id, u.email
HAVING COUNT(s.session_id) >= 5;
PostgreSQL: Quarterly Aggregation
For the request: "Give me the average order value per customer for the last quarter, using PostgreSQL syntax", the output includes:
-- PostgreSQL version
SELECT c.customer_id,
AVG(o.total_amount) AS avg_order_value
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
WHERE o.order_date >= (CURRENT_DATE - INTERVAL '3 months')
GROUP BY c.customer_id
ORDER BY avg_order_value DESC;
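Note that the query above filters on a rolling three-month window, which is one reading of "last quarter". If the previous calendar quarter is meant instead, the date boundaries differ. The Python sketch below shows that boundary arithmetic. `previous_quarter_bounds` is a hypothetical helper, not part of the skill.

```python
from datetime import date
from typing import Tuple


def previous_quarter_bounds(today: date) -> Tuple[date, date]:
    """Return (start, end) of the calendar quarter before today's, end exclusive."""
    # First month of the quarter that contains `today`: 1, 4, 7, or 10.
    current_start_month = 3 * ((today.month - 1) // 3) + 1
    end = date(today.year, current_start_month, 1)
    if current_start_month == 1:
        # The previous quarter is Q4 of the prior year.
        start = date(today.year - 1, 10, 1)
    else:
        start = date(today.year, current_start_month - 3, 1)
    return start, end


start, end = previous_quarter_bounds(date(2024, 5, 15))
```

For 15 May 2024 this yields 1 January 2024 up to (but not including) 1 April 2024, whereas the rolling window in the query would start in mid-February. Stating which interpretation you want in the request avoids the ambiguity.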
MySQL: Inventory Management
The prompt: "Create a MySQL query that lists all products with inventory below 10 units" generates:
-- MySQL version
SELECT p.product_id, p.name, i.quantity
FROM products p
JOIN inventory i ON p.product_id = i.product_id
WHERE i.quantity < 10
ORDER BY i.quantity ASC;
Key Source Files and Repository Structure
The implementation spans several files within the repository:
- [pm-data-analytics/skills/sql-queries/SKILL.md](https://github.com/phuryn/pm-skills/blob/main/pm-data-analytics/skills/sql-queries/SKILL.md): Defines the four-phase workflow, dialect support matrix, and system prompts for the LLM
- [pm-data-analytics/README.md](https://github.com/phuryn/pm-skills/blob/main/pm-data-analytics/README.md): Lists the sql-queries skill among other data-analytics utilities available in the package
- [README.md](https://github.com/phuryn/pm-skills/blob/main/README.md): Provides top-level documentation mentioning BigQuery, PostgreSQL, and MySQL support
- pm-data-analytics/commands/write-query.md: Demonstrates CLI invocation patterns for executing the skill
Summary
- The sql-queries skill uses a four-phase architecture (Schema Understanding, Request Processing, Query Generation, and Explanation) to convert natural language into SQL
- It supports BigQuery, PostgreSQL, MySQL, and Snowflake through dynamic LLM generation rather than static templates
- Schema parsing occurs in lines 13-17 of SKILL.md, while query generation logic resides in lines 23-27
- Each dialect receives optimized syntax including BigQuery PARTITION BY suggestions, PostgreSQL ::type casts, and MySQL LIMIT clauses
- The skill is implemented in the phuryn/pm-skills repository and invoked via commands documented in write-query.md
Frequently Asked Questions
What database schemas does the sql-queries skill support?
The skill accepts SQL dump files, diagram descriptions, and technical documentation to extract table structures, column types, and relationships. As implemented in the "Read schema" step (lines 13-17 of SKILL.md), it parses these inputs to create an internal model that informs the LLM's query construction, ensuring generated SQL references actual tables and columns from your specific database.
How does the skill handle different SQL dialects?
Rather than using hard-coded templates, the skill leverages Claude's function-calling capabilities to enforce dialect-specific constraints dynamically. When you specify BigQuery, PostgreSQL, or MySQL in your request, the LLM applies the appropriate syntax rules—such as BigQuery's TIMESTAMP_SUB, PostgreSQL's INTERVAL syntax, or MySQL's LIMIT clauses—based on its training knowledge combined with the parsed schema context.
Can I use the sql-queries skill without providing a database schema?
While the skill can generate generic SQL without a schema, the output quality and accuracy improve significantly when you provide database documentation. The schema parsing phase (Phase 1) enables the LLM to validate table relationships, identify correct column names, and suggest appropriate indexes or partitioning strategies specific to your data structure.
Where is the sql-queries skill defined in the repository?
The core logic and workflow definition reside in [pm-data-analytics/skills/sql-queries/SKILL.md](https://github.com/phuryn/pm-skills/blob/main/pm-data-analytics/skills/sql-queries/SKILL.md), which outlines the four-phase generation process. Repository-level documentation appears in [pm-data-analytics/README.md](https://github.com/phuryn/pm-skills/blob/main/pm-data-analytics/README.md) and the root [README.md](https://github.com/phuryn/pm-skills/blob/main/README.md), while usage examples are located in pm-data-analytics/commands/write-query.md.