# How the SQL Lab Query Execution Pipeline Works in Apache Superset: Architecture and Custom Executor Guide

> Understand the Apache Superset SQL Lab query execution pipeline's command executor pattern. Learn how to extend it with custom executors for enhanced functionality.

- Repository: [The Apache Software Foundation/superset](https://github.com/apache/superset)
- Tags: architecture
- Published: 2026-03-03

---

**The SQL Lab query execution pipeline uses a command-executor pattern where `ExecuteSqlCommand` orchestrates query validation, Jinja rendering, and delegates to either a synchronous or asynchronous `SqlJsonExecutor` implementation based on the `runAsync` flag.**

The Apache Superset SQL Lab query execution pipeline is designed around a clean separation of concerns, so custom logic can be injected by changing one factory method rather than the command itself. Whether you need to audit queries, route them through a sandbox, or integrate a distributed processing engine, understanding this architecture is essential for extending the `apache/superset` codebase.

## Architecture of the SQL Lab Query Execution Pipeline

The pipeline follows a **command pattern** that decouples HTTP handling from business logic and database interaction. Each query passes through context construction, command execution, and result conversion before returning to the frontend.

### Entry Point: REST API and Execution Context

The journey begins at `SqlLabRestApi.execute_sql_query` in [`superset/sqllab/api.py`](https://github.com/apache/superset/blob/main/superset/sqllab/api.py). This POST `/api/v1/sqllab/execute/` endpoint validates the incoming JSON payload and constructs a `SqlJsonExecutionContext` object.

The context object, defined in [`superset/sqllab/sqllab_execution_context.py`](https://github.com/apache/superset/blob/main/superset/sqllab/sqllab_execution_context.py), encapsulates all request metadata: the target database connection, user identity, row limits, CTAS (Create Table As Select) settings, and query parameters. This context travels through the entire pipeline, ensuring all downstream components have access to the original request state.
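To make the role of the context concrete, here is a minimal sketch of a request-scoped object built from the JSON payload. `ExecutionContextSketch` and its fields are hypothetical stand-ins, not Superset's actual `SqlJsonExecutionContext`; only the payload keys (`database_id`, `sql`, `schema`, `runAsync`) and the `is_run_asynchronous` method name come from this article.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class ExecutionContextSketch:
    """Hypothetical, trimmed-down stand-in for SqlJsonExecutionContext."""

    database_id: int
    sql: str
    schema: Optional[str] = None
    run_async: bool = False

    @classmethod
    def from_payload(cls, payload: Mapping[str, Any]) -> "ExecutionContextSketch":
        # Keys match the JSON body posted to /api/v1/sqllab/execute/.
        return cls(
            database_id=int(payload["database_id"]),
            sql=str(payload["sql"]),
            schema=payload.get("schema"),
            run_async=bool(payload.get("runAsync", False)),
        )

    def is_run_asynchronous(self) -> bool:
        # Same method name the executor factory calls on the real context.
        return self.run_async


ctx = ExecutionContextSketch.from_payload(
    {"database_id": 5, "sql": "SELECT 1", "runAsync": False, "schema": "public"}
)
```

Freezing the dataclass is a design choice of this sketch: it mirrors the idea that downstream components read the original request state rather than rewrite it.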

```python
# superset/sqllab/api.py (lines 495-505, 669-690)
# Simplified excerpt: error handling and response mapping are omitted.
def execute_sql_query(self) -> Response:
    # Build the context from the validated JSON payload.
    execution_context = SqlJsonExecutionContext(request.json)
    command = ExecuteSqlCommand(
        execution_context=execution_context,
        # ... other injected dependencies
    )
    return command.run()
```

### The Command Pattern: ExecuteSqlCommand

The `ExecuteSqlCommand` class in [`superset/commands/sql_lab/execute.py`](https://github.com/apache/superset/blob/main/superset/commands/sql_lab/execute.py) serves as the central orchestrator. Its `run` method coordinates the entire lifecycle:

1. **Query deduplication** – Checks for existing `Query` records via `_try_get_existing_query`.
2. **Database resolution** – Validates the database connection via `_get_the_query_db`.
3. **Persistence** – Saves the new query record via `_save_new_query`.
4. **Access control** – Validates RBAC permissions via `_validate_access`.
5. **Template rendering** – Processes Jinja syntax via `_sql_query_render.render`.
6. **Limit injection** – Applies row limits via `_set_query_limit_if_required`.
7. **Execution** – Delegates to the selected `SqlJsonExecutor`.

According to the source code in [`execute.py`](https://github.com/apache/superset/blob/main/superset/commands/sql_lab/execute.py) (lines 94-108), the command returns a dictionary containing a `status` key (e.g., `HAS_RESULTS` or `QUERY_IS_RUNNING`) and a `payload` containing the actual data or job tracking information.
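The ordering of those steps can be sketched as a toy orchestrator. This is not Superset's `ExecuteSqlCommand`: the class, its stubbed steps, and the naive placeholder substitution that stands in for Jinja are all assumptions made for illustration, and the deduplication, database resolution, and persistence steps are left out for brevity.

```python
from enum import Enum
from typing import Any, Callable, Dict, List


class Status(Enum):
    """Stand-in for SqlJsonExecutionStatus with the two values named above."""

    HAS_RESULTS = "has_results"
    QUERY_IS_RUNNING = "query_is_running"


class ExecuteSqlSketch:
    """Toy orchestrator: each step is a stub that records that it ran."""

    def __init__(self, executor: Callable[[str], Status], row_limit: int) -> None:
        self._executor = executor
        self._row_limit = row_limit
        self.trace: List[str] = []

    def _validate_access(self) -> None:
        self.trace.append("validate_access")  # the real step checks permissions

    def _render(self, sql: str, params: Dict[str, str]) -> str:
        self.trace.append("render")
        # Naive placeholder substitution standing in for Jinja rendering.
        for key, value in params.items():
            sql = sql.replace("{{ " + key + " }}", value)
        return sql

    def _set_limit(self) -> int:
        self.trace.append("limit")
        return self._row_limit

    def run(self, sql: str, params: Dict[str, str]) -> Dict[str, Any]:
        self._validate_access()
        rendered = self._render(sql, params)
        limit = self._set_limit()
        self.trace.append("execute")
        status = self._executor(rendered)  # delegate to the injected executor
        return {"status": status, "payload": {"sql": rendered, "limit": limit}}


command = ExecuteSqlSketch(executor=lambda sql: Status.HAS_RESULTS, row_limit=100)
result = command.run("SELECT * FROM {{ table }}", {"table": "logs"})
```

Because the executor is passed in rather than constructed inside `run`, the orchestration order stays fixed while the execution strategy can vary, which is the property the rest of this guide relies on.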

### Executor Selection: Synchronous vs Asynchronous

The pipeline supports two execution modes selected in `SqlLabRestApi._create_sql_json_executor` (lines 691-702):

- **Synchronous execution** (`SynchronousSqlJsonExecutor`): Runs the query in-process with a configurable timeout. Best for lightweight queries that return quickly.
- **Asynchronous execution** (`ASynchronousSqlJsonExecutor`): Offloads the work to a Celery worker and returns immediately with a 202 Accepted status. Selected when `runAsync` is true or the feature flag `SQLLAB_FORCE_RUN_ASYNC` is enabled.

Both implementations inherit from `SqlJsonExecutorBase` in [`superset/sqllab/sql_json_executer.py`](https://github.com/apache/superset/blob/main/superset/sqllab/sql_json_executer.py) (lines 61-68). They share common error handling logic that translates raw database driver exceptions into Superset's `SupersetError` hierarchy.
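As a rough illustration of the two modes, the sketch below uses plain Python stand-ins: a list plays the role of the Celery broker, `QueryFailedError` is a hypothetical placeholder for Superset's error exceptions, and status values are simple strings. Error translation is shown only on the synchronous path to keep the example short.

```python
from abc import ABC, abstractmethod
from typing import Callable, List


class QueryFailedError(Exception):
    """Hypothetical stand-in for Superset's error exceptions."""


class ExecutorSketch(ABC):
    @abstractmethod
    def execute(self, rendered_query: str) -> str:
        """Return a status string such as HAS_RESULTS or QUERY_IS_RUNNING."""


class SyncExecutorSketch(ExecutorSketch):
    def __init__(self, run_query: Callable[[str], list]) -> None:
        self._run_query = run_query
        self.rows: list = []

    def execute(self, rendered_query: str) -> str:
        try:
            self.rows = self._run_query(rendered_query)
        except Exception as ex:
            # Translate any driver failure into one application-level error.
            raise QueryFailedError(str(ex)) from ex
        return "HAS_RESULTS"


class AsyncExecutorSketch(ExecutorSketch):
    def __init__(self, queue: List[str]) -> None:
        self._queue = queue  # a plain list standing in for a Celery broker

    def execute(self, rendered_query: str) -> str:
        self._queue.append(rendered_query)  # hand off and return immediately
        return "QUERY_IS_RUNNING"


def select_executor(
    run_async: bool,
    force_async: bool,
    run_query: Callable[[str], list],
    queue: List[str],
) -> ExecutorSketch:
    # Asynchronous wins if either the request or the global flag asks for it.
    if run_async or force_async:
        return AsyncExecutorSketch(queue)
    return SyncExecutorSketch(run_query)
```

Calling `select_executor(False, False, run_query, queue).execute("SELECT 1")` runs the query in-process, while setting either flag only enqueues it and reports that the query is running.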

### Result Conversion and Response Handling

After execution completes, the `ExecutionContextConvertor` class (defined in [`superset/sqllab/execution_context_convertor.py`](https://github.com/apache/superset/blob/main/superset/sqllab/execution_context_convertor.py)) transforms raw database results into the JSON structure expected by the SQL Lab frontend. This includes applying the `DISPLAY_MAX_ROW` limit to prevent massive payloads from reaching the browser.

The API layer then maps the command's return value to HTTP semantics: **200 OK** for completed queries and **202 Accepted** for asynchronous jobs that are still processing.
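The mapping from execution status to HTTP response, together with the display limit, might look like the following sketch. The function name, the string statuses, and the `truncated` key are assumptions for illustration and do not reproduce the payload shape Superset actually returns.

```python
from typing import Any, Dict, List, Tuple


def to_http_response(
    status: str, rows: List[list], display_max_row: int
) -> Tuple[int, Dict[str, Any]]:
    """Return (HTTP status code, JSON-serializable body) for a command result."""
    if status == "QUERY_IS_RUNNING":
        # Asynchronous job accepted; the client polls for results later.
        return 202, {"status": "running"}
    # Completed query: cap the rows sent to the browser.
    return 200, {
        "data": rows[:display_max_row],
        "truncated": len(rows) > display_max_row,
    }
```

For example, three result rows with a display limit of two yield a 200 response carrying the first two rows and `truncated` set to `True`.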

## Extending the Pipeline with Custom Executors

Because Superset uses dependency injection for its command and executor objects, you can introduce custom execution logic by implementing the `SqlJsonExecutorBase` interface and registering it in the factory method.

### Implementing a Custom Executor

Create a subclass of `SqlJsonExecutorBase` and implement the `execute` method. The contract requires accepting `(execution_context, rendered_query, log_params)` and returning a `SqlJsonExecutionStatus`. Raise `SupersetErrorException` or `SupersetErrorsException` for known failure conditions.

Here is a minimal example that writes each rendered query to an audit log through Python's `logging` module and then returns dummy data instead of querying the database:

```python
# superset/sqllab/custom_executor.py
import logging

from superset.sqllab.command_status import SqlJsonExecutionStatus
from superset.sqllab.sql_json_executer import SqlJsonExecutorBase

logger = logging.getLogger(__name__)


class LoggingExecutor(SqlJsonExecutorBase):
    """Logs each rendered query for auditing, then returns static results."""

    def execute(self, execution_context, rendered_query, log_params):
        logger.info("Audit log: %s", rendered_query)

        # Simulate a successful execution with placeholder data.
        fake_result = {
            "status": "success",
            "data": {"columns": ["audit_col"], "rows": [[1]]},
            "query_id": execution_context.query.id,
        }
        execution_context.set_execution_result(fake_result)
        return SqlJsonExecutionStatus.HAS_RESULTS
```

### Registering Your Executor in the API Layer

To activate your executor, modify `SqlLabRestApi._create_sql_json_executor` in [`superset/sqllab/api.py`](https://github.com/apache/superset/blob/main/superset/sqllab/api.py). Replace or extend the conditional logic to instantiate your class when specific criteria are met:

```python
# In superset/sqllab/api.py
from superset.sqllab.custom_executor import LoggingExecutor

@staticmethod
def _create_sql_json_executor(
    execution_context: SqlJsonExecutionContext,
    query_dao: QueryDAO,
) -> SqlJsonExecutor:
    # Custom logic: check for a database-specific flag
    if getattr(execution_context.database, "use_logging_executor", False):
        return LoggingExecutor(query_dao, get_sql_results)

    # Standard fallback logic
    if execution_context.is_run_asynchronous():
        return ASynchronousSqlJsonExecutor(query_dao, get_sql_results)

    return SynchronousSqlJsonExecutor(
        query_dao,
        get_sql_results,
        app.config.get("SQLLAB_TIMEOUT"),
        is_feature_enabled("SQLLAB_BACKEND_PERSISTENCE"),
    )
```

### Triggering Custom Execution via Database Flags

You can control executor selection per database by storing configuration in the `Database.extra_json` field or a custom column. Access this metadata through `execution_context.database` in the factory method above. Consider a query submitted with the standard JSON payload:

```json
{
  "database_id": 5,
  "sql": "SELECT * FROM large_table",
  "runAsync": false,
  "schema": "public"
}
```

If database 5 has the `use_logging_executor` flag set to `True` (a custom attribute introduced for this example, not a built-in Superset setting), the factory method routes the query through your `LoggingExecutor` instead of the default synchronous or asynchronous implementations.
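If the flag is kept in a JSON text field rather than a dedicated column, the lookup can be isolated in a small helper. The sketch below assumes the configuration arrives as a JSON string; `wants_logging_executor` is a hypothetical helper name, and `use_logging_executor` is the example flag from this guide.

```python
import json
from typing import Optional


def wants_logging_executor(extra: Optional[str]) -> bool:
    """Return True only if the JSON config explicitly enables the flag."""
    if not extra:
        return False
    try:
        parsed = json.loads(extra)
    except json.JSONDecodeError:
        # Malformed configuration falls back to the default executors.
        return False
    return isinstance(parsed, dict) and parsed.get("use_logging_executor") is True
```

Treating missing or malformed configuration as "not enabled" keeps the default executors as the safe fallback, so a bad value on one database cannot break query execution for it.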

## Summary

- The **SQL Lab query execution pipeline** in Apache Superset follows a command-executor pattern that cleanly separates HTTP handling, business logic, and database interaction.
- **`ExecuteSqlCommand`** orchestrates the flow, handling validation, Jinja rendering, and limit injection before delegating to an executor.
- **Executor selection** happens in `SqlLabRestApi._create_sql_json_executor`, choosing between `SynchronousSqlJsonExecutor` and `ASynchronousSqlJsonExecutor` based on the `runAsync` flag.
- **Custom executors** must inherit from `SqlJsonExecutorBase`, implement the `execute` method, and be registered in the API factory method to override default behavior.
- The architecture supports per-database routing by checking attributes on `execution_context.database`, enabling targeted extensions without global changes.

## Frequently Asked Questions

### What is the difference between synchronous and asynchronous execution in SQL Lab?

**Synchronous execution** runs queries within the web server process using `SynchronousSqlJsonExecutor`, subject to the `SQLLAB_TIMEOUT` configuration. **Asynchronous execution** uses `ASynchronousSqlJsonExecutor` to dispatch work to a Celery worker via a message broker, returning a 202 status immediately while the query runs in the background. Asynchronous mode is required for long-running queries that might exceed HTTP timeout limits or when the `SQLLAB_FORCE_RUN_ASYNC` feature flag is enabled.

### How does SQL Lab handle Jinja templating before query execution?

Before the executor sends SQL to the database, `ExecuteSqlCommand` calls `_sql_query_render.render` (implemented in [`superset/sqllab/query_render.py`](https://github.com/apache/superset/blob/main/superset/sqllab/query_render.py)) to process Jinja2 syntax. This allows users to reference variables, macros, and cached data within their queries. The rendered query string is then passed to the executor's `execute` method as the `rendered_query` parameter.

### Can I use a custom executor for specific databases only?

Yes. Since `SqlJsonExecutionContext` exposes the `database` object, you can inspect `execution_context.database.extra_json` or custom columns in `_create_sql_json_executor`. Return your custom executor subclass only when specific database flags are present, otherwise fall back to the standard synchronous or asynchronous implementations. This allows you to route, for example, all queries to a specific data warehouse through a custom caching or sandbox layer.

### Where are query results stored during asynchronous execution?

When using `ASynchronousSqlJsonExecutor`, the Celery worker executes the query and stores results temporarily in the Superset results backend (configured via `RESULTS_BACKEND` in your configuration file). The web server returns a query ID to the frontend, which polls for completion. Once finished, the `ExecutionContextConvertor` retrieves the results from the backend, applies the `DISPLAY_MAX_ROW` limit, and streams them to the browser.
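The store-then-poll flow can be sketched with an in-memory dictionary in place of a real results backend. Every name here (`InMemoryResultsBackend`, `worker_store`, `poll_results`, the key format) is hypothetical; the sketch only illustrates the handoff between a worker and the web server, not Superset's storage format or API.

```python
from typing import Any, Dict, List, Optional


class InMemoryResultsBackend:
    """Dictionary-backed stand-in for a cache-style results backend."""

    def __init__(self) -> None:
        self._store: Dict[str, Any] = {}

    def set(self, key: str, value: Any) -> None:
        self._store[key] = value

    def get(self, key: str) -> Optional[Any]:
        return self._store.get(key)


def worker_store(backend: InMemoryResultsBackend, key: str, rows: List[list]) -> None:
    # Runs on the worker side once the query has finished.
    backend.set(key, rows)


def poll_results(
    backend: InMemoryResultsBackend, key: str, display_max_row: int
) -> Optional[List[list]]:
    # Runs on the web server side; None means "not ready yet, poll again".
    rows = backend.get(key)
    if rows is None:
        return None
    return rows[:display_max_row]


backend = InMemoryResultsBackend()
before = poll_results(backend, "query-42", display_max_row=2)
worker_store(backend, "query-42", [[1], [2], [3]])
after = poll_results(backend, "query-42", display_max_row=2)
```

Here `before` is `None` because nothing has been stored yet, and `after` holds the first two rows once the worker has written its result.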