How n8n Handles Scaling and Queue-Based Processing with Bull: Architecture Deep Dive

Question

Discover how n8n uses Bull and Redis for robust queue-based processing and horizontal scaling. Learn about automatic recovery and priority handling for efficient workflow execution.

Accepted Answer

n8n uses Bull queues backed by Redis for horizontal scaling: a main process enqueues workflow executions and worker processes consume them, with automatic recovery, priority handling, and real-time progress reporting via Redis pub-sub. The n8n-io/n8n repository implements a queue-mode architecture that separates workflow orchestration from execution. Understanding this architecture is essential for deploying production-grade automation infrastructure that handles high-throughput workloads across multiple worker instances while maintaining data consistency and fault tolerance.

Configuration: ScalingModeConfig and Redis Settings

All queue-related behavior in n8n is driven by the ScalingModeConfig. This configuration class bundles Bull-specific options read from environment variables that share a common prefix. The nested BullConfig class defines critical parameters, including Redis connection details, key prefixes, and stalled-job detection intervals.

This configuration abstraction lets n8n adapt to different Redis deployment topologies (single-node, cluster, or TLS-secured instances) while standardizing queue behavior across the application.

Scaling Service: Bull Queue Management

The ScalingService acts as the central coordinator for queue operations. As a dependency-injected singleton, it creates the Bull queue instance, registers global event listeners, and manages periodic maintenance tasks.

Queue Initialization

During setup, the service dynamically imports Bull and initializes the queue with validated Redis configuration.

Event Listeners and Process Communication

The service registers distinct listener sets depending on the instance type (main/webhook vs. worker).
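
To make the idea of per-instance listener sets concrete, here is a minimal sketch. It is not n8n's code: FakeQueue is a hypothetical in-memory stand-in for a Bull queue (so the example runs without Redis), registerListeners is an illustrative helper, and the message kinds are placeholders, since the real names are not given here. The only detail taken from Bull itself is the global:progress event, whose handlers receive a job ID and a progress payload.

```typescript
// Sketch only: an in-memory stand-in for a Bull queue, so this runs without Redis.
type InstanceType = 'main' | 'webhook' | 'worker';
type ProgressMessage = { kind: string };
type ProgressHandler = (jobId: string, message: ProgressMessage) => void;

class FakeQueue {
  private handlers = new Map<string, ProgressHandler[]>();

  // Mirrors the shape of queue.on('global:progress', handler) in Bull.
  on(event: string, handler: ProgressHandler): void {
    const list = this.handlers.get(event) ?? [];
    list.push(handler);
    this.handlers.set(event, list);
  }

  // Test helper: deliver a progress message to every registered handler.
  emitProgress(jobId: string, message: ProgressMessage): void {
    for (const handler of this.handlers.get('global:progress') ?? []) {
      handler(jobId, message);
    }
  }
}

// Hypothetical message kinds, invented for illustration.
const WORKER_KINDS = new Set(['example-abort']);
const MAIN_KINDS = new Set(['example-chunk', 'example-response', 'example-finished']);

// Registers a listener that only reacts to the kinds relevant to this instance type.
function registerListeners(queue: FakeQueue, instanceType: InstanceType, log: string[]): void {
  const accepted = instanceType === 'worker' ? WORKER_KINDS : MAIN_KINDS;
  queue.on('global:progress', (jobId, message) => {
    if (accepted.has(message.kind)) {
      log.push(`${instanceType} handled ${message.kind} for job ${jobId}`);
    }
  });
}

const demoQueue = new FakeQueue();
const demoLog: string[] = [];
registerListeners(demoQueue, 'main', demoLog);
registerListeners(demoQueue, 'worker', demoLog);
demoQueue.emitProgress('42', { kind: 'example-abort' });
console.log(demoLog);
```

In this sketch both instance types subscribe to the same event and filter by message kind, so the main listener ignores the abort message and only the worker listener records it.
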
These listeners handle global:progress events for cross-process communication:

- Worker listeners respond to messages that ask them to gracefully stop running workflows
- Main listeners handle messages for UI streaming, HTTP responses, completion tracking, and AI tool results

This pub-sub mechanism allows workers to report execution progress back to the main process without direct network calls.

Job Processor: Executing Workflows on Workers

The JobProcessor contains the core execution logic that runs on worker instances. A worker registers a Bull processor that delegates to the JobProcessor.

Execution Lifecycle

The processing method orchestrates the following steps:

1. Load execution data from the database
2. Instantiate the Workflow class with static data, connections, and node types
3. Attach lifecycle hooks that funnel data through progress messages
4. Execute the workflow via one of two execution paths
5. Report completion via progress messages containing execution results

MCP (AI Tool) Integration

For executions triggered by Model Context Protocol (MCP) tool calls, the processor sends specialized messages. This enables n8n to integrate with LangChain and other AI frameworks while maintaining the queue-based execution model.

Queue Recovery and Metrics

The ScalingService implements automatic recovery to handle orphaned executions when worker processes crash or network partitions occur.

Recovery Mechanism

Running exclusively on the leader main instance, the recovery process:

1. Queries the database for in-progress execution IDs
2. Fetches active and waiting jobs from Bull
3. Marks any execution present in the database but missing from the queue as crashed

Metrics Collection

The service emits queue metrics at configurable intervals, exposing active/waiting job counts and completion statistics for Prometheus monitoring.
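
The comparison at the heart of the recovery mechanism is a set difference, which can be sketched as follows. This is an illustration under assumptions, not n8n's implementation: the function name findDanglingExecutionIds and the QueuedJob shape (each job carrying an executionId in its data) are hypothetical, and the database and Bull lookups are replaced by plain arrays.

```typescript
// Hypothetical names throughout; a sketch of the comparison step only.
type ExecutionId = string;

interface QueuedJob {
  // Assumed shape: each queued job carries the ID of the execution it runs.
  data: { executionId: ExecutionId };
}

// Returns IDs the database considers in progress but that no active or
// waiting job references. These are the candidates to mark as crashed.
function findDanglingExecutionIds(
  inProgressIds: ExecutionId[],
  queuedJobs: QueuedJob[],
): ExecutionId[] {
  const queuedIds = new Set(queuedJobs.map((job) => job.data.executionId));
  return inProgressIds.filter((id) => !queuedIds.has(id));
}

// Example: the database lists executions 1 to 3 as in progress,
// but only 1 and 3 still have an active or waiting job.
const dangling = findDanglingExecutionIds(
  ['1', '2', '3'],
  [{ data: { executionId: '1' } }, { data: { executionId: '3' } }],
);
console.log(dangling); // prints [ '2' ]
```

Building a Set from the queued job IDs keeps the check linear in the number of executions and jobs, which matters if the step runs periodically against a large queue.
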
Practical Implementation Examples

Enqueuing a Workflow Execution

Main processes enqueue jobs via the ScalingService.

Graceful Worker Shutdown

Workers should handle SIGTERM to finish active jobs before exiting.

Environment Configuration

Configure Bull behavior through environment variables.

Summary

- ScalingModeConfig centralizes Bull and Redis configuration, enabling flexible deployments across different infrastructure topologies.
- ScalingService manages the Bull queue lifecycle, registers cross-process event listeners, and schedules recovery and metrics tasks.
- JobProcessor executes workflows on worker instances, reporting progress via Bull's Redis pub-sub mechanism to the main process.
- Automatic recovery identifies and marks crashed executions by comparing database state against active queue jobs.
- MCP integration extends the queue system to support AI tool calls while maintaining separation between main and worker processes.

Frequently Asked Questions

How does n8n handle stalled jobs in Bull queues?

n8n passes stalled-job settings to Bull and uses an environment variable (defaulting to 30 seconds) to control how frequently Bull checks for stalled jobs. The recovery mechanism in the ScalingService additionally detects executions that have disappeared from the queue entirely and marks them as crashed in the database.

Can workers report real-time progress to the n8n UI during queue-based execution?

Yes. Workers use Bull's method to send messages that
