Performance Implications of Running Langflow with Gunicorn versus Uvicorn Workers
On Linux and macOS, Langflow runs under Gunicorn, which provides multi-process parallelism and production-grade process management at the cost of higher memory use. On Windows, it falls back to pure Uvicorn, a lighter single-process model that uses less memory but is limited to one CPU core.
The langflow-ai/langflow repository implements platform-specific server architectures that directly impact scalability and resource consumption. Understanding the performance implications of running Langflow with Gunicorn versus Uvicorn workers is essential for optimizing deployments, as the framework automatically selects Gunicorn process management on Unix-like systems while defaulting to pure Uvicorn on Windows.
Platform-Specific Architecture
Langflow's entry point in src/backend/base/langflow/__main__.py detects the operating system and selects the appropriate server implementation at runtime.
Linux and macOS: Gunicorn Process Management
On non-Windows platforms, Langflow instantiates a LangflowApplication (a Gunicorn BaseApplication subclass) and launches it in a separate process. This architecture enables true parallelism across multiple CPU cores by forking separate worker processes.
The startup logic at lines 93-100 creates the server:
# Use Gunicorn with LangflowUvicornWorker for non-Windows systems
from langflow.server import LangflowApplication
...
server = LangflowApplication(app, options)
process_manager.webapp_process = Process(target=server.run)
process_manager.webapp_process.start()
Windows: Pure Uvicorn Single-Process
On Windows, the startup path calls uvicorn.run(...) directly, executing the FastAPI application in a single process with a single async event loop. This sidesteps the absence of fork() on Windows but limits CPU utilization to a single core.
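The platform split described in this section can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Langflow's actual code: `select_server_mode` is a hypothetical helper, and the returned labels are invented for the example.

```python
import platform


def select_server_mode(system_name=None):
    """Return which server path a Langflow-style launcher would take.

    Hypothetical helper for illustration only; not part of Langflow.
    """
    # Default to the OS reported by the running interpreter.
    name = system_name if system_name is not None else platform.system()
    if name == "Windows":
        # No fork(): run the ASGI app directly under Uvicorn in one process.
        return "uvicorn"
    # Linux, macOS ("Darwin"), and other Unix-likes: Gunicorn manages workers.
    return "gunicorn"


print(select_server_mode("Windows"))
print(select_server_mode("Linux"))
```

Passing the system name as a parameter keeps the decision testable on any host, whereas a check hard-wired to the current interpreter can only be exercised on the platform it runs on.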
Performance Comparison: Gunicorn vs Uvicorn in Langflow
The choice between worker implementations affects five critical performance characteristics:
| Aspect | Gunicorn + Uvicorn Workers | Pure Uvicorn |
|---|---|---|
| Concurrency Model | Multiple processes, each running an async Uvicorn loop; utilizes separate CPU cores for true parallelism | Single process with one event loop; concurrency limited to one CPU core |
| Scalability | Linear throughput scaling via the workers option (see get_number_of_workers); optimal for multi-core machines | Requires external orchestration or experimental --workers flag to scale beyond one core |
| Memory Usage | Higher RAM consumption (each worker duplicates the Python interpreter, imported modules, and in-memory caches) | Lower footprint (single interpreter instance shared across all connections) |
| Startup Latency | Slower cold start due to process forking and multiple application loads | Faster initialization (single process startup) |
| Production Resilience | Native graceful reloads, worker timeouts, pre-fork hooks, and robust signal handling | Limited to development-mode reloading; production deployments require external process managers |
Implementation Details
The LangflowUvicornWorker Class
Defined in src/backend/base/langflow/server.py, the custom worker inherits from uvicorn.workers.UvicornWorker and configures the asyncio event loop while adding production-specific logging integration:
class LangflowUvicornWorker(UvicornWorker):
    CONFIG_KWARGS = {"loop": "asyncio"}
    ...
This implementation adds a custom SIGINT handler to suppress noisy shutdown logs and routes Gunicorn access logs through Loguru via the custom Logger class, ensuring consistent log formatting across the application.
Worker Configuration
The get_number_of_workers function determines the default worker count based on available CPU cores, allowing the Gunicorn deployment to automatically scale with the host machine's hardware capabilities.
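As a rough sketch of how a CPU-based default might be derived, the snippet below uses a hypothetical `default_worker_count` helper. It is not the body of get_number_of_workers, whose exact formula is not shown here; the one-worker-per-core rule is an assumption made for illustration.

```python
import os


def default_worker_count(requested=None):
    """Pick a worker count: honor an explicit request, else one per core.

    Hypothetical stand-in for illustration; not Langflow's implementation.
    """
    if requested is not None and requested > 0:
        # An explicit positive value always wins over the hardware default.
        return requested
    # os.cpu_count() may return None when the count cannot be determined.
    return os.cpu_count() or 1


print(default_worker_count())   # depends on the host machine
print(default_worker_count(4))  # explicit override
```

Other defaults are common too. Gunicorn's own documentation suggests (2 x cores) + 1 as a starting point, so the right value for a given deployment is worth confirming under realistic load.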
How to Start Langflow with Each Worker Type
Gunicorn Mode (Linux/macOS Default)
Run the standard Langflow command to automatically select the Gunicorn path:
python -m langflow
Or invoke Gunicorn directly with the custom worker class and explicit worker count:
gunicorn \
--worker-class langflow.server.LangflowUvicornWorker \
--workers $(nproc) \
--bind 0.0.0.0:7860 \
--log-level info \
langflow.app:app
Pure Uvicorn Mode (Windows or Manual Override)
Force single-process mode on any platform by bypassing the automatic detection:
uvicorn langflow.app:app \
--host 0.0.0.0 \
--port 7860 \
--log-level info
Alternatively, set LANGFLOW_FORCE_UVICORN=1 to override the platform check in src/backend/base/langflow/__main__.py before running python -m langflow.
Summary
- Gunicorn with Uvicorn workers delivers superior throughput on multi-core servers through process-level parallelism, but consumes significantly more RAM due to interpreter duplication.
- Pure Uvicorn provides faster startup and lower memory overhead, making it ideal for development environments, Windows deployments, and single-core constrained infrastructure.
- The LangflowUvicornWorker class in src/backend/base/langflow/server.py customizes the standard Uvicorn worker with an asyncio event loop, Loguru-integrated logging, and a custom SIGINT handler.
- Platform detection in src/backend/base/langflow/__main__.py automatically selects the optimal architecture, though manual override is possible via environment variables or direct server invocation.
Frequently Asked Questions
Which worker type is best for production deployments of Langflow?
Gunicorn with Uvicorn workers is the recommended choice for production environments running on Linux or macOS. The multi-process architecture provides true parallelism across CPU cores, handles concurrent requests more efficiently under heavy load, and supports graceful worker restarts without dropping active connections.
Why does Langflow use Gunicorn on Linux but not Windows?
The operating system check in src/backend/base/langflow/__main__.py directs Windows installations to pure Uvicorn because Windows lacks support for the fork() system call, which Gunicorn relies on to spawn worker processes efficiently. On Linux and macOS, the fork() mechanism enables robust process management and memory isolation between workers.
How do I configure the number of Gunicorn workers in Langflow?
Pass the --workers flag when starting Langflow, or set the LANGFLOW_WORKERS environment variable. The underlying code calls get_number_of_workers() to determine defaults, typically setting one worker per CPU core. For direct Gunicorn invocation, use --workers $(nproc) to match your machine's core count.
Does using Gunicorn significantly increase memory usage?
Yes. Each Gunicorn worker runs as a separate Python process, duplicating the interpreter state, imported modules, and Langflow's in-memory graph caches. While this isolation improves stability and crash resistance, deployments should plan for memory consumption that scales linearly with the number of configured workers.
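The linear scaling described above lends itself to a back-of-the-envelope estimate. The sketch below assumes a hypothetical `estimate_total_memory_mb` helper and a per-worker figure that you measure yourself; the numbers in the example call are arbitrary placeholders, not Langflow measurements.

```python
def estimate_total_memory_mb(workers, per_worker_mb, master_mb=0.0):
    """Estimate resident memory for a pre-fork deployment.

    Hypothetical planning helper: assumes every worker uses roughly the
    same amount of memory and that nothing is shared between processes.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    if per_worker_mb < 0 or master_mb < 0:
        raise ValueError("memory figures must be non-negative")
    # Total = master process + one full copy of the app per worker.
    return master_mb + workers * per_worker_mb


# Placeholder inputs: 4 workers at 500 MB each, plus a 50 MB master process.
print(estimate_total_memory_mb(4, 500, master_mb=50))
```

Because the model ignores pages that forked processes may share through copy-on-write, it is best read as a rough upper bound for capacity planning rather than a prediction.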