# How RAGFlow Implements Python and JavaScript Code Execution for AI Agents

> Discover how RAGFlow securely executes Python and JavaScript code for AI agents using a novel three-layer sandbox architecture. Learn about its unified CodeExec component.

- Repository: [InfiniFlow/ragflow](https://github.com/infiniflow/ragflow)
- Tags: internals
- Published: 2026-02-23

---

**RAGFlow executes arbitrary Python and JavaScript code safely inside agent workflows using a three-layer sandbox architecture that abstracts Docker containers, cloud interpreters, and third-party sandbox providers behind a unified `CodeExec` component.**

The `infiniflow/ragflow` repository provides a **Code Exec** agent component that enables AI agents to run code snippets in isolated environments. This implementation supports multiple backend providers, from self-managed Docker containers to managed services like Aliyun Code Interpreter and e2b, while maintaining a consistent interface for agent workflows.

## Three-Layer Architecture of the RAGFlow Code Executor

The RAGFlow code executor component is organized into three distinct layers that separate tool logic from runtime implementation details.

### Tool Layer: Parsing and Validation

The entry point for code execution resides in [`agent/tools/code_exec.py`](https://github.com/infiniflow/ragflow/blob/main/agent/tools/code_exec.py), which defines the `CodeExec` class. This layer handles argument validation, extracts the target language (Python or JavaScript), and prepares the script for execution. The `CodeExec._invoke` method gathers workflow context variables and passes them to the sandbox layer via `_execute_code`.
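The validation step described above can be pictured with a small, self-contained sketch. It is not RAGFlow's code: `SUPPORTED_LANGUAGES`, `CodeExecRequest`, and `prepare_request` are hypothetical names chosen for illustration, and the exact checks the real `CodeExec` class performs may differ.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional

# Hypothetical constant: the two languages the article says are supported.
SUPPORTED_LANGUAGES = {"python", "javascript"}


@dataclass
class CodeExecRequest:
    """Illustrative container for a validated request (not RAGFlow's class)."""
    lang: str
    script: str
    arguments: Dict[str, Any] = field(default_factory=dict)


def prepare_request(
    lang: str, script: str, arguments: Optional[Dict[str, Any]] = None
) -> CodeExecRequest:
    """Normalize the language, reject unsupported input, and copy arguments."""
    normalized = (lang or "").strip().lower()
    if normalized not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language: {lang!r}")
    if not script or not script.strip():
        raise ValueError("Script must not be empty")
    # Copy the arguments so later mutation by the caller cannot affect the request.
    return CodeExecRequest(lang=normalized, script=script, arguments=dict(arguments or {}))
```

Rejecting bad input before any sandbox is provisioned keeps failures cheap: no container or cloud instance is created for a request that could never run.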

### Sandbox Client Layer: Provider Abstraction

The [`agent/sandbox/client.py`](https://github.com/infiniflow/ragflow/blob/main/agent/sandbox/client.py) module provides the `execute_code` function, which acts as a provider-agnostic API. This layer initializes the `ProviderManager`, creates sandbox instances, manages execution timeouts, and ensures proper cleanup. By abstracting provider specifics, the client layer allows the tool layer to remain unchanged when switching between Docker, cloud, or third-party sandboxes.
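As a rough illustration of the "create, execute, always clean up" responsibility described here, the sketch below wraps a provider in a `try`/`finally` block. `FakeProvider` and `run_in_sandbox` are hypothetical stand-ins, and the `ExecutionResult` fields simply mirror the ones this article lists; the real client module may be structured differently.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ExecutionResult:
    """Simplified result shape: stdout, stderr, exit code, metadata."""
    stdout: str
    stderr: str
    exit_code: int
    metadata: Dict[str, Any] = field(default_factory=dict)


class FakeProvider:
    """In-memory stand-in for a real provider; it only records lifecycle calls."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def create_instance(self, template: str) -> None:
        self.calls.append(f"create:{template}")

    def execute_code(
        self, code: str, language: str, timeout: int,
        arguments: Optional[Dict[str, Any]] = None,
    ) -> ExecutionResult:
        self.calls.append("execute")
        if "boom" in code:
            raise RuntimeError("simulated sandbox failure")
        return ExecutionResult(stdout="ok", stderr="", exit_code=0)

    def destroy_instance(self) -> None:
        self.calls.append("destroy")


def run_in_sandbox(provider, code, language, timeout=10, arguments=None):
    """Provision, execute, and tear down, even when execution raises."""
    provider.create_instance(template=language)
    try:
        return provider.execute_code(code, language, timeout, arguments)
    finally:
        # Runs on success and on failure, so no instance is left behind.
        provider.destroy_instance()
```

Putting the teardown in `finally` is what makes the cleanup guarantee hold for error paths as well as successful runs.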

### Provider Layer: Runtime Implementations

Concrete sandbox implementations live in `agent/sandbox/providers/`. Each provider inherits from the `SandboxProvider` base class defined in [`agent/sandbox/providers/base.py`](https://github.com/infiniflow/ragflow/blob/main/agent/sandbox/providers/base.py):

- **Self-Managed Provider** ([`self_managed.py`](https://github.com/infiniflow/ragflow/blob/main/agent/sandbox/providers/self_managed.py)): Spins up local Docker containers to execute code in isolated environments.
- **Aliyun Code Interpreter** ([`aliyun_codeinterpreter.py`](https://github.com/infiniflow/ragflow/blob/main/agent/sandbox/providers/aliyun_codeinterpreter.py)): Integrates with Alibaba Cloud's managed code interpreter service.
- **e2b Cloud Provider** ([`e2b.py`](https://github.com/infiniflow/ragflow/blob/main/agent/sandbox/providers/e2b.py)): Executes code on the e2b SaaS sandbox platform.

The `ProviderManager` ([`agent/sandbox/providers/manager.py`](https://github.com/infiniflow/ragflow/blob/main/agent/sandbox/providers/manager.py)) maintains the active provider instance based on the system setting `sandbox.provider_type`.
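One common way to build this kind of pluggable design is an abstract base class plus a registry keyed by a configuration string. The sketch below shows that pattern only; `BaseProvider`, `ProviderRegistry`, and the two `*Sketch` classes are hypothetical names that mirror the roles of `SandboxProvider` and `ProviderManager`, not their actual interfaces.

```python
from abc import ABC, abstractmethod
from typing import Dict, Type


class BaseProvider(ABC):
    """Hypothetical stand-in for the role played by SandboxProvider."""

    @abstractmethod
    def describe(self) -> str:
        """Return a short label for the backend."""


class SelfManagedSketch(BaseProvider):
    def describe(self) -> str:
        return "local Docker containers"


class E2BSketch(BaseProvider):
    def describe(self) -> str:
        return "e2b SaaS sandbox"


class ProviderRegistry:
    """Hypothetical stand-in for the role played by ProviderManager."""

    def __init__(self) -> None:
        self._classes: Dict[str, Type[BaseProvider]] = {}

    def register(self, provider_type: str, cls: Type[BaseProvider]) -> None:
        self._classes[provider_type] = cls

    def get(self, provider_type: str) -> BaseProvider:
        """Instantiate the provider registered under the configured type."""
        if provider_type not in self._classes:
            raise ValueError(f"Unknown sandbox.provider_type: {provider_type!r}")
        return self._classes[provider_type]()


registry = ProviderRegistry()
registry.register("self_managed", SelfManagedSketch)
registry.register("e2b", E2BSketch)
```

With this shape, adding a backend means writing one subclass and one `register` call; callers that only know the configured type string do not change.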

## Execution Flow: From Agent Workflow to Sandboxed Runtime

When an agent workflow invokes the code executor, RAGFlow processes the request through a structured pipeline:

1. **Component Invocation**: The agent calls `CodeExec._invoke`, which extracts the `lang` (python/javascript), `script` content, and input arguments from the workflow context.

2. **Sandbox Dispatch**: The `_execute_code` method attempts to use the modern provider system by importing `execute_code` from `agent.sandbox.client`. If the `ProviderManager` is unavailable, it falls back to a direct HTTP POST request to the legacy sandbox endpoint at `http://{settings.SANDBOX_HOST}:9385/run`.

3. **Provider Selection**: The client layer queries `get_provider_manager()` to read the `sandbox.provider_type` configuration (e.g., `self_managed`, `aliyun_codeinterpreter`, or `e2b`). The manager instantiates the corresponding provider class.

4. **Instance Lifecycle**: The selected provider executes `create_instance(template=language)` to provision a fresh sandbox container or cloud instance. The code is then submitted via `provider.execute_code(code, language, timeout, arguments)`, which returns an `ExecutionResult` containing stdout, stderr, exit code, and metadata. Finally, `provider.destroy_instance()` ensures immediate cleanup.

5. **Result Handling**: The tool layer deserializes the raw output via `_deserialize_stdout` and maps values to the component's declared output keys using `_populate_outputs`. Execution errors are routed through the special `_ERROR` output channel.
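The result-handling step can be sketched as follows. This is an assumption-laden illustration: it supposes stdout carries JSON when the script returns structured data, and the helper names `deserialize_stdout`, `populate_outputs`, and `handle_result` are hypothetical analogues of the private methods named above, not copies of them.

```python
import json
from typing import Any, Dict, List


def deserialize_stdout(stdout: str) -> Any:
    """Parse stdout as JSON when possible; otherwise keep the trimmed text."""
    text = stdout.strip()
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return text


def populate_outputs(value: Any, output_keys: List[str]) -> Dict[str, Any]:
    """Map a deserialized value onto the declared output keys."""
    if isinstance(value, dict):
        return {key: value.get(key) for key in output_keys}
    # Non-dict result: assign the whole value to the first declared key.
    outputs: Dict[str, Any] = {key: None for key in output_keys}
    if output_keys:
        outputs[output_keys[0]] = value
    return outputs


def handle_result(
    stdout: str, stderr: str, exit_code: int, output_keys: List[str]
) -> Dict[str, Any]:
    """Route failures to an _ERROR key; otherwise populate declared outputs."""
    if exit_code != 0:
        return {"_ERROR": stderr.strip() or f"exit code {exit_code}"}
    return populate_outputs(deserialize_stdout(stdout), output_keys)
```

Keeping errors on a dedicated key lets downstream workflow steps branch on failure without inspecting the regular outputs.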

## Security and Resource Isolation

The RAGFlow code executor implements defense-in-depth security through provider-level isolation:

- **Environment Isolation**: Each execution runs in a fresh Docker container or cloud-managed VM, ensuring no persistent state between invocations.
- **Resource Limits**: Providers enforce timeouts, memory limits, and CPU throttling. The `check_if_canceled` mechanism allows workflows to abort long-running jobs.
- **Network Restrictions**: Self-managed Docker sandboxes can be configured with restricted network access, while cloud providers (Aliyun, e2b) implement their own security boundaries.
- **Input Validation**: The `CodeExec` class validates language types and sanitizes arguments before passing them to the sandbox layer.
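To make the timeout and cancellation behavior concrete, here is a minimal sketch of a polling loop that stops waiting when either a deadline passes or a cancel callback fires. `run_with_limits` is a hypothetical helper; the `is_canceled` callback only plays the role that `check_if_canceled` plays in the article, and a real provider would also terminate the container or cloud instance, which a Python thread cannot do for itself.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Any, Callable, Tuple


def run_with_limits(
    job: Callable[[], Any],
    timeout_s: float,
    is_canceled: Callable[[], bool],
    poll_s: float = 0.01,
) -> Tuple[str, Any]:
    """Return ("ok", value), ("timeout", None), or ("canceled", None)."""
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(job)
    deadline = time.monotonic() + timeout_s
    try:
        while True:
            if is_canceled():
                return ("canceled", None)
            if time.monotonic() >= deadline:
                return ("timeout", None)
            try:
                # Wait briefly so cancellation and the deadline are re-checked often.
                return ("ok", future.result(timeout=poll_s))
            except FutureTimeout:
                continue
    finally:
        # Stop waiting on the worker; the thread itself is not killed here.
        executor.shutdown(wait=False)
```

The short poll interval trades a little CPU for responsiveness: a cancel request is noticed within roughly `poll_s` seconds instead of only after the full timeout.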

## Practical Examples: Running Python and JavaScript in RAGFlow Agents

### Executing Python in an Agent Workflow

Define a workflow step using the CodeExec component:

```python
# Agent workflow definition
workflow = {
    "name": "data_processing_workflow",
    "steps": [
        {
            "type": "CodeExec",
            "params": {
                "lang": "python",
                # The script exposes a main() function whose return value
                # becomes the component's output.
                "script": """
def main():
    import json
    data = {"processed": True, "items": [1, 2, 3]}
    return {"result": json.dumps(data)}
""",
            },
        }
    ],
}
```

### Direct SDK Usage

Invoke the code executor programmatically:

```python
from ragflow import Agent

agent = Agent(...)
code_tool = agent.get_tool("execute_code")

result = code_tool.invoke(
    lang="python",
    script="def main():\n    return {'value': 42}"
)
print(result["value"])  # Output: 42

```

### JavaScript Execution with Arguments

Pass parameters to JavaScript code:

```python
js_code = """
module.exports = { main };
async function main(args) {
    const name = args.name || "world";
    const timestamp = new Date().toISOString();
    return { 
        greeting: `Hello ${name}!`,
        time: timestamp
    };
}
"""

result = code_tool.invoke(
    lang="javascript",
    script=js_code,
    name="RAGFlow"
)
print(result["greeting"])  # Output: Hello RAGFlow!

```

## Summary

- RAGFlow implements code execution through a **three-layer architecture**: the Tool layer ([`agent/tools/code_exec.py`](https://github.com/infiniflow/ragflow/blob/main/agent/tools/code_exec.py)), Sandbox Client layer ([`agent/sandbox/client.py`](https://github.com/infiniflow/ragflow/blob/main/agent/sandbox/client.py)), and Provider layer (`agent/sandbox/providers/`).
- The system supports **multiple sandbox backends** including self-managed Docker, Aliyun Code Interpreter, and e2b cloud sandboxes, configurable via the `sandbox.provider_type` setting.
- Execution follows a strict lifecycle: **create instance → execute code → destroy instance**, ensuring isolation and preventing resource leakage.
- Both **Python and JavaScript** are supported languages, with arguments passed via the `main` function convention and results returned as dictionaries.
- A **legacy HTTP fallback** to `http://SANDBOX_HOST:9385/run` ensures backward compatibility when the provider manager is unavailable.

## Frequently Asked Questions

### How does RAGFlow ensure code execution security?

RAGFlow isolates code execution through provider-specific sandbox environments. The self-managed provider spins up fresh Docker containers for each invocation, while cloud providers (Aliyun and e2b) use their own isolated VM infrastructure. Each execution runs in a separate instance that is destroyed immediately after completion, preventing persistent access to the filesystem or network. Additionally, the system enforces execution timeouts and supports workflow cancellation to prevent resource exhaustion.

### Can I switch between different sandbox providers without changing my agent workflows?

Yes. The RAGFlow architecture decouples the agent workflow definition from the sandbox implementation. You can switch providers by changing the `sandbox.provider_type` system setting to `self_managed`, `aliyun_codeinterpreter`, or `e2b`. The [`agent/sandbox/client.py`](https://github.com/infiniflow/ragflow/blob/main/agent/sandbox/client.py) module automatically instantiates the correct provider via `ProviderManager`, ensuring that existing workflows calling the `CodeExec` component continue to function without modification.

### What is the difference between the modern provider system and the legacy HTTP sandbox?

The modern provider system (introduced in [`agent/sandbox/client.py`](https://github.com/infiniflow/ragflow/blob/main/agent/sandbox/client.py)) offers a unified interface for multiple sandbox backends through the `SandboxProvider` abstract base class. It manages the full lifecycle of sandbox instances (create, execute, destroy) and supports cloud providers like Aliyun and e2b. The legacy HTTP sandbox (accessed at `http://SANDBOX_HOST:9385/run`) is a direct REST endpoint to a single sandbox service, used as a fallback when the `ProviderManager` is unavailable. The modern system provides better isolation, provider flexibility, and resource management compared to the monolithic legacy approach.
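The fallback decision can be sketched as "try to import the modern entry point, otherwise build the legacy URL". The helpers below are hypothetical (`load_modern_executor`, `choose_backend`), and the sketch deliberately stops at choosing a backend: the request payload expected by the legacy endpoint is not described in this article, so no HTTP call is shown.

```python
import importlib
from typing import Any, Callable, Optional, Tuple


def load_modern_executor(
    module_name: str = "agent.sandbox.client", attr: str = "execute_code"
) -> Optional[Callable[..., Any]]:
    """Return the provider-based entry point, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, attr, None)


def choose_backend(
    sandbox_host: str,
    module_name: str = "agent.sandbox.client",
    attr: str = "execute_code",
) -> Tuple[str, Any]:
    """Prefer the provider system; otherwise point at the legacy endpoint."""
    executor = load_modern_executor(module_name, attr)
    if executor is not None:
        return ("provider", executor)
    # Legacy path: the URL format quoted in the article, with port 9385.
    return ("legacy_http", f"http://{sandbox_host}:9385/run")
```

Outside a RAGFlow checkout the default module will not import, so `choose_backend("localhost")` takes the legacy branch; the `module_name` and `attr` parameters exist only to make the sketch easy to exercise.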

### How do I pass arguments to Python or JavaScript code in the CodeExec component?

Arguments are passed to your code through the `main` function convention. For Python, define `def main():` or `def main(args):` where `args` is a dictionary containing the parameters passed to the component. For JavaScript, export a `main` function: `async function main(args) { ... }`. The return value of the `main` function becomes the component's output, which is automatically mapped to the workflow context. For example, returning `{"value": 42}` in Python makes `result["value"]` available to subsequent workflow steps.
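The `main` convention can be illustrated with a tiny runner that loads a script, finds `main`, and calls it with or without an arguments dictionary. This is a hypothetical sketch (`call_main` is not a RAGFlow function) and it runs the script in the current process with no isolation, which is exactly what the sandbox providers exist to avoid; it only shows how arguments and return values line up.

```python
import inspect
import json
from typing import Any, Dict


def call_main(script: str, arguments: Dict[str, Any]) -> str:
    """Execute a script, call its main(), and return the result as JSON text."""
    namespace: Dict[str, Any] = {}
    # Illustration only: exec() provides no isolation whatsoever.
    exec(script, namespace)
    main = namespace["main"]
    # Support both `def main():` and `def main(args):` as described above.
    if len(inspect.signature(main).parameters) == 0:
        result = main()
    else:
        result = main(arguments)
    return json.dumps(result)


print(call_main("def main(args):\n    return {'value': args['x'] * 2}", {"x": 21}))
```

Serializing the return value to JSON gives the caller a language-neutral payload, which is one plausible reason a dictionary return type is the convention for both Python and JavaScript scripts.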