How RAGFlow Implements Python and JavaScript Code Execution for AI Agents

RAGFlow executes arbitrary Python and JavaScript code safely inside agent workflows using a three-layer sandbox architecture that abstracts Docker containers, cloud interpreters, and third-party sandbox providers behind a unified CodeExec component.

The infiniflow/ragflow repository provides a Code Exec agent component that enables AI agents to run code snippets in isolated environments. This implementation supports multiple backend providers, from self-managed Docker containers to managed services like Aliyun Code Interpreter and e2b, while maintaining a consistent interface for agent workflows.

Three-Layer Architecture of the RAGFlow Code Executor

The RAGFlow code executor component is organized into three distinct layers that separate tool logic from runtime implementation details.

Tool Layer: Parsing and Validation

The entry point for code execution resides in agent/tools/code_exec.py, which defines the CodeExec class. This layer handles argument validation, extracts the target language (Python or JavaScript), and prepares the script for execution. The CodeExec._invoke method gathers workflow context variables and passes them to the sandbox layer via _execute_code.
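
To make the tool layer's job concrete, here is a minimal sketch of the validate-then-collect step. It is not the code in agent/tools/code_exec.py: CodeRequest, build_request, and SUPPORTED_LANGUAGES are hypothetical names chosen for illustration, and the specific checks are assumptions.

```python
from dataclasses import dataclass, field
from typing import Any

# Assumed from the article: only these two languages are accepted.
SUPPORTED_LANGUAGES = {"python", "javascript"}


@dataclass
class CodeRequest:
    """Hypothetical container for what the tool layer hands to the sandbox layer."""
    lang: str
    script: str
    arguments: dict[str, Any] = field(default_factory=dict)


def build_request(lang: str, script: str, context: dict[str, Any],
                  arg_names: list[str]) -> CodeRequest:
    """Validate the language and script, then collect the named workflow variables."""
    normalized = lang.strip().lower()
    if normalized not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {lang!r}")
    if not script.strip():
        raise ValueError("script must not be empty")
    missing = [name for name in arg_names if name not in context]
    if missing:
        raise KeyError(f"missing workflow variables: {missing}")
    # Only the requested variables are forwarded, not the whole context.
    arguments = {name: context[name] for name in arg_names}
    return CodeRequest(normalized, script, arguments)
```

Rejecting bad input before any sandbox is provisioned keeps validation failures cheap, since no container or cloud instance is created for a request that cannot run.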

Sandbox Client Layer: Provider Abstraction

The agent/sandbox/client.py module provides the execute_code function, which acts as a provider-agnostic API. This layer initializes the ProviderManager, creates sandbox instances, manages execution timeouts, and ensures proper cleanup. By abstracting provider specifics, the client layer allows the tool layer to remain unchanged when switching between Docker, cloud, or third-party sandboxes.
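
The cleanup guarantee described above is typically achieved with a try/finally block. The sketch below shows that pattern against an in-memory stand-in. FakeProvider and the simplified ExecutionResult are hypothetical, and passing an explicit instance_id between calls is an assumption that may not match the real signatures.

```python
from dataclasses import dataclass


@dataclass
class ExecutionResult:
    """Simplified stand-in for the result object described in the article."""
    stdout: str
    stderr: str
    exit_code: int


class FakeProvider:
    """In-memory stand-in for a real provider; it never runs any code."""

    def __init__(self):
        self.live_instances = set()
        self._next_id = 0

    def create_instance(self, template: str) -> str:
        self._next_id += 1
        instance_id = f"{template}-{self._next_id}"
        self.live_instances.add(instance_id)
        return instance_id

    def execute_code(self, instance_id, code, language, timeout, arguments=None):
        if "boom" in code:
            raise RuntimeError("simulated sandbox failure")
        return ExecutionResult(stdout=f"ran {language} in {instance_id}",
                               stderr="", exit_code=0)

    def destroy_instance(self, instance_id: str) -> None:
        self.live_instances.discard(instance_id)


def execute_code(provider, code, language="python", timeout=30, arguments=None):
    """Create an instance, run the code, and always destroy the instance."""
    instance_id = provider.create_instance(template=language)
    try:
        return provider.execute_code(instance_id, code, language, timeout, arguments)
    finally:
        # Runs on success and on failure, so no instance is left behind.
        provider.destroy_instance(instance_id)
```

Because destroy_instance sits in the finally clause, a failing script still releases its sandbox, and the caller sees the original exception.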

Provider Layer: Runtime Implementations

Concrete sandbox implementations live in agent/sandbox/providers/. Each provider inherits from the SandboxProvider base class defined in agent/sandbox/providers/base.py:

  • Self-Managed Provider (self_managed.py): Spins up local Docker containers to execute code in isolated environments.
  • Aliyun Code Interpreter (aliyun_codeinterpreter.py): Integrates with Alibaba Cloud's managed code interpreter service.
  • e2b Cloud Provider (e2b.py): Executes code on the e2b SaaS sandbox platform.

The ProviderManager (agent/sandbox/providers/manager.py) maintains the active provider instance based on the system setting sandbox.provider_type.
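
The relationship between a base class and a manager keyed on provider_type can be sketched as follows. This is an illustration of the pattern, not the contents of base.py or manager.py: SandboxProviderSketch, EchoProvider, and ProviderRegistry are hypothetical names, and the method set is assumed from the lifecycle the article describes.

```python
from abc import ABC, abstractmethod


class SandboxProviderSketch(ABC):
    """Illustrative interface; the real base class may declare different methods."""

    @abstractmethod
    def create_instance(self, template: str) -> str: ...

    @abstractmethod
    def execute_code(self, instance_id: str, code: str, language: str, timeout: int) -> str: ...

    @abstractmethod
    def destroy_instance(self, instance_id: str) -> None: ...


class EchoProvider(SandboxProviderSketch):
    """Toy provider that echoes the code back instead of running it."""

    def create_instance(self, template):
        return f"echo-{template}"

    def execute_code(self, instance_id, code, language, timeout):
        return code

    def destroy_instance(self, instance_id):
        return None


class ProviderRegistry:
    """Hypothetical manager: maps a provider_type string to a provider class."""

    def __init__(self):
        self._classes = {}
        self._active = None

    def register(self, provider_type, provider_cls):
        if not issubclass(provider_cls, SandboxProviderSketch):
            raise TypeError("provider must subclass SandboxProviderSketch")
        self._classes[provider_type] = provider_cls

    def activate(self, provider_type):
        """Instantiate the provider registered under provider_type and make it active."""
        try:
            provider_cls = self._classes[provider_type]
        except KeyError:
            raise ValueError(f"unknown sandbox.provider_type: {provider_type!r}") from None
        self._active = provider_cls()
        return self._active

    @property
    def active(self):
        return self._active
```

With this shape, adding a backend means writing one subclass and registering it under a new provider_type value; callers that only use the active provider do not change.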

Execution Flow: From Agent Workflow to Sandboxed Runtime

When an agent workflow invokes the code executor, RAGFlow processes the request through a structured pipeline:

  1. Component Invocation: The agent calls CodeExec._invoke, which extracts the lang (python/javascript), script content, and input arguments from the workflow context.

  2. Sandbox Dispatch: The _execute_code method attempts to use the modern provider system by importing execute_code from agent.sandbox.client. If the ProviderManager is unavailable, it falls back to a direct HTTP POST request to the legacy sandbox endpoint at http://{settings.SANDBOX_HOST}:9385/run.

  3. Provider Selection: The client layer queries get_provider_manager() to read the sandbox.provider_type configuration (e.g., self_managed, aliyun_codeinterpreter, or e2b). The manager instantiates the corresponding provider class.

  4. Instance Lifecycle: The selected provider executes create_instance(template=language) to provision a fresh sandbox container or cloud instance. The code is then submitted via provider.execute_code(code, language, timeout, arguments), which returns an ExecutionResult containing stdout, stderr, exit code, and metadata. Finally, provider.destroy_instance() ensures immediate cleanup.

  5. Result Handling: The tool layer deserializes the raw output via _deserialize_stdout and maps values to the component's declared output keys using _populate_outputs. Execution errors are routed through the special _ERROR output channel.
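
Step 5 can be sketched as two small functions. The names below echo _deserialize_stdout and _populate_outputs, but the bodies are assumptions: the sketch guesses that stdout carries JSON when the script returns a dictionary, and that a non-zero exit code is what triggers the _ERROR channel.

```python
import json

ERROR_KEY = "_ERROR"  # output channel named in the article


def deserialize_stdout(stdout: str):
    """Parse stdout as JSON when possible, otherwise return the stripped text."""
    text = stdout.strip()
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return text


def populate_outputs(declared_keys, stdout, stderr="", exit_code=0):
    """Map a sandbox result onto the component's declared output keys."""
    outputs = {key: None for key in declared_keys}
    if exit_code != 0:
        # Failures go to the error channel; declared outputs stay empty.
        outputs[ERROR_KEY] = stderr.strip() or f"exit code {exit_code}"
        return outputs
    value = deserialize_stdout(stdout)
    if isinstance(value, dict):
        for key in declared_keys:
            outputs[key] = value.get(key)
    elif len(declared_keys) == 1:
        # A scalar result fills the single declared output.
        outputs[declared_keys[0]] = value
    return outputs
```

For instance, populate_outputs(["value"], '{"value": 42}') yields {"value": 42}, while a failed run leaves "value" as None and reports the stderr text under "_ERROR".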

Security and Resource Isolation

The RAGFlow code executor combines provider-level isolation with checks in the tool layer:

  • Environment Isolation: Each execution runs in a fresh Docker container or cloud-managed VM, ensuring no persistent state between invocations.
  • Resource Limits: Providers enforce timeouts, memory limits, and CPU throttling. The check_if_canceled mechanism allows workflows to abort long-running jobs.
  • Network Restrictions: Self-managed Docker sandboxes can be configured with restricted network access, while cloud providers (Aliyun, e2b) implement their own security boundaries.
  • Input Validation: The CodeExec class validates language types and sanitizes arguments before passing them to the sandbox layer.
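
The interplay of a timeout and a cancellation check can be illustrated with a polling loop. This is a generic sketch, not RAGFlow's implementation: run_with_limits is a hypothetical helper, is_canceled stands in for the check_if_canceled mechanism, and a thread stands in for the sandbox.

```python
import threading
import time


def run_with_limits(job, timeout, is_canceled, poll_interval=0.01):
    """Run job() in a worker thread; stop waiting on timeout or cancellation.

    Returns ("ok", result), ("error", exc), ("timeout", None), or ("canceled", None).
    A thread cannot be force-stopped, so a real sandbox would destroy the
    container or cloud instance instead of merely abandoning the worker.
    """
    box = {}

    def worker():
        try:
            box["result"] = job()
        except Exception as exc:  # report failures instead of losing them
            box["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    deadline = time.monotonic() + timeout
    while thread.is_alive():
        if is_canceled():
            return ("canceled", None)
        if time.monotonic() >= deadline:
            return ("timeout", None)
        thread.join(poll_interval)  # wait briefly, then re-check both conditions
    if "error" in box:
        return ("error", box["error"])
    return ("ok", box["result"])
```

Polling in short intervals lets a cancellation take effect promptly rather than only when the full timeout elapses.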

Practical Examples: Running Python and JavaScript in RAGFlow Agents

Executing Python in an Agent Workflow

Define a workflow step using the CodeExec component:


# Agent workflow definition (illustrative structure)
workflow = {
    "name": "data_processing_workflow",
    "steps": [
        {
            "type": "CodeExec",
            "params": {
                "lang": "python",
                "script": """
def main():
    import json
    data = {"processed": True, "items": [1, 2, 3]}
    return {"result": json.dumps(data)}
"""
            }
        }
    ]
}

Direct SDK Usage

Invoke the code executor programmatically:

from ragflow import Agent

agent = Agent(...)
code_tool = agent.get_tool("execute_code")

result = code_tool.invoke(
    lang="python",
    script="def main():\n    return {'value': 42}"
)
print(result["value"])  # Output: 42

JavaScript Execution with Arguments

Pass parameters to JavaScript code:

js_code = """
module.exports = { main };
async function main(args) {
    const name = args.name || "world";
    const timestamp = new Date().toISOString();
    return { 
        greeting: `Hello ${name}!`,
        time: timestamp
    };
}
"""

result = code_tool.invoke(
    lang="javascript",
    script=js_code,
    name="RAGFlow"
)
print(result["greeting"])  # Output: Hello RAGFlow!

Summary

  • RAGFlow implements code execution through a three-layer architecture: the Tool layer (agent/tools/code_exec.py), Sandbox Client layer (agent/sandbox/client.py), and Provider layer (agent/sandbox/providers/).
  • The system supports multiple sandbox backends including self-managed Docker, Aliyun Code Interpreter, and e2b cloud sandboxes, configurable via the sandbox.provider_type setting.
  • Execution follows a strict lifecycle: create instance → execute code → destroy instance, ensuring isolation and preventing resource leakage.
  • Both Python and JavaScript are supported languages, with arguments passed via the main function convention and results returned as dictionaries.
  • A legacy HTTP fallback to http://SANDBOX_HOST:9385/run ensures backward compatibility when the provider manager is unavailable.

Frequently Asked Questions

How does RAGFlow ensure code execution security?

RAGFlow isolates code execution through provider-specific sandbox environments. The self-managed provider spins up fresh Docker containers for each invocation, while cloud providers (Aliyun and e2b) use their own isolated VM infrastructure. Each execution runs in a separate instance that is destroyed immediately after completion, preventing persistent access to the filesystem or network. Additionally, the system enforces execution timeouts and supports workflow cancellation to prevent resource exhaustion.

Can I switch between different sandbox providers without changing my agent workflows?

Yes. The RAGFlow architecture decouples the agent workflow definition from the sandbox implementation. You can switch providers by changing the sandbox.provider_type system setting to self_managed, aliyun_codeinterpreter, or e2b. The agent/sandbox/client.py module automatically instantiates the correct provider via ProviderManager, ensuring that existing workflows calling the CodeExec component continue to function without modification.

What is the difference between the modern provider system and the legacy HTTP sandbox?

The modern provider system, implemented in agent/sandbox/client.py and agent/sandbox/providers/, offers a unified interface for multiple sandbox backends through the SandboxProvider abstract base class. It manages the full lifecycle of sandbox instances (create, execute, destroy) and supports cloud providers like Aliyun and e2b. The legacy HTTP sandbox (accessed at http://SANDBOX_HOST:9385/run) is a direct REST endpoint to a single sandbox service, used as a fallback when the ProviderManager is unavailable. Compared with that single-endpoint approach, the modern system adds per-execution instance lifecycle management and a choice of providers.
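
A try-the-new-path-then-fall-back dispatch might look like the sketch below. It treats a failed import as the signal that the provider system is unavailable, which is an assumption. resolve_executor and legacy_http_executor are hypothetical names, and the placeholder sends no request because the legacy endpoint's payload format is not described here.

```python
import importlib


def resolve_executor(module_name, legacy_executor):
    """Return module.execute_code if it can be imported, else the legacy executor."""
    try:
        module = importlib.import_module(module_name)
        return module.execute_code
    except (ImportError, AttributeError):
        # Module missing, or present without execute_code: use the fallback.
        return legacy_executor


def legacy_http_executor(code, language):
    """Placeholder for the HTTP fallback; it only describes the call it stands for."""
    return {"via": "legacy_http", "path": "/run", "language": language}
```

In RAGFlow's case the module name would be agent.sandbox.client. Resolving the executor once and calling it afterwards keeps the rest of the tool layer unaware of which path is in use.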

How do I pass arguments to Python or JavaScript code in the CodeExec component?

Arguments are passed to your code through the main function convention. For Python, define def main(): or def main(args): where args is a dictionary containing the parameters passed to the component. For JavaScript, export a main function: async function main(args) { ... }. The return value of the main function becomes the component's output, which is automatically mapped to the workflow context. For example, returning {"value": 42} in Python makes result["value"] available to subsequent workflow steps.
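
The main function convention can be illustrated with a tiny harness that loads a script, calls main with or without arguments, and serializes the return value. call_main is a hypothetical helper, not the sandbox's actual runner, and it uses exec directly, which is only acceptable here because the example is a local illustration rather than an isolated runtime.

```python
import inspect
import json


def call_main(script: str, arguments: dict) -> str:
    """Exec a script that defines main, call it, and return the result as JSON."""
    namespace = {}
    exec(script, namespace)  # unsafe outside a sandbox; for illustration only
    main = namespace.get("main")
    if not callable(main):
        raise ValueError("script must define a callable main")
    # Pass the arguments dict only when main declares a parameter.
    if inspect.signature(main).parameters:
        result = main(arguments)
    else:
        result = main()
    return json.dumps(result)
```

Calling call_main("def main():\n    return {'value': 42}", {}) returns the JSON text '{"value": 42}', which a downstream step can parse back into a dictionary.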
