# How to Use SandboxAgent for Long-Running Tasks with Filesystem Access in openai-agents-python

> Safely run long-running tasks with filesystem access using SandboxAgent in openai-agents-python. Learn to configure Manifest, WorkspaceShellCapability, and SandboxRunConfig for powerful execution.

- Repository: [openai/openai-agents-python](https://github.com/openai/openai-agents-python)
- Tags: how-to-guide
- Published: 2026-04-17

---

**Use `SandboxAgent` with a `Manifest` defining your filesystem, attach a `WorkspaceShellCapability` for PTY-based command execution, and configure a `SandboxRunConfig` with a backend like `UnixLocalSandboxClient` or `DockerSandboxClient` to safely run long-running processes with full filesystem access.**

The `openai-agents-python` SDK provides a secure, isolated environment for running autonomous AI workflows. With `SandboxAgent`, an LLM can run persistent shell commands, compile code, and manipulate files without exposing the host system to unbounded operations.

## What is SandboxAgent?

**`SandboxAgent`** is a specialized `Agent` implementation designed to run model-driven workflows inside an isolated sandbox environment. Unlike standard agents that operate only through function calling, `SandboxAgent` integrates deeply with a **sandbox session** that provides:

- A **virtual filesystem** defined by a `Manifest`
- **Process execution** via `exec` for one-shot commands
- **Long-running process streaming** via `pty_exec_start` for interactive or continuous output
- **Auditing and tracing** of every sandbox operation

The agent resides in [`src/agents/sandbox/sandbox_agent.py`](https://github.com/openai/openai-agents-python/blob/main/src/agents/sandbox/sandbox_agent.py) and serves as the entry point for filesystem-aware, long-running automation tasks.

## Core Components for Filesystem and Long-Running Tasks

To implement `SandboxAgent` effectively, you must configure three core components: the filesystem manifest, the session execution layer, and the shell capability that exposes these operations to the LLM.

### Manifest: Defining the Filesystem Workspace

The **`Manifest`** class (located in [`src/agents/sandbox/manifest.py`](https://github.com/openai/openai-agents-python/blob/main/src/agents/sandbox/manifest.py)) provides a declarative description of the sandbox's **filesystem state**, including files, directories, environment variables, and mount points. When you initialize a `SandboxAgent` with a `default_manifest`, the sandbox session mounts this virtual file tree, allowing the LLM to read, write, and manipulate paths under `/workspace` without touching the host filesystem.
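To make the declarative idea concrete, here is a minimal standard-library sketch that treats a workspace as a plain mapping of relative paths to text and writes it into a throwaway directory. The `materialize` helper and the dict layout are hypothetical stand-ins for illustration. They are not the SDK's `Manifest` API, whose fields and methods may differ.

```python
import tempfile
from pathlib import Path


def materialize(files: dict[str, str], root: Path) -> list[Path]:
    """Write a declarative {relative_path: text} mapping under root.

    Hypothetical helper: it only illustrates the manifest idea and is
    not part of openai-agents-python.
    """
    root = root.resolve()
    written = []
    for rel_path, text in files.items():
        target = (root / rel_path).resolve()
        # Refuse entries that would escape the workspace root.
        if not target.is_relative_to(root):
            raise ValueError(f"path escapes workspace: {rel_path}")
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(text)
        written.append(target)
    return written


with tempfile.TemporaryDirectory() as tmp:
    workspace = Path(tmp)
    paths = materialize(
        {"README.md": "# demo\n", "src/main.py": "print('hi')\n"}, workspace
    )
    listing = sorted(p.relative_to(workspace.resolve()).as_posix() for p in paths)
    print(listing)
```

Describing the tree as data rather than as a sequence of shell commands is what lets the same description be applied to different backends.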

### SandboxSession: Executing Long-Running Commands

The **`SandboxSession`** class (in [`src/agents/sandbox/session/sandbox_session.py`](https://github.com/openai/openai-agents-python/blob/main/src/agents/sandbox/session/sandbox_session.py)) wraps all sandbox operations with **audit events, tracing, and PTY support**. For long-running tasks, two methods are critical:

- **`SandboxSession.exec`** (lines 94-101): Executes one-shot shell commands with stdout/stderr capture.
- **`SandboxSession.pty_exec_start`** (lines 124-134): Starts a **pseudo-terminal session** for interactive or streaming processes, enabling the LLM to monitor continuous output from builds, servers, or log tails.
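The difference between the two execution styles can be sketched with the standard library alone. The snippet below is not the SDK's implementation: `run_one_shot` and `run_streaming` are hypothetical helpers built on `subprocess` and the Unix-only `pty` module, shown only to contrast "wait, then read everything" with "read chunks as they arrive".

```python
import os
import pty
import subprocess
import sys


def run_one_shot(argv: list[str]) -> str:
    """Run a command to completion and return all captured stdout."""
    done = subprocess.run(argv, capture_output=True, text=True, check=True)
    return done.stdout


def run_streaming(argv: list[str]):
    """Yield output chunks from a command attached to a pseudo-terminal."""
    master_fd, slave_fd = pty.openpty()
    proc = subprocess.Popen(
        argv, stdin=slave_fd, stdout=slave_fd, stderr=slave_fd, close_fds=True
    )
    os.close(slave_fd)  # the child holds its own copy of the slave end
    try:
        while True:
            try:
                chunk = os.read(master_fd, 1024)
            except OSError:
                break  # Linux raises EIO once the child side is closed
            if not chunk:
                break  # other platforms signal end-of-file with b""
            yield chunk.decode(errors="replace")
    finally:
        os.close(master_fd)
        if proc.poll() is None:
            proc.terminate()  # the consumer stopped early
        proc.wait()


child_code = (
    "import time\n"
    "for i in range(3):\n"
    "    print('tick', i, flush=True)\n"
    "    time.sleep(0.1)\n"
)
whole = run_one_shot([sys.executable, "-c", child_code])
streamed = "".join(run_streaming([sys.executable, "-c", child_code]))
```

With the one-shot helper nothing is visible until the child exits. The streaming helper hands each chunk to the caller as soon as the terminal delivers it, which is the property that matters for builds, servers, and log tails. A terminal also translates line endings, so the streamed text typically uses `\r\n`.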

Both methods emit `SandboxSessionStartEvent` and `SandboxSessionFinishEvent` for full observability, implemented in the session's `_annotate` method (lines 161-200).
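The start/finish pattern described above can be illustrated with a small context manager. This is a sketch under stated assumptions: `AuditEvent`, `AuditLog`, and `audited` are hypothetical names invented for the example and do not mirror the SDK's event classes or its `_annotate` method.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field


@dataclass
class AuditEvent:
    """Hypothetical event record; not the SDK's event classes."""

    phase: str  # "start" or "finish"
    op: str  # operation name, e.g. "exec"
    ok: bool = True
    elapsed_s: float = 0.0


@dataclass
class AuditLog:
    events: list = field(default_factory=list)

    @contextmanager
    def audited(self, op: str):
        """Record a start event, run the body, then always record a finish event."""
        self.events.append(AuditEvent("start", op))
        started = time.monotonic()
        ok = True
        try:
            yield
        except Exception:
            ok = False
            raise
        finally:
            elapsed = time.monotonic() - started
            self.events.append(AuditEvent("finish", op, ok, elapsed))


log = AuditLog()
with log.audited("exec"):
    pass  # a real session would run the command here
try:
    with log.audited("pty_exec_start"):
        raise RuntimeError("backend unavailable")
except RuntimeError:
    pass

summary = [(e.phase, e.op, e.ok) for e in log.events]
print(summary)
```

Because the finish event is appended in a `finally` block, a failed operation still leaves a matching pair in the log, with `ok` set to `False`.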

### WorkspaceShellCapability: Exposing Tools to the LLM

To allow the model to invoke sandbox operations, you must attach a **capability** that registers the relevant tools. The **`WorkspaceShellCapability`** (found in [`examples/sandbox/misc/workspace_shell.py`](https://github.com/openai/openai-agents-python/blob/main/examples/sandbox/misc/workspace_shell.py)) bundles the shell tool, exposing `exec` and `pty_exec_start` as function tools that the LLM can call during its reasoning loop.

## Implementing Long-Running Tasks with Filesystem Access

Below are practical implementations showing how to configure `SandboxAgent` for different scenarios involving persistent processes and file manipulation.

### Running a Background Process with PTY Streaming

This example demonstrates starting a long-running command, monitoring its output, and terminating it programmatically:

```python
import asyncio

from agents import Runner
from agents.run import RunConfig
from agents.sandbox import SandboxAgent, SandboxRunConfig
from agents.sandbox.manifest import Manifest
from agents.sandbox.sandboxes.unix_local import UnixLocalSandboxClient

# WorkspaceShellCapability is defined in the repository's examples tree,
# so this import assumes the repository root is on the import path.
from examples.sandbox.misc.workspace_shell import WorkspaceShellCapability


async def main() -> None:
    # Create an empty workspace manifest.
    manifest = Manifest()

    # Configure the sandbox agent with the shell capability.
    agent = SandboxAgent(
        name="Long-run worker",
        model="gpt-4o-mini",
        instructions="You can run any command. For long-running processes use `pty_exec_start`.",
        default_manifest=manifest,
        capabilities=[WorkspaceShellCapability()],
    )

    # Execute with the Unix local sandbox client.
    result = await Runner.run(
        agent,
        "Start a background `yes` command that prints forever; then stop it after 3 seconds.",
        run_config=RunConfig(
            sandbox=SandboxRunConfig(client=UnixLocalSandboxClient())
        ),
    )

    print("Final LLM output:")
    print(result.final_output)


if __name__ == "__main__":
    asyncio.run(main())
```

Under the hood, when the LLM invokes `pty_exec_start`, `SandboxSession.pty_exec_start` (lines 124-134 in [`sandbox_session.py`](https://github.com/openai/openai-agents-python/blob/main/src/agents/sandbox/session/sandbox_session.py)) forwards the request to the `UnixLocalSandboxClient`, creating a pseudo-terminal session that streams output back to the model until terminated.
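The "start it, watch it, stop it after a deadline" pattern that the prompt asks for can be sketched outside the SDK with `asyncio` subprocesses. The `stream_for` helper below is hypothetical and uses a plain pipe rather than a PTY. It only illustrates how a never-ending process can be read for a bounded time and then terminated.

```python
import asyncio
import sys


async def stream_for(argv: list[str], seconds: float) -> tuple[int, int]:
    """Read lines from a never-ending process for `seconds`, then stop it.

    Returns (lines_read, return_code).
    """
    proc = await asyncio.create_subprocess_exec(
        *argv, stdout=asyncio.subprocess.PIPE
    )
    lines = 0
    loop = asyncio.get_running_loop()
    deadline = loop.time() + seconds
    try:
        while (remaining := deadline - loop.time()) > 0:
            try:
                line = await asyncio.wait_for(proc.stdout.readline(), remaining)
            except asyncio.TimeoutError:
                break  # deadline reached while waiting for output
            if not line:
                break  # the process exited on its own
            lines += 1
    finally:
        if proc.returncode is None:
            proc.terminate()
        return_code = await proc.wait()
    return lines, return_code


# A stand-in for `yes`: prints forever until it is terminated.
forever = (
    "import time\n"
    "while True:\n"
    "    print('y', flush=True)\n"
    "    time.sleep(0.01)\n"
)
lines_read, code = asyncio.run(stream_for([sys.executable, "-c", forever], 1.0))
print(lines_read, code)
```

Putting the termination in a `finally` block means the child is cleaned up even if the reading loop raises, which matters when the caller is an agent loop that may be cancelled mid-run.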

### Editing Files and Running a Build Process

This example combines filesystem manipulation with a long-running compilation task:

```python
import asyncio

from agents import Runner, function_tool
from agents.run import RunConfig
from agents.sandbox import SandboxAgent, SandboxRunConfig
from agents.sandbox.sandboxes.unix_local import UnixLocalSandboxClient

# Both helpers come from the repository's examples tree, so these imports
# assume the repository root is on the import path.
from examples.sandbox.misc.example_support import text_manifest
from examples.sandbox.misc.workspace_shell import WorkspaceShellCapability


@function_tool
def get_build_flags() -> str:
    """Return compiler flags that the LLM can use."""
    return "-O2 -Wall"


async def main() -> None:
    # Create a manifest with the initial source code.
    manifest = text_manifest(
        {
            "main.c": """
            #include <stdio.h>
            int main() { printf("Hello\\n"); return 0; }
            """
        }
    )

    # Configure the agent with a custom tool plus shell access to the workspace.
    agent = SandboxAgent(
        name="C-builder",
        model="gpt-4o-mini",
        instructions=(
            "You can edit source files, run `gcc` to build, and stream the build output. "
            "Always call `get_build_flags` to obtain compiler options before building."
        ),
        default_manifest=manifest,
        tools=[get_build_flags],
        capabilities=[WorkspaceShellCapability()],
    )

    # Run the build process.
    result = await Runner.run(
        agent,
        "Patch main.c to print 'Hi there' and compile it with gcc, streaming the output.",
        run_config=RunConfig(sandbox=SandboxRunConfig(client=UnixLocalSandboxClient())),
    )

    print("\n=== Final answer ===")
    print(result.final_output)


if __name__ == "__main__":
    asyncio.run(main())
```

Here, the LLM uses the sandbox's file APIs (routed through `SandboxSession`) to read and modify `main.c`, then invokes `pty_exec_start` via the `WorkspaceShellCapability` to stream the `gcc` compilation output in real time.

### Using Docker for Enhanced Isolation

For production workloads requiring stronger isolation than the local Unix client provides, swap the backend to `DockerSandboxClient`:

```python
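from agents.run import RunConfig
from agents.sandbox import SandboxRunConfig
from agents.sandbox.sandboxes.docker import DockerSandboxClient

# Configure the Docker backend, then pass run_config to Runner.run as before.
run_config = RunConfig(
    sandbox=SandboxRunConfig(client=DockerSandboxClient(image="python:3.12-slim"))
)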
```

The same `SandboxAgent` code works unchanged across backends because `SandboxSession` abstracts the transport layer, calling `exec` and `pty_exec_start` on whichever `SandboxClient` implementation you provide.
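The reason backend swapping needs no changes in caller code can be shown with a small interface sketch. Everything here is hypothetical: `CommandBackend`, `LocalBackend`, and `RecordingBackend` are illustrative names, not the SDK's `SandboxClient` hierarchy, and the real interface is likely richer.

```python
import subprocess
import sys
from typing import Protocol


class CommandBackend(Protocol):
    """Hypothetical minimal backend interface; not the SDK's SandboxClient."""

    def exec(self, argv: list[str]) -> str: ...


class LocalBackend:
    """Runs the command as a child process on the host."""

    def exec(self, argv: list[str]) -> str:
        done = subprocess.run(argv, capture_output=True, text=True, check=True)
        return done.stdout


class RecordingBackend:
    """Test double that records commands instead of running them."""

    def __init__(self) -> None:
        self.calls: list[list[str]] = []

    def exec(self, argv: list[str]) -> str:
        self.calls.append(argv)
        return "recorded\n"


def run_task(backend: CommandBackend) -> str:
    """Caller code depends only on the interface, never on a concrete backend."""
    return backend.exec([sys.executable, "-c", "print('built')"]).strip()


local_result = run_task(LocalBackend())
fake = RecordingBackend()
fake_result = run_task(fake)
print(local_result, fake_result)
```

`run_task` plays the role of the agent code: it never names a concrete backend, so replacing a host process with a container, or with a test double, is a one-line change at the call site.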

## Key Source Files and Architecture

Understanding the architecture helps debug and extend sandbox behavior. These files define the core implementation:

| File | Role |
|------|------|
| [`src/agents/sandbox/sandbox_agent.py`](https://github.com/openai/openai-agents-python/blob/main/src/agents/sandbox/sandbox_agent.py) | Public `SandboxAgent` class that glues manifest, capabilities, and tools together. |
| [`src/agents/sandbox/session/sandbox_session.py`](https://github.com/openai/openai-agents-python/blob/main/src/agents/sandbox/session/sandbox_session.py) | Core session implementation – wraps all sandbox ops with audit events, tracing, and PTY support at lines 94-101 (`exec`) and 124-134 (`pty_exec_start`). |
| [`src/agents/sandbox/manifest.py`](https://github.com/openai/openai-agents-python/blob/main/src/agents/sandbox/manifest.py) | Declarative description of the sandbox's filesystem, environment vars, and mount points. |
| [`src/agents/sandbox/sandboxes/unix_local.py`](https://github.com/openai/openai-agents-python/blob/main/src/agents/sandbox/sandboxes/unix_local.py) | Simple local sandbox client used in most examples; runs commands directly on the host with isolation through process namespaces. |
| [`src/agents/sandbox/capabilities/tools/shell_tool.py`](https://github.com/openai/openai-agents-python/blob/main/src/agents/sandbox/capabilities/tools/shell_tool.py) | Implements the `exec` and `pty_exec_start` tools exposed to the model via the shell capability. |
| [`examples/sandbox/misc/workspace_shell.py`](https://github.com/openai/openai-agents-python/blob/main/examples/sandbox/misc/workspace_shell.py) | Example capability bundling the shell tool for workspace operations. |

## Summary

- **SandboxAgent** provides an isolated environment for LLM-driven workflows with full filesystem and process execution capabilities.
- **Manifest** declaratively defines the filesystem workspace available to the agent, configured via `default_manifest`.
- **Long-running tasks** require the **PTY execution method** (`pty_exec_start`) rather than one-shot `exec`, accessible through `WorkspaceShellCapability`.
- **SandboxRunConfig** determines the backend isolation level, with `UnixLocalSandboxClient` for local development and `DockerSandboxClient` for production isolation.
- All operations are traced through `SandboxSession` events, providing full auditability of file and process interactions.

## Frequently Asked Questions

### What is the difference between `exec` and `pty_exec_start` in SandboxAgent?

**`exec`** (implemented in [`src/agents/sandbox/session/sandbox_session.py`](https://github.com/openai/openai-agents-python/blob/main/src/agents/sandbox/session/sandbox_session.py) lines 94-101) runs a one-shot command and captures stdout/stderr after completion, suitable for quick queries like `ls` or `cat`. **`pty_exec_start`** (lines 124-134) creates a pseudo-terminal session that streams output continuously, making it essential for long-running processes like builds, servers, or log monitoring that produce output over time.

### How do I persist files between SandboxAgent runs?

Files written to the workspace last for the duration of a single run. The `Manifest` is a declarative description that you construct at runtime, so it does not carry changes from one run to the next on its own. To persist files across separate agent invocations, either define the file contents in your code when constructing the `Manifest` (using `text_manifest` or file references), or mount external storage volumes via the Manifest's mount configuration. The actual filesystem state is maintained by the `SandboxClient` backend (e.g., Docker volumes for `DockerSandboxClient`).
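One way to picture carrying state between runs is a snapshot-and-restore round trip. The sketch below uses only the standard library and two hypothetical helpers, `snapshot` and `restore`. It assumes text files only and says nothing about how the SDK or its backends actually store workspace state.

```python
import tempfile
from pathlib import Path


def snapshot(root: Path) -> dict[str, str]:
    """Capture every text file under root as {relative_path: contents}."""
    return {
        p.relative_to(root).as_posix(): p.read_text()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def restore(files: dict[str, str], root: Path) -> None:
    """Recreate the captured files under a (possibly empty) root."""
    for rel_path, text in files.items():
        target = root / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(text)


# "Run 1": the workspace gains a build log, then the directory is discarded.
with tempfile.TemporaryDirectory() as first:
    restore({"main.c": "int main(void) { return 0; }\n"}, Path(first))
    (Path(first) / "out").mkdir()
    (Path(first) / "out" / "build.log").write_text("ok\n")
    saved = snapshot(Path(first))

# "Run 2": a fresh workspace is seeded from the saved snapshot.
with tempfile.TemporaryDirectory() as second:
    restore(saved, Path(second))
    carried_over = sorted(snapshot(Path(second)))
print(carried_over)
```

A mapping like `saved` has the same shape as the dictionary passed to `text_manifest` in the build example, so under that assumption it could seed the next run's manifest. For large or binary artifacts, a mounted volume is the more practical route.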

### Can I use SandboxAgent with custom tools alongside filesystem access?

Yes, `SandboxAgent` accepts both **capabilities** (like `WorkspaceShellCapability`) and standard **tools** simultaneously. The `tools` parameter accepts Python functions decorated with `@function_tool`, while `capabilities` accepts objects that expose sandbox-specific operations. In the second example above, the agent uses `get_build_flags` (a custom function tool) alongside the shell capability to compile code, demonstrating how filesystem operations and custom logic work together.

### Which sandbox backend should I choose for production?

For **development and testing**, use `UnixLocalSandboxClient` (from [`src/agents/sandbox/sandboxes/unix_local.py`](https://github.com/openai/openai-agents-python/blob/main/src/agents/sandbox/sandboxes/unix_local.py)), which runs commands directly on the host with minimal overhead. For **production** workloads that require stronger isolation, use `DockerSandboxClient` (from [`src/agents/sandbox/sandboxes/docker.py`](https://github.com/openai/openai-agents-python/blob/main/src/agents/sandbox/sandboxes/docker.py)) with a specific container image (e.g., `python:3.12-slim`). Long-running processes and filesystem modifications then stay contained within the Docker environment and cannot affect the host system.