How to Use ComputerTool and AsyncComputer for Browser Automation in openai-agents-python

The ComputerTool class wraps a concrete AsyncComputer implementation to enable LLMs to control real browsers, handling automated navigation, clicking, typing, and screenshot capture through the Responses API.

The openai-agents-python SDK provides first-class support for browser automation via the Computer and AsyncComputer abstractions defined in src/agents/computer.py. By implementing one of these interfaces and wrapping it with ComputerTool, you can equip agents with the ability to interact with live web pages using real browser instances. This guide explains the architecture and walks through implementations for both shared and isolated browser sessions.

Architecture Overview

The library separates browser automation into three distinct layers: the abstract driver interface, the tool wrapper, and lifecycle helpers.

Abstract Computer Interfaces

The core abstractions are defined in src/agents/computer.py. The AsyncComputer class defines the asynchronous interface that modern agents require:


# src/agents/computer.py

class AsyncComputer(abc.ABC):
    @abc.abstractmethod
    async def screenshot(self) -> str: ...
    
    @abc.abstractmethod
    async def click(self, x: int, y: int, button: Button) -> None: ...
    
    @abc.abstractmethod
    async def double_click(self, x: int, y: int) -> None: ...
    
    @abc.abstractmethod
    async def scroll(self, x: int, y: int, scroll_x: int, scroll_y: int) -> None: ...
    
    @abc.abstractmethod
    async def type(self, text: str) -> None: ...
    
    @abc.abstractmethod
    async def wait(self) -> None: ...
    
    @abc.abstractmethod
    async def move(self, x: int, y: int) -> None: ...
    
    @abc.abstractmethod
    async def keypress(self, keys: list[str]) -> None: ...
    
    @abc.abstractmethod
    async def drag(self, path: list[tuple[int, int]]) -> None: ...

A synchronous Computer base class with the same methods exists for legacy implementations. The library detects which interface your concrete implementation uses and dispatches calls accordingly, but new drivers should implement AsyncComputer so that browser I/O does not block the event loop.
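For unit tests it can help to have a driver that never touches a browser. The following is a minimal sketch under stated assumptions: AsyncComputerLike is a local stand-in that mirrors three of the methods listed above so the snippet runs without the SDK installed, the Button literal is assumed to match the SDK's type, and RecordingComputer is a hypothetical name. In a real project you would subclass agents.AsyncComputer and implement every abstract method.

```python
import abc
import asyncio
import base64
from typing import Literal

# Assumed to match the SDK's Button type; defined locally so the sketch is self-contained.
Button = Literal["left", "right", "wheel", "back", "forward"]


class AsyncComputerLike(abc.ABC):
    """Local stand-in mirroring a subset of the interface shown above."""

    @abc.abstractmethod
    async def screenshot(self) -> str: ...

    @abc.abstractmethod
    async def click(self, x: int, y: int, button: Button) -> None: ...

    @abc.abstractmethod
    async def type(self, text: str) -> None: ...


class RecordingComputer(AsyncComputerLike):
    """Fake driver that records every action instead of driving a browser."""

    def __init__(self) -> None:
        self.actions: list[tuple] = []

    async def screenshot(self) -> str:
        self.actions.append(("screenshot",))
        # Placeholder bytes, not a real PNG; the interface only asks for a base64 string.
        return base64.b64encode(b"fake-image-bytes").decode("utf-8")

    async def click(self, x: int, y: int, button: Button) -> None:
        self.actions.append(("click", x, y, button))

    async def type(self, text: str) -> None:
        self.actions.append(("type", text))


async def demo() -> list[tuple]:
    computer = RecordingComputer()
    await computer.click(10, 20, "left")
    await computer.type("hello")
    await computer.screenshot()
    return computer.actions
```

Calling asyncio.run(demo()) returns the recorded actions in call order, which makes it straightforward to assert on what an agent asked the driver to do.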

The ComputerTool Wrapper

ComputerTool acts as the bridge between your driver and the LLM. Defined in src/agents/tool.py (lines 85-110), it registers your computer instance or factory and exposes the tool as computer_use_preview to the model:


# src/agents/tool.py

@dataclass(eq=False)
class ComputerTool(Generic[ComputerT]):
    computer: ComputerT | ComputerCreate[ComputerT] | ComputerProvider[ComputerT]
    on_safety_check: Callable[[ComputerToolSafetyCheckData], MaybeAwaitable[bool]] | None = None

    @property
    def name(self):
        return "computer_use_preview"  # Historic name for backward compatibility

    @property
    def trace_name(self):
        return "computer"

The computer parameter accepts three forms:

  • A concrete AsyncComputer instance for singleton usage
  • A ComputerCreate callable that receives RunContextWrapper and returns a computer
  • A ComputerProvider with explicit create and dispose callbacks

Lifecycle Management

The library manages browser instances through resolve_computer and dispose_resolved_computers in src/agents/tool.py (lines 29-84). When the runner encounters a computer_use_preview call, it:

  1. Checks the weak-reference cache for an existing computer instance tied to the current run context
  2. If missing, invokes the registered initializer (factory, provider, or direct instance)
  3. Awaits the result if the initializer is asynchronous
  4. Stores the resolved computer and optional dispose hook
  5. Dispatches the action to the appropriate method (click, type, etc.)

After the agent run completes, dispose_resolved_computers walks the cache, calling any registered cleanup callbacks to prevent leaked browser processes.

Implementing a Concrete Browser Driver

To use ComputerTool, you must implement the AsyncComputer interface. The official example in examples/tools/computer_use.py provides a complete Playwright-based implementation.


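# examples/tools/computer_use.py (excerpt)

import base64

from playwright.async_api import Browser, Page, Playwright, async_playwright

from agents import AsyncComputer, Button


class LocalPlaywrightComputer(AsyncComputer):
    def __init__(self):
        self._playwright: Playwright | None = None
        self._browser: Browser | None = None
        self._page: Page | None = None

    # Accessors used by the methods below; they fail fast if the browser is not open yet.
    @property
    def playwright(self) -> Playwright:
        assert self._playwright is not None
        return self._playwright

    @property
    def page(self) -> Page:
        assert self._page is not None
        return self._page

    @property
    def dimensions(self) -> tuple[int, int]:
        return (1024, 768)  # Viewport size in pixels

    async def __aenter__(self):
        self._playwright = await async_playwright().start()
        self._browser, self._page = await self._get_browser_and_page()
        return self

    async def __aexit__(self, exc_type, exc_val, exc_tb):
        if self._browser:
            await self._browser.close()
        if self._playwright:
            await self._playwright.stop()

    # Explicit open/close for callers that cannot use `async with` (see the per-request pattern).
    async def open(self) -> "LocalPlaywrightComputer":
        await self.__aenter__()
        return self

    async def close(self) -> None:
        await self.__aexit__(None, None, None)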
    async def _get_browser_and_page(self) -> tuple[Browser, Page]:
        width, height = self.dimensions
        launch_args = [f"--window-size={width},{height}"]
        browser = await self.playwright.chromium.launch(headless=False, args=launch_args)
        page = await browser.new_page()
        await page.set_viewport_size({"width": width, "height": height})
        await page.goto("https://www.bing.com")
        return browser, page

    async def screenshot(self) -> str:
        png = await self.page.screenshot(full_page=False)
        return base64.b64encode(png).decode("utf-8")

    async def click(self, x: int, y: int, button: Button) -> None:
        await self.page.mouse.click(x, y, button=button)

    async def type(self, text: str) -> None:
        await self.page.keyboard.type(text)

    # ... implement double_click, scroll, wait, move, keypress, drag similarly

Note that LocalPlaywrightComputer implements both the AsyncComputer interface and the asynchronous context manager protocol, so it supports both the singleton and per-request usage patterns.
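The methods elided from the listing above might look roughly like the following. This is a sketch under assumptions rather than the example's exact code: it is written against any object that exposes Playwright's page.mouse, page.keyboard, and page.evaluate, so it can be exercised with a fake page; RemainingActions is a hypothetical mixin name, KEY_MAP is a small illustrative mapping, and the one-second pause in wait is an arbitrary choice.

```python
import asyncio
from typing import Any

# Illustrative mapping from model key names to Playwright key names (assumed, not exhaustive).
KEY_MAP = {"ctrl": "Control", "alt": "Alt", "shift": "Shift", "enter": "Enter", "esc": "Escape"}


class RemainingActions:
    """Mixin-style sketch; expects self.page to be a Playwright Page or a look-alike."""

    page: Any

    async def double_click(self, x: int, y: int) -> None:
        await self.page.mouse.dblclick(x, y)

    async def scroll(self, x: int, y: int, scroll_x: int, scroll_y: int) -> None:
        # Move to the target point, then scroll the window by the requested delta.
        await self.page.mouse.move(x, y)
        await self.page.evaluate(f"window.scrollBy({scroll_x}, {scroll_y})")

    async def wait(self) -> None:
        await asyncio.sleep(1)  # Arbitrary pause to let the page settle.

    async def move(self, x: int, y: int) -> None:
        await self.page.mouse.move(x, y)

    async def keypress(self, keys: list[str]) -> None:
        mapped = [KEY_MAP.get(key.lower(), key) for key in keys]
        for key in mapped:  # Hold the keys down in order...
            await self.page.keyboard.down(key)
        for key in reversed(mapped):  # ...then release them in reverse order.
            await self.page.keyboard.up(key)

    async def drag(self, path: list[tuple[int, int]]) -> None:
        if not path:
            return
        # Press at the first point, trace the remaining points, then release.
        await self.page.mouse.move(*path[0])
        await self.page.mouse.down()
        for px, py in path[1:]:
            await self.page.mouse.move(px, py)
        await self.page.mouse.up()
```

Pressing keys down in order and releasing them in reverse lets a list such as ["ctrl", "a"] behave as a chord rather than as two separate key taps.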

Usage Patterns

You can configure ComputerTool with either a shared browser instance or a factory that creates fresh browsers for each agent invocation.

Singleton Pattern

Pass a pre-initialized AsyncComputer instance to share one browser across multiple agent runs:

async def singleton_computer():
    async with LocalPlaywrightComputer() as computer:
        agent = Agent(
            name="Browser Assistant",
            instructions="Help the user with web tasks.",
            tools=[ComputerTool(computer=computer)],
            model="gpt-5.4",  # Model bundling built-in computer tool support
        )
        result = await Runner.run(agent, "Find the current weather in Tokyo")
        print(result.final_output)

This pattern maintains a single Playwright process, minimizing startup overhead for sequential tasks.

Per-Request Pattern

Use ComputerProvider to create and dispose of browser instances for each agent execution, ensuring complete isolation between runs:

async def computer_per_request():
    async def create_computer(*, run_context: RunContextWrapper[Any]) -> LocalPlaywrightComputer:
        print(f"Creating browser for context: {run_context}")
        return await LocalPlaywrightComputer().open()

    async def dispose_computer(*, run_context: RunContextWrapper[Any], computer: LocalPlaywrightComputer) -> None:
        print(f"Disposing browser for context: {run_context}")
        await computer.close()

    provider = ComputerProvider[LocalPlaywrightComputer](
        create=create_computer,
        dispose=dispose_computer,
    )
    
    agent = Agent(
        name="Isolated Browser Assistant",
        instructions="Help the user with web tasks.",
        tools=[ComputerTool(computer=provider)],
        model="gpt-5.4",
    )
    result = await Runner.run(agent, "Check the latest news")
    print(result.final_output)

The dispose callback ensures resources are cleaned up even if the agent encounters errors.

Running the Examples

First, install Playwright browsers:

uv run python -m playwright install chromium

Execute the singleton example:

uv run -m examples.tools.computer_use singleton

Run the per-request version:

uv run -m examples.tools.computer_use

Both commands launch a visible Chromium window and demonstrate the LLM navigating to search engines and interacting with page elements.

Summary

  • Implement the AsyncComputer interface defined in src/agents/computer.py to provide browser actions like screenshot, click, and type.
  • Wrap implementations with ComputerTool from src/agents/tool.py to expose them to the Responses API as computer_use_preview.
  • Manage lifecycle using ComputerProvider for per-request isolation or pass direct instances for singleton reuse.
  • Reference the Playwright example in examples/tools/computer_use.py for a complete driver that supports async context managers.
  • Rely on the runner to call dispose_resolved_computers after each run; call it yourself only when you manage run contexts outside the standard Runner flow.

Frequently Asked Questions

What is the difference between Computer and AsyncComputer?

Computer defines synchronous methods for legacy implementations, while AsyncComputer provides async def methods required by the modern Responses API and current GPT models. The runner automatically detects which interface your concrete implementation uses, but you should implement AsyncComputer for new code to avoid blocking the event loop during browser I/O.

How does the ComputerTool handle safety checks?

The ComputerTool accepts an optional on_safety_check callback that receives ComputerToolSafetyCheckData and returns a boolean. If the callback returns False, the action is aborted. This allows you to intercept sensitive actions like downloads or navigation to specific domains before they execute in the browser.
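A callback of that shape could look like the sketch below. To keep it runnable without the SDK, SafetyCheck and SafetyCheckData are local stand-ins; the safety_check, code, and message field names, the example code strings, and the AUTO_APPROVED_CODES policy are all assumptions for illustration, so check them against ComputerToolSafetyCheckData in your installed version.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class SafetyCheck:
    """Stand-in for a pending safety check; the code/message fields are assumptions."""
    code: str
    message: str


@dataclass
class SafetyCheckData:
    """Stand-in for ComputerToolSafetyCheckData, reduced to the one field used here."""
    safety_check: SafetyCheck


# Hypothetical policy: codes this application is willing to acknowledge automatically.
AUTO_APPROVED_CODES = {"irrelevant_domain"}


async def on_safety_check(data: SafetyCheckData) -> bool:
    """Return True to let the action proceed, False to abort it."""
    approved = data.safety_check.code in AUTO_APPROVED_CODES
    if not approved:
        print(f"Blocked: {data.safety_check.code}: {data.safety_check.message}")
    return approved
```

With the real types in place, the function would be passed as ComputerTool(computer=..., on_safety_check=on_safety_check). An allowlist is used here so that unknown codes are rejected by default.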

Can I use a synchronous Computer implementation with the Responses API?

While the library technically supports synchronous Computer implementations through internal adaptation, the Responses API and modern agents expect asynchronous I/O. Using a sync implementation causes the event loop to block during browser operations, degrading performance. You should migrate to AsyncComputer for all new browser automation tasks.
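If an existing synchronous driver cannot be rewritten right away, one general asyncio technique is to run its blocking calls in a worker thread. The sketch below illustrates that idea only; it is not how the SDK adapts sync drivers, and SlowSyncComputer and ThreadedAdapter are hypothetical names.

```python
import asyncio
import time


class SlowSyncComputer:
    """Hypothetical synchronous driver whose calls block the calling thread."""

    def __init__(self) -> None:
        self.typed: list[str] = []

    def type(self, text: str) -> None:
        time.sleep(0.05)  # Simulates blocking browser I/O.
        self.typed.append(text)


class ThreadedAdapter:
    """Exposes an async type() by running the sync call in a worker thread."""

    def __init__(self, inner: SlowSyncComputer) -> None:
        self._inner = inner

    async def type(self, text: str) -> None:
        await asyncio.to_thread(self._inner.type, text)


async def demo() -> tuple[list[str], int]:
    adapter = ThreadedAdapter(SlowSyncComputer())
    ticks = 0

    async def heartbeat() -> None:
        # Counts how often the event loop gets to run while the driver is busy.
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0.01)

    task = asyncio.create_task(heartbeat())
    await adapter.type("hello")  # The event loop keeps running heartbeat() meanwhile.
    task.cancel()
    return adapter._inner.typed, ticks
```

The heartbeat counter advances while type() is in flight, which shows that the loop stays responsive. Calling the sync method directly from a coroutine would stall every other task for the duration of the call.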

How do I ensure browser processes are cleaned up after an agent run?

When using the per-request pattern with ComputerProvider, the library automatically calls your dispose callback at the end of the run. For singleton patterns, implement __aexit__ to handle cleanup when the application shuts down. Always ensure dispose_resolved_computers is called if you manually manage run contexts outside the standard Runner flow.
