How to Use ComputerTool and AsyncComputer for Browser Automation in openai-agents-python
The ComputerTool class wraps a concrete AsyncComputer implementation to enable LLMs to control real browsers, handling automated navigation, clicking, typing, and screenshot capture through the Responses API.
The openai-agents-python SDK supports browser automation through the Computer and AsyncComputer abstractions defined in src/agents/computer.py. By implementing one of these interfaces and wrapping it with ComputerTool, you can let agents interact with live web pages in a real browser instance. This guide explains the architecture and walks through implementations for both shared and isolated browser sessions.
Architecture Overview
The library separates browser automation into three distinct layers: the abstract driver interface, the tool wrapper, and lifecycle helpers.
Abstract Computer Interfaces
The core abstractions are defined in src/agents/computer.py. The AsyncComputer class defines the asynchronous interface that modern agents require:
# src/agents/computer.py
class AsyncComputer(abc.ABC):
    @abc.abstractmethod
    async def screenshot(self) -> str: ...

    @abc.abstractmethod
    async def click(self, x: int, y: int, button: Button) -> None: ...

    @abc.abstractmethod
    async def double_click(self, x: int, y: int) -> None: ...

    @abc.abstractmethod
    async def scroll(self, x: int, y: int, scroll_x: int, scroll_y: int) -> None: ...

    @abc.abstractmethod
    async def type(self, text: str) -> None: ...

    @abc.abstractmethod
    async def wait(self) -> None: ...

    @abc.abstractmethod
    async def move(self, x: int, y: int) -> None: ...

    @abc.abstractmethod
    async def keypress(self, keys: list[str]) -> None: ...

    @abc.abstractmethod
    async def drag(self, path: list[tuple[int, int]]) -> None: ...
A synchronous Computer base class also exists, declaring the same methods without async. The library detects which interface your concrete implementation uses and dispatches calls accordingly, but a synchronous driver blocks the event loop during browser I/O, so AsyncComputer is the better choice for new code.
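The dual-interface dispatch described above can be illustrated with a stdlib-only sketch. `SyncFake`, `AsyncFake`, and `call_action` are hypothetical names introduced here, and the SDK's own detection logic may differ. The helper awaits the result only when the driver returned an awaitable.

```python
import asyncio
import inspect
from typing import Any


class SyncFake:
    """Hypothetical synchronous driver that records actions."""

    def __init__(self) -> None:
        self.log: list[str] = []

    def click(self, x: int, y: int, button: str) -> None:
        self.log.append(f"click {x},{y} {button}")


class AsyncFake:
    """Hypothetical asynchronous driver exposing the same method name."""

    def __init__(self) -> None:
        self.log: list[str] = []

    async def click(self, x: int, y: int, button: str) -> None:
        await asyncio.sleep(0)  # stand-in for real browser I/O
        self.log.append(f"click {x},{y} {button}")


async def call_action(computer: Any, name: str, *args: Any) -> Any:
    """Look up the action by name and await it only if it is awaitable."""
    result = getattr(computer, name)(*args)
    if inspect.isawaitable(result):
        return await result
    return result


async def demo() -> tuple[list[str], list[str]]:
    sync_driver, async_driver = SyncFake(), AsyncFake()
    await call_action(sync_driver, "click", 10, 20, "left")
    await call_action(async_driver, "click", 10, 20, "left")
    return sync_driver.log, async_driver.log
```

Both drivers end up with the same recorded action, which is the property a caller relies on when it does not know which interface it was handed.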
The ComputerTool Wrapper
ComputerTool acts as the bridge between your driver and the LLM. Defined in src/agents/tool.py (lines 85-110), it registers your computer instance or factory and exposes the tool as computer_use_preview to the model:
# src/agents/tool.py
@dataclass(eq=False)
class ComputerTool(Generic[ComputerT]):
    computer: ComputerT | ComputerCreate[ComputerT] | ComputerProvider[ComputerT]
    on_safety_check: Callable[[ComputerToolSafetyCheckData], MaybeAwaitable[bool]] | None = None

    @property
    def name(self):
        return "computer_use_preview"  # Historic name for backward compatibility

    @property
    def trace_name(self):
        return "computer"
The computer parameter accepts three forms:
- A concrete AsyncComputer instance for singleton usage
- A ComputerCreate callable that receives RunContextWrapper and returns a computer
- A ComputerProvider with explicit create and dispose callbacks
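One way to picture how the three forms differ is a small resolver that normalizes each of them to a (computer, dispose hook) pair. This is a stdlib-only sketch: `FakeComputer`, `FakeProvider`, and `resolve` are hypothetical stand-ins, not the SDK's types or its actual resolution code.

```python
import asyncio
import inspect
from dataclasses import dataclass
from typing import Any, Callable, Optional


class FakeComputer:
    """Hypothetical stand-in for an AsyncComputer implementation."""


@dataclass
class FakeProvider:
    """Hypothetical stand-in mirroring a provider's create/dispose pair."""

    create: Callable[..., Any]
    dispose: Optional[Callable[..., Any]] = None


async def resolve(spec: Any, run_context: Any) -> tuple[Any, Optional[Callable[..., Any]]]:
    """Return (computer, dispose_hook) for any of the three accepted forms."""
    if isinstance(spec, FakeProvider):
        factory, dispose = spec.create, spec.dispose
    elif callable(spec):
        factory, dispose = spec, None  # bare factory: nothing to dispose
    else:
        return spec, None  # already a concrete instance
    computer = factory(run_context=run_context)
    if inspect.isawaitable(computer):  # factories may be sync or async
        computer = await computer
    return computer, dispose


async def demo() -> list[str]:
    async def make(*, run_context: Any) -> FakeComputer:
        return FakeComputer()

    async def cleanup(*, run_context: Any, computer: Any) -> None:
        pass

    kinds = []
    for spec in (FakeComputer(), make, FakeProvider(create=make, dispose=cleanup)):
        computer, dispose = await resolve(spec, run_context="ctx")
        kinds.append(f"{type(computer).__name__}:{dispose is not None}")
    return kinds
```

Only the provider form carries a dispose hook in this sketch, which matches the list above: the instance and bare-factory forms leave cleanup to the caller.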
Lifecycle Management
The library manages browser instances through resolve_computer and dispose_resolved_computers in src/agents/tool.py (lines 29-84). When the runner encounters a computer_use_preview call, it:
- Checks the weak-reference cache for an existing computer instance tied to the current run context
- If missing, invokes the registered initializer (factory, provider, or direct instance)
- Awaits the result if the initializer is asynchronous
- Stores the resolved computer and optional dispose hook
- Dispatches the action to the appropriate method (click, type, etc.)
After the agent run completes, dispose_resolved_computers walks the cache, calling any registered cleanup callbacks to prevent leaked browser processes.
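The cache-then-dispose flow can be sketched with `weakref.WeakKeyDictionary` from the standard library. The names `RunContext`, `FakeBrowser`, and `ComputerCache` are hypothetical, and this is an illustration of the idea rather than the code in src/agents/tool.py.

```python
import asyncio
import weakref
from typing import Awaitable, Callable


class RunContext:
    """Hypothetical per-run object; plain class instances can be weakly referenced."""


class FakeBrowser:
    def __init__(self) -> None:
        self.closed = False

    async def close(self) -> None:
        self.closed = True


class ComputerCache:
    """Caches one computer per run context and disposes of it on request."""

    def __init__(self, create: Callable[[], Awaitable[FakeBrowser]]) -> None:
        self._create = create
        # Keys are held weakly, so an entry disappears once its run context
        # is garbage collected.
        self._entries: "weakref.WeakKeyDictionary[RunContext, FakeBrowser]" = (
            weakref.WeakKeyDictionary()
        )
        self.created = 0

    async def resolve(self, ctx: RunContext) -> FakeBrowser:
        browser = self._entries.get(ctx)
        if browser is None:  # cache miss: run the initializer once
            browser = await self._create()
            self._entries[ctx] = browser
            self.created += 1
        return browser

    async def dispose(self, ctx: RunContext) -> None:
        browser = self._entries.pop(ctx, None)
        if browser is not None:
            await browser.close()


async def demo() -> tuple[int, bool, bool]:
    async def create() -> FakeBrowser:
        return FakeBrowser()

    cache = ComputerCache(create)
    ctx = RunContext()
    first = await cache.resolve(ctx)
    second = await cache.resolve(ctx)  # cache hit, no second browser
    await cache.dispose(ctx)
    return cache.created, first is second, first.closed
```

Keying the cache by run context is what lets several tool calls within one run share a browser while separate runs stay isolated.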
Implementing a Concrete Browser Driver
To use ComputerTool, you must implement the AsyncComputer interface. The official example in examples/tools/computer_use.py provides a complete Playwright-based implementation.
# examples/tools/computer_use.py
import base64

from playwright.async_api import Browser, Page, Playwright, async_playwright

from agents import AsyncComputer, Button


class LocalPlaywrightComputer(AsyncComputer):
    def __init__(self):
        self._playwright: Playwright | None = None
        self._browser: Browser | None = None
        self._page: Page | None = None

    # self.playwright, self.page, and self.dimensions used below are accessors
    # on the full class that are not shown in this excerpt.

    async def __aenter__(self):
        # Start Playwright, then launch the browser and open the first page
        self._playwright = await async_playwright().start()
        self._browser, self._page = await self._get_browser_and_page()
        return self

    async def __aexit__(self, exc_type, exc_val, exc_tb):
        if self._browser:
            await self._browser.close()
        if self._playwright:
            await self._playwright.stop()

    async def _get_browser_and_page(self) -> tuple[Browser, Page]:
        width, height = self.dimensions
        launch_args = [f"--window-size={width},{height}"]
        browser = await self.playwright.chromium.launch(headless=False, args=launch_args)
        page = await browser.new_page()
        await page.set_viewport_size({"width": width, "height": height})
        await page.goto("https://www.bing.com")
        return browser, page

    async def screenshot(self) -> str:
        # Capture the visible viewport and return it as base64-encoded PNG
        png = await self.page.screenshot(full_page=False)
        return base64.b64encode(png).decode("utf-8")

    async def click(self, x: int, y: int, button: Button) -> None:
        # Playwright accepts only "left", "middle", or "right";
        # map any other button value before calling this.
        await self.page.mouse.click(x, y, button=button)

    async def type(self, text: str) -> None:
        await self.page.keyboard.type(text)

    # ... implement double_click, scroll, wait, move, keypress, drag similarly
Note that LocalPlaywrightComputer implements both AsyncComputer and asynchronous context manager protocols, supporting both singleton and per-request usage patterns.
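The methods elided from the excerpt could be filled in along the following lines. This is a sketch, not the example file's code: `RemainingActionsMixin`, `KEY_MAP`, and `wait_seconds` are hypothetical names, and the scroll strategy (mouse wheel) is an assumption. The Playwright calls used (`mouse.dblclick`, `mouse.move`, `mouse.wheel`, `mouse.down`, `mouse.up`, `keyboard.down`, `keyboard.up`) are real, but the sketch runs against a recording `FakePage` so it can be tried without a browser.

```python
import asyncio
from typing import Any

# Illustrative subset of key-name translations; extend it for your model's output.
KEY_MAP = {"enter": "Enter", "ctrl": "Control", "esc": "Escape", "alt": "Alt", "shift": "Shift"}


class RemainingActionsMixin:
    """Hypothetical mixin; expects `self.page` to behave like a Playwright Page."""

    page: Any
    wait_seconds: float = 1.0  # assumed pause length for `wait`

    async def double_click(self, x: int, y: int) -> None:
        await self.page.mouse.dblclick(x, y)

    async def scroll(self, x: int, y: int, scroll_x: int, scroll_y: int) -> None:
        await self.page.mouse.move(x, y)  # scroll where the pointer is
        await self.page.mouse.wheel(scroll_x, scroll_y)

    async def wait(self) -> None:
        await asyncio.sleep(self.wait_seconds)

    async def move(self, x: int, y: int) -> None:
        await self.page.mouse.move(x, y)

    async def keypress(self, keys: list[str]) -> None:
        mapped = [KEY_MAP.get(key.lower(), key) for key in keys]
        for key in mapped:  # press the whole chord...
            await self.page.keyboard.down(key)
        for key in reversed(mapped):  # ...then release in reverse order
            await self.page.keyboard.up(key)

    async def drag(self, path: list[tuple[int, int]]) -> None:
        if not path:
            return
        await self.page.mouse.move(*path[0])
        await self.page.mouse.down()
        for px, py in path[1:]:
            await self.page.mouse.move(px, py)
        await self.page.mouse.up()


class _Recorder:
    """Test double: records any awaited call as (name, args)."""

    def __init__(self, log: list, prefix: str) -> None:
        self._log, self._prefix = log, prefix

    def __getattr__(self, name: str):
        async def call(*args: Any) -> None:
            self._log.append((f"{self._prefix}.{name}", args))

        return call


class FakePage:
    """Dry-run page so the mixin can be exercised without a browser."""

    def __init__(self) -> None:
        self.log: list = []
        self.mouse = _Recorder(self.log, "mouse")
        self.keyboard = _Recorder(self.log, "keyboard")


class DryRunComputer(RemainingActionsMixin):
    def __init__(self) -> None:
        self.page = FakePage()


async def demo() -> list:
    computer = DryRunComputer()
    await computer.drag([(0, 0), (5, 5)])
    await computer.keypress(["CTRL", "a"])
    return computer.page.log
```

In a real driver the mixin would sit alongside the Playwright-backed class, with `self.page` supplied by the browser session instead of `FakePage`.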
Usage Patterns
You can configure ComputerTool with either a shared browser instance or a factory that creates fresh browsers for each agent invocation.
Singleton Pattern
Pass a pre-initialized AsyncComputer instance to share one browser across multiple agent runs:
from agents import Agent, ComputerTool, Runner


async def singleton_computer():
    # The context manager starts the browser once and closes it on exit
    async with LocalPlaywrightComputer() as computer:
        agent = Agent(
            name="Browser Assistant",
            instructions="Help the user with web tasks.",
            tools=[ComputerTool(computer=computer)],
            model="gpt-5.4",  # Model bundling built-in computer tool support
        )
        result = await Runner.run(agent, "Find the current weather in Tokyo")
        print(result.final_output)
This pattern maintains a single Playwright process, minimizing startup overhead for sequential tasks.
Per-Request Pattern
Use ComputerProvider to create and dispose of browser instances for each agent execution, ensuring complete isolation between runs:
from typing import Any

from agents import Agent, ComputerTool, RunContextWrapper, Runner
from agents.tool import ComputerProvider


async def computer_per_request():
    async def create_computer(*, run_context: RunContextWrapper[Any]) -> LocalPlaywrightComputer:
        print(f"Creating browser for context: {run_context}")
        # open() is not shown in the excerpt above; presumably it starts the
        # browser the way __aenter__ does and returns the driver.
        return await LocalPlaywrightComputer().open()

    async def dispose_computer(*, run_context: RunContextWrapper[Any], computer: LocalPlaywrightComputer) -> None:
        print(f"Disposing browser for context: {run_context}")
        # close() is likewise not shown; presumably it mirrors __aexit__.
        await computer.close()

    provider = ComputerProvider[LocalPlaywrightComputer](
        create=create_computer,
        dispose=dispose_computer,
    )
    agent = Agent(
        name="Isolated Browser Assistant",
        instructions="Help the user with web tasks.",
        tools=[ComputerTool(computer=provider)],
        model="gpt-5.4",
    )
    result = await Runner.run(agent, "Check the latest news")
    print(result.final_output)
The dispose callback ensures resources are cleaned up even if the agent encounters errors.
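The cleanup-on-error guarantee is the familiar try/finally pairing of a create step with a dispose step. The sketch below shows that pairing in isolation; `run_with_computer` and the callbacks inside `demo` are hypothetical names, and this is not the runner's actual code.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def run_with_computer(
    create: Callable[[], Awaitable[Any]],
    dispose: Callable[[Any], Awaitable[None]],
    task: Callable[[Any], Awaitable[Any]],
) -> Any:
    """Create a computer, run the task, and always dispose afterwards."""
    computer = await create()
    try:
        return await task(computer)
    finally:
        # Runs on success and on failure alike
        await dispose(computer)


async def demo() -> list[str]:
    events: list[str] = []

    async def create() -> str:
        events.append("create")
        return "browser"

    async def dispose(computer: str) -> None:
        events.append(f"dispose {computer}")

    async def failing_task(computer: str) -> None:
        raise RuntimeError("agent step failed")

    try:
        await run_with_computer(create, dispose, failing_task)
    except RuntimeError:
        events.append("error propagated")
    return events
```

The dispose step runs before the exception reaches the caller, so the browser is gone by the time any error handling starts.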
Running the Examples
First, install Playwright browsers:
uv run python -m playwright install chromium
Execute the singleton example:
uv run -m examples.tools.computer_use singleton
Run the per-request version:
uv run -m examples.tools.computer_use
Both commands launch a visible Chromium window and demonstrate the LLM navigating to search engines and interacting with page elements.
Summary
- Implement the AsyncComputer interface from src/agents/computer.py to define browser actions like screenshot, click, and type.
- Wrap implementations with ComputerTool from src/agents/tool.py to expose them to the Responses API as computer_use_preview.
- Manage lifecycle using ComputerProvider for per-request isolation or pass direct instances for singleton reuse.
- Reference the Playwright example in examples/tools/computer_use.py for a complete driver supporting async context managers.
- Call dispose_resolved_computers yourself only when you manage run contexts outside the standard Runner flow; otherwise cleanup runs after the agent run completes.
Frequently Asked Questions
What is the difference between Computer and AsyncComputer?
Computer declares the driver methods synchronously, while AsyncComputer declares the same methods with async def. The runner automatically detects which interface your concrete implementation uses, but you should implement AsyncComputer for new code to avoid blocking the event loop during browser I/O.
How does the ComputerTool handle safety checks?
The ComputerTool accepts an optional on_safety_check callback that receives ComputerToolSafetyCheckData and returns a boolean, either directly or as an awaitable. The callback is consulted when a computer call arrives with a pending safety check: returning True acknowledges the check, and returning False stops the action from proceeding. This gives you a place to apply your own policy, or to ask a human, before a flagged action executes in the browser.
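A callback of this shape might look like the following. The dataclasses here are hypothetical stand-ins so the sketch runs without the SDK; the field names (`safety_check`, `id`, `code`, `message`) and the code strings are assumptions to verify against ComputerToolSafetyCheckData in your installed version.

```python
from dataclasses import dataclass


@dataclass
class FakeSafetyCheck:
    """Hypothetical stand-in for one pending safety check."""

    id: str
    code: str
    message: str


@dataclass
class FakeSafetyCheckData:
    """Hypothetical stand-in for the data object passed to the callback."""

    safety_check: FakeSafetyCheck


# Assumed policy: the only codes this application is willing to acknowledge.
ACKNOWLEDGEABLE_CODES = {"irrelevant_domain"}


def on_safety_check(data: FakeSafetyCheckData) -> bool:
    """Acknowledge allow-listed checks and decline everything else."""
    check = data.safety_check
    approved = check.code in ACKNOWLEDGEABLE_CODES
    verdict = "acknowledged" if approved else "declined"
    print(f"safety check {check.id} ({check.code}): {verdict}")
    return approved
```

An allow-list keeps the default conservative: any check code the application has not explicitly reviewed is declined.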
Can I use a synchronous Computer implementation with the Responses API?
Yes. The library accepts synchronous Computer implementations and dispatches to them, but a sync driver blocks the event loop during browser operations, which degrades performance. Prefer AsyncComputer for all new browser automation tasks.
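If a driver is only available in blocking form, one possible migration path is to wrap it so each call runs in a worker thread. The sketch below uses `asyncio.to_thread` (Python 3.9+); `SyncDriver` and `ThreadedAdapter` are hypothetical names, and this is not how the library adapts sync drivers internally.

```python
import asyncio
import time


class SyncDriver:
    """Hypothetical blocking driver."""

    def screenshot(self) -> str:
        time.sleep(0.1)  # stand-in for blocking browser I/O
        return "fake-screenshot"


class ThreadedAdapter:
    """Hypothetical adapter exposing a blocking driver through async methods."""

    def __init__(self, inner: SyncDriver) -> None:
        self._inner = inner

    async def screenshot(self) -> str:
        # The blocking call runs in a worker thread, so the loop stays free
        return await asyncio.to_thread(self._inner.screenshot)


async def demo() -> tuple[str, int]:
    adapter = ThreadedAdapter(SyncDriver())
    ticks = 0

    async def ticker() -> None:
        # Counts how often the event loop gets to run during the screenshot
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0.01)

    tick_task = asyncio.create_task(ticker())
    shot = await adapter.screenshot()
    tick_task.cancel()
    return shot, ticks
```

The ticker advancing several times while the screenshot is in flight shows that other coroutines keep running, which a direct blocking call would prevent.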
How do I ensure browser processes are cleaned up after an agent run?
When using the per-request pattern with ComputerProvider, the library automatically calls your dispose callback at the end of the run. For singleton patterns, open the driver with async with so that __aexit__ closes the browser when the block exits. If you manage run contexts manually outside the standard Runner flow, make sure dispose_resolved_computers is called.