# How to Migrate from browser-use to PageAgent for Client-Side Automation

> Migrate from browser-use to PageAgent for client-side automation: update your imports, instantiate PageAgent with your existing config, and keep using the familiar tool API.

- Repository: [Alibaba/page-agent](https://github.com/alibaba/page-agent)
- Tags: migration-guide
- Published: 2026-03-09

---

**Migrate from browser-use to PageAgent by replacing your import with `@page-agent/page-agent`, instantiating `PageAgent` with the same configuration keys, and using the identical tool API that now delegates to the separated `PageController` layer.**

The **alibaba/page-agent** repository provides **PageAgent** as a modern replacement for the legacy browser-use library, introducing a cleaner architecture that isolates LLM-driven logic from DOM manipulation. When you migrate from browser-use to PageAgent, you gain a modular core, an optional UI panel, and the same high-level tool API without breaking existing automation workflows.

## Architecture Overview

PageAgent restructures browser-use into three distinct layers that communicate through async interfaces. This separation allows you to run headless automation or attach the UI panel as needed.

### Core Layer (`PageAgentCore`)

The **Core** (`@page-agent/core`) manages the LLM-orchestrated loop, tool registration, and the high-level `run` API. In [`packages/core/src/tools/index.ts`](https://github.com/alibaba/page-agent/blob/main/packages/core/src/tools/index.ts), you will find the tool definitions that forward requests to the Page-Controller layer. This layer handles the agent's decision-making without direct DOM manipulation.

### Page-Controller Layer

The **Page-Controller** (`@page-agent/page-controller`) handles pure DOM extraction, indexing, and element actions. The `PageController` class in [`packages/page-controller/src/PageController.ts`](https://github.com/alibaba/page-agent/blob/main/packages/page-controller/src/PageController.ts) provides methods like `clickElement()`, `inputText()`, and `scroll()`. Each call to `updateTree()` indexes the page, builds a simplified HTML string, and stores the interactive elements in a `selectorMap`. All actions are async and index-based, as in browser-use.
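
To make the index-based model concrete, here is a minimal, DOM-free sketch of how an indexing pass could assign indices to interactive elements and produce a simplified HTML string plus a lookup map. The `SketchNode` type, the `buildIndex` function, and the `[index]<tag>` output format are assumptions made for illustration; they are not taken from `PageController.ts`.

```typescript
// Hypothetical, simplified stand-in for a DOM node; not a PageAgent type.
interface SketchNode {
  tag: string
  text: string
  interactive: boolean
}

interface IndexedPage {
  // Maps each assigned index to the node it refers to.
  selectorMap: Map<number, SketchNode>
  // One line per interactive element, e.g. "[0]<button>Sign in</button>".
  simplifiedHtml: string
}

// Walk the nodes in document order and index only the interactive ones.
function buildIndex(nodes: SketchNode[]): IndexedPage {
  const selectorMap = new Map<number, SketchNode>()
  const lines: string[] = []
  let nextIndex = 0
  for (const node of nodes) {
    if (!node.interactive) continue
    selectorMap.set(nextIndex, node)
    lines.push(`[${nextIndex}]<${node.tag}>${node.text}</${node.tag}>`)
    nextIndex += 1
  }
  return { selectorMap, simplifiedHtml: lines.join('\n') }
}

const indexedPage = buildIndex([
  { tag: 'h1', text: 'Welcome', interactive: false },
  { tag: 'button', text: 'Sign in', interactive: true },
  { tag: 'input', text: '', interactive: true },
])
console.log(indexedPage.simplifiedHtml)
```

In a model like this, an index is only meaningful for the snapshot that produced it, which is one way to read the guidance to call `updateTree()` before index-based actions.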

### UI Panel Layer

The **UI** (`@page-agent/ui`) provides an optional floating panel that displays the LLM's plan, logs, and a stop button. The `Panel` class in [`packages/ui/src/panel/Panel.ts`](https://github.com/alibaba/page-agent/blob/main/packages/ui/src/panel/Panel.ts) integrates with the core. When instantiating `PageAgent` from [`packages/page-agent/src/PageAgent.ts`](https://github.com/alibaba/page-agent/blob/main/packages/page-agent/src/PageAgent.ts), the panel is created automatically and can be shown via `agent.panel.show()`.

## Step-by-Step Migration Guide

Follow these steps to convert your existing browser-use scripts to PageAgent while maintaining the same automation behavior.

### 1. Update Dependencies and Imports

Replace the browser-use package with the PageAgent monorepo packages. The main entry point is `@page-agent/page-agent` for full UI support, or `@page-agent/core` for headless operation.

```bash
npm install @page-agent/page-agent @page-agent/core @page-agent/page-controller @page-agent/ui
```

Update your import statements:

```typescript
// Before
import BrowserAgent from 'browser-use'

// After - with UI panel
import { PageAgent } from '@page-agent/page-agent'

// Or headless only
import { PageAgentCore } from '@page-agent/core'
```

### 2. Instantiate the Agent with Existing Configuration

Pass the same configuration fields to the new constructor. The `model`, `baseURL`, `apiKey`, `language`, and optional `enableMask` parameters remain compatible.

```typescript
const agent = new PageAgent({
  model: 'qwen3.5-plus',
  baseURL: 'https://my-llm-endpoint',
  apiKey: process.env.API_KEY,  // Never hard-code credentials
  language: 'en-US',
  enableMask: true,  // Optional visual mask during automation
})

// Optional: Show the UI panel
agent.panel.show()
```
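
Before handing a legacy configuration object to the new constructor, it can help to check that the required keys are present. The sketch below is one way to do that; `AgentConfigSketch` and `missingRequiredKeys` are hypothetical names, and the field list is taken from this guide rather than from the package's type definitions.

```typescript
// Field names mirror the configuration keys used in this guide (an assumption,
// not the package's published types).
interface AgentConfigSketch {
  model: string
  baseURL: string
  apiKey: string
  language?: string
  enableMask?: boolean
}

// Hypothetical helper: returns the required keys that are missing or blank.
function missingRequiredKeys(config: Partial<AgentConfigSketch>): string[] {
  const required: (keyof AgentConfigSketch)[] = ['model', 'baseURL', 'apiKey']
  return required.filter((key) => {
    const value = config[key]
    return typeof value !== 'string' || value.trim() === ''
  })
}

// Example: a config carried over from an old script with an empty API key.
const legacyConfig: Partial<AgentConfigSketch> = {
  model: 'qwen3.5-plus',
  baseURL: 'https://my-llm-endpoint',
  apiKey: '',
  language: 'en-US',
}
const missingKeys = missingRequiredKeys(legacyConfig)
console.log(missingKeys)
```

Running a check like this at startup surfaces a missing `apiKey` immediately instead of at the first LLM request.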

### 3. Replace Direct DOM Access

Browser-use exposed internal tree objects directly. PageAgent encapsulates DOM state through `getBrowserState()`. Update any scripts that accessed `window.browserUse` or internal trees:

```typescript
// Refresh the DOM tree before index-based actions
await agent.pageController.updateTree()

// Access current state instead of direct tree manipulation
const state = await agent.pageController.getBrowserState()
console.log(state.header)   // Title and page info
console.log(state.content)  // Simplified interactive HTML
```
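
If many call sites used to read the internal tree directly, the two calls above can be wrapped in one helper so the refresh is never forgotten. This is a sketch only: `readFreshState`, `StateSource`, and `BrowserStateSketch` are hypothetical names, the string types for `header` and `content` are assumed, and an in-memory fake stands in for the real controller so the example runs without a browser.

```typescript
// Structural types covering only the members this guide mentions.
// Treating `header` and `content` as strings is an assumption.
interface BrowserStateSketch {
  header: string
  content: string
}

interface StateSource {
  updateTree(): Promise<unknown>
  getBrowserState(): Promise<BrowserStateSketch>
}

// Hypothetical helper: always re-index before reading state.
async function readFreshState(source: StateSource): Promise<BrowserStateSketch> {
  await source.updateTree()
  return source.getBrowserState()
}

// In-memory fake that records the call order.
const callOrder: string[] = []
const fakeSource: StateSource = {
  async updateTree() {
    callOrder.push('updateTree')
  },
  async getBrowserState() {
    callOrder.push('getBrowserState')
    return { header: 'Example page', content: '[0]<button>OK</button>' }
  },
}

const freshStatePromise = readFreshState(fakeSource)
freshStatePromise.then((state) => console.log(state.header))
```

In a real script, `agent.pageController` would be passed in place of `fakeSource`, provided its methods match the assumed signatures.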

### 4. Verify Tool API Compatibility

The tool names remain identical to browser-use. The [`packages/core/src/tools/index.ts`](https://github.com/alibaba/page-agent/blob/main/packages/core/src/tools/index.ts) file registers tools such as `click_element_by_index`, `input_text`, `scroll`, and `execute_javascript`, which forward to `PageController` methods. No changes are required to your high-level tool calls, though internal implementations now route through [`packages/page-controller/src/actions.ts`](https://github.com/alibaba/page-agent/blob/main/packages/page-controller/src/actions.ts).
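
The forwarding pattern described above can be pictured as a name-to-handler registry. The following is a minimal sketch under stated assumptions, not the contents of `tools/index.ts`: `registerToolsSketch`, `callToolSketch`, the argument shapes, and the `{ message }` result shape are all hypothetical, and only two of the tools are modeled.

```typescript
// Result shape assumed from the `.message` field used in this guide.
interface ToolResultSketch {
  message: string
}

// Two of the controller methods named in this guide; signatures are assumptions.
interface ToolTargetSketch {
  clickElement(index: number): Promise<ToolResultSketch>
  inputText(index: number, text: string): Promise<ToolResultSketch>
}

type ToolHandlerSketch = (args: Record<string, unknown>) => Promise<ToolResultSketch>

// Hypothetical registry: tool names stay stable while handlers forward to the controller.
function registerToolsSketch(target: ToolTargetSketch): Map<string, ToolHandlerSketch> {
  const tools = new Map<string, ToolHandlerSketch>()
  tools.set('click_element_by_index', (args) => target.clickElement(Number(args['index'])))
  tools.set('input_text', (args) =>
    target.inputText(Number(args['index']), String(args['text'])),
  )
  return tools
}

// Look up a tool by name and invoke it; unknown names reject with an error.
async function callToolSketch(
  tools: Map<string, ToolHandlerSketch>,
  toolName: string,
  args: Record<string, unknown>,
): Promise<ToolResultSketch> {
  const handler = tools.get(toolName)
  if (!handler) throw new Error(`Unknown tool: ${toolName}`)
  return handler(args)
}

// In-memory fake controller so the sketch runs without a page.
const fakeTarget: ToolTargetSketch = {
  async clickElement(index) {
    return { message: `clicked ${index}` }
  },
  async inputText(index, text) {
    return { message: `typed "${text}" into ${index}` }
  },
}

const toolRegistry = registerToolsSketch(fakeTarget)
const toolCallPromise = callToolSketch(toolRegistry, 'input_text', { index: 1, text: 'Hello' })
toolCallPromise.then((result) => console.log(result.message))
```

Because callers depend only on the tool name, the handler behind it can move to a different layer without changing call sites, which is the property the migration relies on.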

## Practical Code Examples

These examples demonstrate equivalent implementations between the old and new libraries.

### Basic Click and Input Flow

This pattern replaces direct browser-use element interactions with PageAgent's async controller methods.

```typescript
import { PageAgent } from '@page-agent/page-agent'

const agent = new PageAgent({
  model: 'qwen3.5-plus',
  baseURL: 'https://my-llm.com',
  apiKey: process.env.API_KEY,
  language: 'en-US',
})

agent.panel.show()

// Required: Index the page before element operations
await agent.pageController.updateTree()

// Click first interactive element (index 0)
const clickResult = await agent.pageController.clickElement(0)
console.log(clickResult.message)

// Type into the next input field (index 1)
const inputResult = await agent.pageController.inputText(1, 'Hello PageAgent')
console.log(inputResult.message)

// Scroll down one page
await agent.pageController.scroll({ down: true, numPages: 1 })
```
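
When an action changes the page, previously assigned indices may no longer point at the same elements. One defensive pattern, sketched here with hypothetical names (`runStepsWithReindex`, `ReindexableController`) and assumed method signatures, is to re-index before every step. A fake controller replaces the real one so the example runs without a page.

```typescript
// Result shape assumed from the `.message` field logged in the example above.
interface StepResultSketch {
  message: string
}

// Only the controller members used in the example above; signatures are assumptions.
interface ReindexableController {
  updateTree(): Promise<unknown>
  clickElement(index: number): Promise<StepResultSketch>
  inputText(index: number, text: string): Promise<StepResultSketch>
}

type StepSketch = (controller: ReindexableController) => Promise<StepResultSketch>

// Hypothetical helper: re-index before every step and collect the result messages.
async function runStepsWithReindex(
  controller: ReindexableController,
  steps: StepSketch[],
): Promise<string[]> {
  const messages: string[] = []
  for (const step of steps) {
    await controller.updateTree()
    const result = await step(controller)
    messages.push(result.message)
  }
  return messages
}

// In-memory fake that counts re-index calls.
let reindexCount = 0
const fakeController: ReindexableController = {
  async updateTree() {
    reindexCount += 1
  },
  async clickElement(index) {
    return { message: `clicked element ${index}` }
  },
  async inputText(index, text) {
    return { message: `typed "${text}" into element ${index}` }
  },
}

const stepMessagesPromise = runStepsWithReindex(fakeController, [
  (controller) => controller.clickElement(0),
  (controller) => controller.inputText(1, 'Hello PageAgent'),
])
stepMessagesPromise.then((messages) => console.log(messages.join('\n')))
```

The trade-off is one extra indexing pass per step; for short flows on static pages, a single `updateTree()` call up front may be enough.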

### High-Level LLM Loop

Replace `browserUse.runPrompt()` with `agent.runPrompt()`. The method internally uses the same registered tools mapped in [`packages/core/src/tools/index.ts`](https://github.com/alibaba/page-agent/blob/main/packages/core/src/tools/index.ts).

```typescript
import { PageAgent } from '@page-agent/page-agent'

const agent = new PageAgent({
  model: 'qwen3.5-plus',
  baseURL: 'https://my-llm.com',
  apiKey: process.env.API_KEY,
  language: 'en-US',
})

await agent.runPrompt(`
  Find the search box on the page, type "Page Agent", press Enter,
  then click the first result.
`)
```
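
For codebases with many legacy call sites, a thin adapter can keep the old object shape alive while each call site is migrated. The sketch below is hypothetical: `createLegacyShim` and `PromptRunner` are illustrative names, the return type of `runPrompt` is left as `unknown` because this guide does not specify it, and a fake agent stands in for `PageAgent`.

```typescript
// Minimal structural type: only `runPrompt`, as named in this guide.
interface PromptRunner {
  runPrompt(prompt: string): Promise<unknown>
}

// Hypothetical adapter: forwards prompts unchanged and warns once about the legacy path.
function createLegacyShim(
  agent: PromptRunner,
  warn: (message: string) => void = console.warn,
): PromptRunner {
  let warned = false
  return {
    runPrompt(prompt: string) {
      if (!warned) {
        warned = true
        warn('Legacy runPrompt() call site: switch to agent.runPrompt()')
      }
      return agent.runPrompt(prompt)
    },
  }
}

// Fake agent that records the prompts it receives.
const receivedPrompts: string[] = []
const shimWarnings: string[] = []
const fakeAgent: PromptRunner = {
  async runPrompt(prompt) {
    receivedPrompts.push(prompt)
    return undefined
  },
}

const browserUseShim = createLegacyShim(fakeAgent, (message) => shimWarnings.push(message))
void browserUseShim.runPrompt('Find the search box')
void browserUseShim.runPrompt('Click the first result')
console.log(shimWarnings.length, receivedPrompts.length)
```

Once every call site uses `agent.runPrompt()` directly, the shim can be deleted.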

### Headless CI/CD Automation

For CI/CD runs that do not need the UI panel, import `PageAgentCore` directly from `@page-agent/core`.

```typescript
import { PageAgentCore } from '@page-agent/core'

const core = new PageAgentCore({
  model: 'qwen3.5-plus',
  baseURL: 'https://my-llm.com',
  apiKey: process.env.API_KEY,
})

await core.runPrompt('Navigate to the login page and fill the credentials.')
```

## Summary

- **Install** the `@page-agent/page-agent` entry point along with the layer packages (`core`, `page-controller`, `ui`) from the alibaba/page-agent monorepo.
- **Import** `PageAgent` for UI-enabled automation or `PageAgentCore` for headless operation.
- **Initialize** with the same configuration object (`model`, `baseURL`, `apiKey`, `language`).
- **Update** DOM access patterns to use `pageController.updateTree()` and `getBrowserState()`.
- **Preserve** existing tool names (`click_element_by_index`, `input_text`, `scroll`, `execute_javascript`) as they map 1:1 to the new architecture.
- **Reference** key source files including [`packages/page-controller/src/PageController.ts`](https://github.com/alibaba/page-agent/blob/main/packages/page-controller/src/PageController.ts) for DOM actions and [`packages/core/src/tools/index.ts`](https://github.com/alibaba/page-agent/blob/main/packages/core/src/tools/index.ts) for tool registration.

## Frequently Asked Questions

### Is the PageAgent API backward compatible with browser-use?

Yes, at the level of the tool API. PageAgent deliberately maintains the same tool names (`click_element_by_index`, `input_text`, `scroll`, `execute_javascript`) and high-level methods like `runPrompt()`. The main differences are architectural: PageAgent separates the LLM core from DOM manipulation via `PageController`, whereas browser-use combined these concerns. You can migrate by changing imports, adding an `updateTree()` call before index-based actions, and replacing any direct tree access with `getBrowserState()`.

### Can I run PageAgent without the UI panel?

Absolutely. Import `PageAgentCore` from `@page-agent/core` instead of `PageAgent` from `@page-agent/page-agent`. This headless mode excludes the `Panel` class defined in [`packages/ui/src/panel/Panel.ts`](https://github.com/alibaba/page-agent/blob/main/packages/ui/src/panel/Panel.ts) and is ideal for CI/CD pipelines or other environments where no visual feedback is required.

### How do I access the DOM state in PageAgent?

Call `await agent.pageController.getBrowserState()` after invoking `updateTree()`. This returns an object containing `header` (page title and URL) and `content` (simplified interactive HTML). This replaces browser-use's direct tree object exposure with a cleaner, serializable state snapshot.

### What configuration options are required for migration?

You must provide `model`, `baseURL`, and `apiKey`. Optionally, specify `language` (defaults to browser locale) and `enableMask` (a boolean that toggles the visual mask shown during automation). These match browser-use's configuration schema. Refer to [`packages/page-agent/src/demo.ts`](https://github.com/alibaba/page-agent/blob/main/packages/page-agent/src/demo.ts) for auto-initialization examples and advanced configuration parsing.