# How to Implement Custom Tools to Extend PageAgent Capabilities

> Extend PageAgent capabilities by defining custom tools with Zod schemas. Learn how to integrate your definitions seamlessly into the agent's tool map for enhanced functionality.

- Repository: [Alibaba/page-agent](https://github.com/alibaba/page-agent)
- Tags: how-to-guide
- Published: 2026-03-09

---

**To extend PageAgent, define a `PageAgentTool` with a Zod schema and pass it in the `customTools` configuration object when instantiating the agent. During construction, the agent merges your definitions into its internal tool map.**

The alibaba/page-agent framework executes browser automation through a modular tool system that the LLM invokes via a reflection-before-action model. To go beyond built-in actions like `click_element_by_index` or `execute_javascript`, use the `customTools` configuration API. It lets you add domain-specific functions, override existing behaviors, or remove unnecessary tools without modifying the core agent loop in [`packages/core/src/PageAgentCore.ts`](https://github.com/alibaba/page-agent/blob/main/packages/core/src/PageAgentCore.ts).

## Understanding PageAgent's Tool Architecture

### The Tool Interface

Every tool in PageAgent conforms to the `PageAgentTool<TParams>` interface defined in [`packages/core/src/tools/index.ts`](https://github.com/alibaba/page-agent/blob/main/packages/core/src/tools/index.ts) (lines 13-18). This structure requires three properties:

- **description**: A natural language explanation of what the tool does
- **inputSchema**: A Zod schema that validates and types the tool's parameters
- **execute**: An async method bound to the `PageAgent` instance that performs the actual work

The `tool` helper function exported from `page-agent` (re-exported from core) provides a type-safe way to create these definitions.
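As a rough mental model, the three properties can be sketched without any dependencies. The names `SchemaLike`, `SketchTool`, `defineTool`, and `echoTool` below are hypothetical stand-ins and not the library's exports; the real `PageAgentTool` takes a Zod schema as `inputSchema` and binds `this` to the agent.

```typescript
// Hypothetical stand-in for a Zod schema: anything with a `parse` method
// that returns typed data or throws on invalid input.
interface SchemaLike<T> {
  parse(input: unknown): T
}

// Simplified model of the three-property tool shape described above.
interface SketchTool<TParams> {
  description: string
  inputSchema: SchemaLike<TParams>
  execute(params: TParams): Promise<string>
}

// Identity helper that only exists to infer TParams, similar in spirit to `tool`.
function defineTool<TParams>(def: SketchTool<TParams>): SketchTool<TParams> {
  return def
}

const echoTool = defineTool({
  description: 'Echo a message back to the caller.',
  inputSchema: {
    parse(input: unknown): { message: string } {
      if (typeof input !== 'object' || input === null) {
        throw new Error('Expected an object')
      }
      const message = (input as { message?: unknown }).message
      if (typeof message !== 'string') {
        throw new Error('Expected a string "message"')
      }
      return { message }
    },
  },
  async execute({ message }: { message: string }) {
    return `Echo: ${message}`
  },
})

// Validate first, then execute with the typed result.
echoTool
  .execute(echoTool.inputSchema.parse({ message: 'hi' }))
  .then((out) => console.log(out)) // prints "Echo: hi"
```

Keeping validation (`inputSchema`) separate from behavior (`execute`) means the execute body can assume well-typed parameters.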

### Tool Registration and the Core Map

Inside [`packages/core/src/PageAgentCore.ts`](https://github.com/alibaba/page-agent/blob/main/packages/core/src/PageAgentCore.ts), the constructor initializes a `Map<string, PageAgentTool>` called `tools` (lines 28-30) that holds all available actions. During initialization, the agent clones this map and merges any `customTools` passed via the configuration. The merging logic at lines 31-40 handles both additions and removals:

```typescript
// packages/core/src/PageAgentCore.ts
if (this.config.customTools) {
    for (const [name, tool] of Object.entries(this.config.customTools)) {
        if (tool === null) {
            this.tools.delete(name)          // remove built‑in tool
            continue
        }
        this.tools.set(name, tool)          // add or override tool
    }
}
```
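To see the three outcomes (add, override, remove) in isolation, the merge can be modeled as a standalone function. This is a simplified sketch: `MiniTool` and `mergeTools` are hypothetical names, and the real constructor works on full `PageAgentTool` objects rather than this reduced shape.

```typescript
// Hypothetical minimal tool shape; the real PageAgentTool carries more fields.
type MiniTool = { description: string }

// Simplified model of the merge: start from a copy of the built-ins,
// then apply each customTools entry (null deletes, anything else sets).
function mergeTools(
  builtIns: Map<string, MiniTool>,
  customTools: Record<string, MiniTool | null>,
): Map<string, MiniTool> {
  const tools = new Map(builtIns) // shallow clone so the built-ins stay untouched
  for (const [name, tool] of Object.entries(customTools)) {
    if (tool === null) {
      tools.delete(name)
      continue
    }
    tools.set(name, tool)
  }
  return tools
}

const builtIns = new Map<string, MiniTool>([
  ['wait', { description: 'built-in wait' }],
  ['ask_user', { description: 'built-in ask_user' }],
])

const merged = mergeTools(builtIns, {
  ask_user: null, // removed
  wait: { description: 'custom wait' }, // overridden
  fetch_and_summarize: { description: 'new tool' }, // added
})

console.log(Array.from(merged.keys())) // [ 'wait', 'fetch_and_summarize' ]
```

Because the merge writes into a copy, the original built-in map still contains both `wait` and `ask_user` afterwards.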

## Creating a Custom Tool Definition

To implement a custom tool, import the `tool` helper and define your function using a Zod schema for input validation. The `execute` method receives the validated parameters and has access to the agent instance via `this`.

```typescript
// src/customTools.ts
import { tool } from 'page-agent'
import { z } from 'zod/v4'
import type { PageAgent } from 'page-agent'

export const customTools = {
  fetch_and_summarize: tool({
    description:
      'Fetch a JSON endpoint and return a brief summary of the data.',
    inputSchema: z.object({
      url: z.string().url(),
      maxItems: z.number().int().min(1).max(20).default(5),
    }),
    // `this` is bound to the PageAgent instance
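    async execute(this: PageAgent, { url, maxItems }) {
      const response = await fetch(url)
      // Report HTTP failures to the LLM instead of parsing an error page
      if (!response.ok) {
        return `❌ Request to ${url} failed with status ${response.status}.`
      }
      const data: unknown = await response.json()
      const items = Array.isArray(data) ? data.slice(0, maxItems) : [data]
      // Include a truncated preview so the LLM has content to summarize
      const preview = JSON.stringify(items).slice(0, 500)
      return `✅ Fetched ${items.length} item(s) from ${url}: ${preview}`
    },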
  }),
}
```

This definition follows the `PageAgentTool` type from [`packages/core/src/tools/index.ts`](https://github.com/alibaba/page-agent/blob/main/packages/core/src/tools/index.ts) and satisfies the `customTools` property declared in [`packages/core/src/types.ts`](https://github.com/alibaba/page-agent/blob/main/packages/core/src/types.ts) (lines 20-49).

## Registering Custom Tools via Configuration

Pass your custom tool definitions through the `AgentConfig` interface when constructing a `PageAgent` or `PageAgentCore` instance. The high-level `PageAgent` class in [`packages/page-agent/src/PageAgent.ts`](https://github.com/alibaba/page-agent/blob/main/packages/page-agent/src/PageAgent.ts) extends `PageAgentCore` and accepts the same configuration options.

```typescript
// src/runAgent.ts
import { PageAgent } from 'page-agent'
import { customTools } from './customTools'

const agent = new PageAgent({
  llmConfig: {
    model: 'gpt-4o-mini',
    apiKey: process.env.OPENAI_API_KEY!,
  },
  // Register the custom tools
  customTools,
  // Optional: enable experimental features if needed
  experimentalScriptExecutionTool: true,
})

// Execute a task that uses the new tool
await agent.execute(`
  Please fetch the latest 3 posts from https://jsonplaceholder.typicode.com/posts
  and summarize their titles.
`)
```

## Removing or Overriding Built-in Tools

The `customTools` object supports two additional operations beyond adding new capabilities. To remove a built-in tool, assign `null` to its name. To override a built-in tool, provide a new definition using the same key.

```typescript
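import { tool } from 'page-agent'
import { z } from 'zod/v4'
import type { PageAgent } from 'page-agent'

export const customTools = {
  // Add new capability
  fetch_and_summarize: tool({ /* ... */ }),

  // Remove built-in ask_user tool
  ask_user: null,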
  
  // Override the default wait behavior
  wait: tool({
    description: 'Wait for a specified duration with custom logging',
    inputSchema: z.object({ seconds: z.number() }),
    async execute(this: PageAgent, { seconds }) {
      console.log(`Custom wait: ${seconds}s`)
      await new Promise(r => setTimeout(r, seconds * 1000))
      return `Waited ${seconds} seconds`
    },
  }),
}
```

## Runtime Execution and LLM Integration

During each step, the LLM calls a macro-tool (`AgentOutput`) that selects an action from the final merged `tools` map. The `#packMacroTool` method (lines 80-86 in [`packages/core/src/PageAgentCore.ts`](https://github.com/alibaba/page-agent/blob/main/packages/core/src/PageAgentCore.ts)) reads the tool name from the `action` object and dispatches to your concrete tool's `execute` method.

When the LLM decides to use your custom tool, it outputs JSON like:

```json
{
  "action": {
    "fetch_and_summarize": {
      "url": "https://jsonplaceholder.typicode.com/posts",
      "maxItems": 3
    }
  },
  "evaluation_previous_goal": "Previous step completed",
  "memory": "Need to fetch posts",
  "next_goal": "Analyze the fetched data"
}
```

The core extracts `fetch_and_summarize`, validates inputs against your Zod schema, and executes your method, returning the string result to the LLM context.
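The extract, validate, execute sequence can be sketched as a small dispatcher. This is an illustration under assumptions, not the code in `#packMacroTool`: `DispatchTool` and `dispatch` are hypothetical names, and a hand-written `parse` method stands in for Zod validation.

```typescript
// Hypothetical simplified tool: `parse` stands in for Zod validation.
type DispatchTool = {
  parse(input: unknown): unknown
  execute(params: unknown): Promise<string>
}

// The single key of `action` names the tool; its value holds the raw params.
async function dispatch(
  tools: Map<string, DispatchTool>,
  output: { action: Record<string, unknown> },
): Promise<string> {
  const entries = Object.entries(output.action)
  const first = entries[0]
  if (entries.length !== 1 || first === undefined) {
    throw new Error('Expected exactly one tool in "action"')
  }
  const [toolName, rawParams] = first
  const tool = tools.get(toolName)
  if (!tool) {
    throw new Error(`Unknown tool: ${toolName}`)
  }
  const params = tool.parse(rawParams) // throws if the LLM output is invalid
  return tool.execute(params)
}

const tools = new Map<string, DispatchTool>()
tools.set('fetch_and_summarize', {
  parse(input: unknown) {
    if (typeof input !== 'object' || input === null) {
      throw new Error('Expected an object')
    }
    const { url, maxItems = 5 } = input as { url?: unknown; maxItems?: unknown }
    if (typeof url !== 'string' || typeof maxItems !== 'number') {
      throw new Error('Invalid input')
    }
    return { url, maxItems }
  },
  async execute(params: unknown) {
    const { url, maxItems } = params as { url: string; maxItems: number }
    // No network call here; the sketch only shows the control flow.
    return `Would fetch up to ${maxItems} item(s) from ${url}`
  },
})

dispatch(tools, {
  action: {
    fetch_and_summarize: {
      url: 'https://jsonplaceholder.typicode.com/posts',
      maxItems: 3,
    },
  },
}).then((result) => console.log(result))
```

Validating before executing means a malformed LLM response fails at the schema boundary rather than inside your tool logic.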

## Summary

- **Tool Definition**: Create `PageAgentTool` objects using the `tool` helper and a Zod schema; the interface is defined in [`packages/core/src/tools/index.ts`](https://github.com/alibaba/page-agent/blob/main/packages/core/src/tools/index.ts)
- **Configuration**: Pass tools via `customTools` in `AgentConfig` when instantiating `PageAgent` or `PageAgentCore`
- **Merging Logic**: The constructor in [`packages/core/src/PageAgentCore.ts`](https://github.com/alibaba/page-agent/blob/main/packages/core/src/PageAgentCore.ts) (lines 31-40) clones the internal map and merges your definitions, supporting addition, override, and deletion via `null`
- **Execution**: Custom tools are immediately available to the LLM through the `AgentOutput` macro-tool dispatch system
- **Context Access**: Tool `execute` methods are bound to the agent instance, providing access to the browser page and agent state via `this`

## Frequently Asked Questions

### Can I override existing built-in tools like `click_element_by_index`?

Yes. If you provide a tool definition in `customTools` using the same name as a built-in tool, your definition replaces the original in the internal `tools` Map. This happens during the merge loop in [`packages/core/src/PageAgentCore.ts`](https://github.com/alibaba/page-agent/blob/main/packages/core/src/PageAgentCore.ts), where `this.tools.set(name, tool)` overwrites any existing entry.

### What schema validation library does PageAgent require for tool inputs?

PageAgent uses **Zod** (specifically version 4 as `zod/v4`) for input validation. The `inputSchema` property of `PageAgentTool` expects a Zod schema object. The agent validates LLM outputs against this schema before passing them to your `execute` method.

### Do custom tools have access to the browser page and agent state?

Yes. The `execute` function is bound to the `PageAgent` instance at runtime, so `this` refers to the agent itself. The properties available on `this` (for example, a handle to the page being automated, such as `this.page`) depend on the agent version, so confirm them against the `PageAgent` class definition before relying on a specific member in your custom tool logic.

### How do I remove a tool so the LLM cannot use it?

Set the tool name to `null` in your `customTools` configuration object. During construction, `PageAgentCore` checks for `null` values and calls `this.tools.delete(name)`, effectively removing that capability from the agent's available actions before any task execution begins.
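When several tools need to be disabled at once, a small helper can build the `null` entries for you. The `disableTools` function below is a hypothetical convenience, not part of the library; it only produces a plain object in the shape that `customTools` accepts for removals.

```typescript
// Build a customTools fragment that removes every listed tool.
function disableTools(names: readonly string[]): Record<string, null> {
  const removals: Record<string, null> = {}
  for (const name of names) {
    removals[name] = null // null marks the tool for deletion during the merge
  }
  return removals
}

const removed = disableTools(['ask_user', 'execute_javascript'])
console.log(removed) // { ask_user: null, execute_javascript: null }
```

The result can be spread into your configuration alongside your own definitions, for example `customTools: { ...customTools, ...removed }`.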