How to Create a Headless PageAgentCore Instance Without the UI Panel

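To create a headless PageAgentCore instance without the UI panel, import the core class directly from @page-agent/core, instantiate a PageController with enableMask: false, and pass it to the PageAgentCore constructor. This skips the UI panel entirely. The standard PageController still needs a DOM, so pure Node.js runs require a custom controller, as described later in this article.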

The alibaba/page-agent repository separates its LLM automation engine from the visual interface, allowing you to create a headless PageAgentCore instance for server-side scripts, automated testing, or any environment where browser UI rendering is unnecessary or impossible.

Understanding the Architecture

The codebase splits responsibilities across three distinct components. PageAgentCore (located in packages/core/src/PageAgentCore.ts) contains the LLM reasoning loop, tool dispatch system, observation handling, and conversation history management. The class definition begins at line 60, and its constructor at line 64 receives a pageController parameter that abstracts all DOM interactions.

PageAgent (found in packages/page-agent/src/PageAgent.ts) serves strictly as a convenience wrapper. Its constructor (lines 16-24) instantiates the default controller and attaches the visual Panel UI—this is the only location where UI code enters the system. PageController (in packages/page-controller/src/PageController.ts starting at line 59) provides asynchronous DOM queries and actions; the visual mask is optional and controlled via configuration at lines 97-104.
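
The split described above amounts to constructor injection. The following sketch illustrates only that shape. SketchController, SketchCore, SketchPanel, and SketchAgent are hypothetical stand-ins rather than the real classes, and the real wrapper may subclass the core instead of composing it.

```typescript
// Hypothetical stand-ins for illustration; not the real page-agent classes.
interface SketchController {
  describePage(): string;
}

// Plays the role of the core: it only talks to the injected controller.
class SketchCore {
  private readonly controller: SketchController;

  constructor(controller: SketchController) {
    this.controller = controller;
  }

  run(task: string): string {
    return `${task} -> ${this.controller.describePage()}`;
  }
}

// Plays the role of the UI panel.
class SketchPanel {
  readonly mounted: boolean = true;
}

// Plays the role of the wrapper: a default controller plus a panel.
class SketchAgent {
  readonly core: SketchCore;
  readonly panel: SketchPanel;

  constructor() {
    this.core = new SketchCore({ describePage: () => 'default controller' });
    this.panel = new SketchPanel();
  }
}

// Headless use: build the core directly, so no panel is ever constructed.
const headlessCore = new SketchCore({ describePage: () => 'custom controller' });
const wrapped = new SketchAgent();
```

Because the core in this sketch never references the panel, removing the wrapper removes the UI without touching the task logic.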

Creating a Headless Instance

To run the agent programmatically, bypass the PageAgent wrapper and construct the core components manually.

Step 1: Configure the PageController

Import the controller from @page-agent/page-controller and disable the visual mask, since a headless run has no use for the highlight overlay.

import { PageController, type PageControllerConfig } from '@page-agent/page-controller';

const controllerConfig: PageControllerConfig = {
  enableMask: false,  // Disables the visual highlight overlay
};

const pageController = new PageController(controllerConfig);

Step 2: Instantiate PageAgentCore

Import PageAgentCore from @page-agent/core and supply the configured controller along with your LLM configuration.

import { PageAgentCore, type AgentConfig } from '@page-agent/core';

const agentConfig: AgentConfig = {
  llmConfig: { model: 'gpt-4o-mini' },
  onBeforeTask: async (agent) => console.log('Task started'),
  onAfterTask: async (agent, result) => console.log('Task finished', result),
};

const agent = new PageAgentCore({
  ...agentConfig,
  pageController,  // Mandatory dependency injected here
});

Step 3: Execute Tasks Programmatically

Invoke the execute method to run the automation loop. This uses the same engine as the UI version but never creates a panel.

(async () => {
  const result = await agent.execute('Extract the titles of all articles on the page');
  console.log('Success:', result.success);
  console.log('Data:', result.data);
})();
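
In unattended scripts it can help to bound how long a task may run and to turn thrown errors into a failed result. The sketch below is one way to do that, not part of the library: TaskResult and TaskRunner are hypothetical minimal shapes modeled on the success and data fields used above, and withTimeout, runTask, and stubRunner are illustrative helpers. A real agent would be passed in place of stubRunner, assuming its execute method matches this shape.

```typescript
// Hypothetical minimal shapes; the real result type may carry more fields.
interface TaskResult {
  success: boolean;
  data: string;
}

interface TaskRunner {
  execute(task: string): Promise<TaskResult>;
}

// Rejects if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  // Clear the timer either way so it does not keep the process alive.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

// Runs a task and converts timeouts or thrown errors into a failed result.
async function runTask(runner: TaskRunner, task: string, ms: number): Promise<TaskResult> {
  try {
    return await withTimeout(runner.execute(task), ms);
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    return { success: false, data: message };
  }
}

// Stub used here so the sketch runs without an LLM or a browser.
const stubRunner: TaskRunner = {
  execute: async (task) => ({ success: true, data: `done: ${task}` }),
};
```

Note that Promise.race only stops waiting; it does not cancel the underlying task, so any cleanup the agent needs would still have to be triggered separately.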

Using Custom Controllers for Non-Browser Environments

For pure Node.js environments where window and document are unavailable, provide a custom implementation satisfying the PageController interface. The core requires only the methods it calls, not the full browser API.

import { PageAgentCore, type AgentConfig, type BrowserState } from '@page-agent/core';
import type { PageController } from '@page-agent/page-controller';

class MockController implements PageController {
  async getBrowserState(): Promise<BrowserState> {
    return {
      url: 'https://example.com',
      title: 'Example',
      header: 'Mock Header',
      content: '<button>Click me</button>',
      footer: 'Mock Footer',
    };
  }
  async showMask() {}
  async hideMask() {}
  async updateTree() { return '<button>Click me</button>'; }
  async clickElement(index: number) {
    console.log('Clicked', index);
    return { success: true, message: '' };
  }
  async inputText() { return { success: true, message: '' }; }
  async selectOption() { return { success: true, message: '' }; }
  async scroll() { return { success: true, message: '' }; }
  async scrollHorizontally() { return { success: true, message: '' }; }
  async executeJavascript() { return { success: true, message: '' }; }
  async cleanUpHighlights() {}
  dispose() {}
}

const mockController = new MockController();

const agent = new PageAgentCore({
  pageController: mockController as any,
  llmConfig: { model: 'gpt-4o-mini' },
});

(async () => {
  const result = await agent.execute('Click the button');
  console.log(result);
})();

This demonstrates that PageAgentCore depends solely on the controller contract defined in packages/page-controller/src/PageController.ts, not on the UI panel or browser globals.
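
For automated tests, a mock controller becomes more useful when it records the calls it receives, so a test can assert which actions were dispatched. The sketch below shows that recording pattern in isolation. ActionResult and ClickOnlyController are hypothetical, trimmed-down shapes based on the mock above, and fakeAgentStep merely stands in for whatever the agent would do with the controller.

```typescript
// Hypothetical, trimmed-down shapes based on the mock above.
interface ActionResult {
  success: boolean;
  message: string;
}

interface ClickOnlyController {
  clickElement(index: number): Promise<ActionResult>;
}

// Records every call so a test can inspect what was dispatched.
class RecordingController implements ClickOnlyController {
  readonly calls: string[] = [];

  async clickElement(index: number): Promise<ActionResult> {
    this.calls.push(`clickElement(${index})`);
    return { success: true, message: '' };
  }
}

// Stand-in for agent behavior: clicks the element at the given index.
async function fakeAgentStep(
  controller: ClickOnlyController,
  index: number,
): Promise<ActionResult> {
  return controller.clickElement(index);
}

// Example test flow: run a step, then return the recorded calls for inspection.
async function demo(): Promise<string[]> {
  const controller = new RecordingController();
  await fakeAgentStep(controller, 0);
  return controller.calls;
}
```

The same idea extends to the other controller methods: push a short description of each call into the calls array and assert on it after the task finishes.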

Frequently Asked Questions

Does PageAgentCore require a browser environment?

No. While the standard PageController implementation in packages/page-controller/src/PageController.ts interacts with the DOM, PageAgentCore only requires an object satisfying the PageController interface. You can provide a custom implementation for Node.js or testing environments that mocks browser state, as the core automation logic has no direct dependency on window or document.

What is the difference between PageAgent and PageAgentCore?

PageAgentCore (in packages/core/src/PageAgentCore.ts) is the engine handling LLM loops, tool execution, and history management. PageAgent (in packages/page-agent/src/PageAgent.ts) is a high-level wrapper that instantiates the core, creates a default PageController, and attaches the UI panel. To create a headless PageAgentCore instance, import from @page-agent/core directly and skip the wrapper.

How do I disable the visual mask in headless mode?

Pass enableMask: false in the PageControllerConfig when instantiating PageController. According to the source at lines 97-104 of packages/page-controller/src/PageController.ts, this boolean flag prevents the controller from creating or updating the visual highlight overlay, which is unnecessary when running without a UI panel.

Can I use PageAgentCore in Node.js scripts?

Yes. By importing @page-agent/core and providing a controller implementation (either the standard one connected to a browser instance or a custom mock), you can instantiate PageAgentCore in Node.js. The constructor at line 64 of packages/core/src/PageAgentCore.ts accepts a configuration object containing your controller and LLM settings, enabling fully programmatic automation without browser UI overhead.
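
A script that may run both in a browser and in Node.js can decide at startup which kind of controller to build. The sketch below shows one possible check. hasDom, pickControllerKind, and ControllerKind are hypothetical names introduced for illustration, and the actual construction of either controller is left out.

```typescript
type ControllerKind = 'dom' | 'mock';

// True when both browser globals are present on the global object.
function hasDom(): boolean {
  return 'window' in globalThis && 'document' in globalThis;
}

// Chooses the DOM-backed controller when a DOM exists, otherwise a mock.
// The flag can be passed explicitly, which keeps the choice testable.
function pickControllerKind(domAvailable: boolean = hasDom()): ControllerKind {
  return domAvailable ? 'dom' : 'mock';
}

console.log('Controller kind for this environment:', pickControllerKind());
```

Passing the flag as a parameter rather than reading globals inside the function makes both branches easy to exercise from a single environment.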
