# How PageAgentCore's Reflection-Before-Action Agent Loop Works Internally

> Discover how PageAgentCore's reflection-before-action agent loop enhances LLM performance. Learn how it evaluates steps, maintains memory, and defines goals before browser actions.

- Repository: [Alibaba/page-agent](https://github.com/alibaba/page-agent)
- Tags: internals
- Published: 2026-03-09

---

**PageAgentCore implements a ReAct-style (reason, then act) loop that requires the LLM to evaluate the previous step, maintain short-term memory, and define the next goal before executing any browser action, using a unified MacroTool that bundles reflection fields with tool selection.**

The **alibaba/page-agent** repository provides a browser automation framework built around an LLM-driven agent. At its core is `PageAgentCore`, which implements a **reflection-before-action** pattern: before every browser interaction, the LLM assesses its progress and updates its strategy. This discourages aimless action sequences and lets the agent adapt as the page changes.

## The Four-Phase Execution Cycle

The `execute` method in [`packages/core/src/PageAgentCore.ts`](https://github.com/alibaba/page-agent/blob/main/packages/core/src/PageAgentCore.ts) drives the main loop (lines **[31‑44](https://github.com/alibaba/page-agent/blob/main/packages/core/src/PageAgentCore.ts#L31-L44)**). Each iteration follows a strict sequence: observe the environment, assemble context, invoke the LLM with a structured MacroTool, and dispatch the resulting action.
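The four phases can be sketched as a small, dependency-free loop. Everything in this block is hypothetical (the `Decision`, `Step`, and `LoopResult` types, `runLoop`, and the stubbed callbacks); the real `execute` method is asynchronous and calls an LLM, whereas this stub is synchronous to keep the control flow visible.

```typescript
// Hypothetical types for illustration; not the library's actual definitions.
interface Decision {
  evaluation_previous_goal?: string;
  memory?: string;
  next_goal?: string;
  action: { [toolName: string]: unknown };
}

interface Step {
  reflection: { evaluation_previous_goal?: string; memory?: string; next_goal?: string };
  action: { name: string; input: unknown };
}

interface LoopResult {
  success: boolean;
  history: Step[];
}

// `observe` and `decide` stand in for DOM capture and the forced MacroTool call.
function runLoop(
  observe: (step: number) => string,
  decide: (observation: string, history: Step[]) => Decision,
  maxSteps: number
): LoopResult {
  const history: Step[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const observation = observe(step);             // 1. observe
    const decision = decide(observation, history); // 2. assemble context, call the model
    const name = Object.keys(decision.action)[0];  // 3. extract reflection and action
    if (name === undefined) {
      return { success: false, history };
    }
    history.push({
      reflection: {
        evaluation_previous_goal: decision.evaluation_previous_goal,
        memory: decision.memory,
        next_goal: decision.next_goal,
      },
      action: { name, input: decision.action[name] },
    });
    if (name === 'done') {                         // 4. loop control
      return { success: true, history };
    }
  }
  return { success: false, history };              // step budget exhausted
}

// Stubbed run: click once, then finish.
const demo = runLoop(
  (step) => `page state at step ${step}`,
  (_observation, history) =>
    history.length === 0
      ? { next_goal: 'open pricing', action: { click: { index: 3 } } }
      : { evaluation_previous_goal: 'clicked', action: { done: { text: 'ok' } } },
  5
);
console.log(demo.success, demo.history.length);
```

The stub finishes on its second step because the fake `decide` callback returns `done` once the history is non-empty; with a callback that never returns `done`, the loop stops at `maxSteps` and reports failure.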

### 1. Observation Collection

Before any reasoning occurs, the agent captures the current browser state. The `#handleObservations` method (lines **[10‑46](https://github.com/alibaba/page-agent/blob/main/packages/core/src/PageAgentCore.ts#L10-L46)**) collects DOM snapshots, navigation warnings, and custom observations, and appends them to an internal `#observations` buffer. These observations provide the factual grounding for the LLM's subsequent reflection.
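A collect-then-flush buffer of this kind might look roughly like the sketch below. The `ObservationBuffer` class, the `Observation` shape, and the `collect` helper are assumptions made for illustration; the actual `#observations` field and its entry format may differ.

```typescript
// Hypothetical entry shape; the real buffer may store plain strings or richer objects.
interface Observation {
  kind: 'dom' | 'navigation' | 'custom';
  content: string;
}

class ObservationBuffer {
  private pending: Observation[] = [];

  push(kind: Observation['kind'], content: string): void {
    this.pending.push({ kind, content });
  }

  // Return everything collected so far and reset the buffer for the next step.
  flush(): Observation[] {
    const drained = this.pending;
    this.pending = [];
    return drained;
  }
}

const buffer = new ObservationBuffer();
let lastUrl = 'https://example.com/';

// Called once per step, before the model is invoked.
function collect(currentUrl: string, domSnapshot: string): Observation[] {
  if (currentUrl !== lastUrl) {
    // A URL change since the previous step becomes a navigation warning.
    buffer.push('navigation', `Page navigated to ${currentUrl}`);
    lastUrl = currentUrl;
  }
  buffer.push('dom', domSnapshot);
  return buffer.flush();
}

const first = collect('https://example.com/', '[0]<a>Pricing</a>');
const second = collect('https://example.com/pricing', '[0]<h1>Plans</h1>');
console.log(first.length, second.length);
```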

### 2. Prompt Assembly and LLM Invocation

With observations gathered, the agent constructs the message payload:

```typescript
// Two-message payload: a system prompt plus a user prompt rebuilt on every step
const messages = [
  { role: 'system', content: this.#getSystemPrompt() },
  { role: 'user', content: await this.#assembleUserPrompt() },
];
```

The LLM is then invoked with a forced tool call to the **MacroTool** (lines **[54‑63](https://github.com/alibaba/page-agent/blob/main/packages/core/src/PageAgentCore.ts#L54-L63)**). Unlike open-ended generation, this pattern requires the model to output a structured object containing both its internal reasoning and the selected action.
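On an OpenAI-compatible chat completions endpoint, a forced call to one named tool is usually expressed through `tool_choice`. The following sketch builds such a request body. The tool name `AgentOutput`, the model name, and the `buildRequest` helper are assumptions for illustration, not values taken from the repository.

```typescript
interface ChatMessage {
  role: 'system' | 'user';
  content: string;
}

interface ToolDefinition {
  type: 'function';
  function: { name: string; description: string; parameters: object };
}

// Hypothetical helper: assembles a chat completions request that must call `macroTool`.
function buildRequest(model: string, messages: ChatMessage[], macroTool: ToolDefinition) {
  return {
    model,
    messages,
    tools: [macroTool],
    // Naming the function forces a call to exactly this tool.
    // `tool_choice: 'required'` forces *some* tool call instead, which has the
    // same effect when only one tool is offered.
    tool_choice: { type: 'function' as const, function: { name: macroTool.function.name } },
  };
}

const request = buildRequest(
  'example-model',
  [
    { role: 'system', content: 'You are a browser agent.' },
    { role: 'user', content: 'Task and current page state go here.' },
  ],
  {
    type: 'function',
    function: {
      name: 'AgentOutput',
      description: 'Reflect, then choose exactly one action.',
      parameters: { type: 'object', properties: {}, required: ['action'] },
    },
  }
);
console.log(JSON.stringify(request.tool_choice));
```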

### 3. Reflection Extraction

The LLM response is parsed into a `MacroToolResult` containing the reflection data and action choice. The agent extracts three key reflection fields (lines **[68‑74](https://github.com/alibaba/page-agent/blob/main/packages/core/src/PageAgentCore.ts#L68-L74)**):

```typescript
// Copy only the reflection fields; any of them may be absent, hence Partial
const reflection: Partial<AgentReflection> = {
  evaluation_previous_goal: input.evaluation_previous_goal,
  memory: input.memory,
  next_goal: input.next_goal,
};
```

Simultaneously, it identifies the concrete action to execute by extracting the first key from the `action` object (lines **[73‑78](https://github.com/alibaba/page-agent/blob/main/packages/core/src/PageAgentCore.ts#L73-L78)**):

```typescript
// The action object is expected to have a single key: the name of the chosen tool
const actionName = Object.keys(input.action)[0];
const action = {
  name: actionName,
  input: input.action[actionName],
  output,
};
```
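Because the action name is read from the first key, an empty or multi-key `action` object would be misread silently. A defensive variant could check the key count first. The `extractAction` helper below is hypothetical and not part of the repository.

```typescript
interface ExtractedAction {
  name: string;
  input: unknown;
}

// Hypothetical guard: insist on exactly one key before treating it as the tool name.
function extractAction(action: { [toolName: string]: unknown }): ExtractedAction {
  const names = Object.keys(action);
  if (names.length !== 1) {
    throw new Error(`Expected exactly one action, got ${names.length}`);
  }
  const name = names[0] as string;
  return { name, input: action[name] };
}

const extracted = extractAction({ click: { index: 2 } });
console.log(extracted.name, JSON.stringify(extracted.input));
```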

### 4. Action Execution and Loop Control

The step is recorded in `this.history` with full context, creating a persistent memory trail. If `actionName` equals `'done'`, the loop terminates and returns the final `ExecutionResult` (lines **[99‑105](https://github.com/alibaba/page-agent/blob/main/packages/core/src/PageAgentCore.ts#L99-L105)**). Otherwise, the step counter increments and the cycle repeats until the task is complete or the maximum step limit is reached.

## MacroTool Schema Design

The reflection-before-action constraint is enforced structurally through the `#packMacroTool` method (lines **[62‑77](https://github.com/alibaba/page-agent/blob/main/packages/core/src/PageAgentCore.ts#L62-L77)**). This factory creates a Zod schema that the LLM must populate on every turn.

### Reflection Fields Structure

The MacroTool's input schema declares three reflection fields, each marked optional, plus a mandatory `action` (lines **[68‑73](https://github.com/alibaba/page-agent/blob/main/packages/core/src/PageAgentCore.ts#L68-L73)**):

- **`evaluation_previous_goal`** – LLM's critique of whether the previous step succeeded
- **`memory`** – Short-term notes to retain across steps
- **`next_goal`** – Specific objective for the immediate next action
- **`action`** – A union of single-key objects, one per registered tool (click, input_text, wait, etc.), where the key names the tool and the value is its input

This design places the model's reasoning and its chosen action in one structured tool call, so the reflection is emitted alongside, and ordered ahead of, any effectful browser tool.
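The shape described above can be approximated with a plain TypeScript type and a hand-written runtime check. This is a dependency-free sketch for illustration only: the repository uses Zod, and the names `MacroToolInput` and `isMacroToolInput` are hypothetical.

```typescript
// Hypothetical mirror of the MacroTool input: optional reflection, mandatory action.
interface MacroToolInput {
  evaluation_previous_goal?: string;
  memory?: string;
  next_goal?: string;
  action: { [toolName: string]: unknown };
}

function isOptionalString(value: unknown): boolean {
  return value === undefined || typeof value === 'string';
}

// Accepts only objects whose reflection fields are strings (or absent)
// and whose `action` is an object with exactly one key.
function isMacroToolInput(value: unknown): value is MacroToolInput {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as { [key: string]: unknown };
  const action = v['action'];
  return (
    isOptionalString(v['evaluation_previous_goal']) &&
    isOptionalString(v['memory']) &&
    isOptionalString(v['next_goal']) &&
    typeof action === 'object' &&
    action !== null &&
    !Array.isArray(action) &&
    Object.keys(action).length === 1
  );
}

const raw =
  '{"memory":"On pricing page","next_goal":"Read Enterprise price","action":{"click":{"index":4}}}';
console.log(isMacroToolInput(JSON.parse(raw)));
```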

### Dynamic Action Union

The schema dynamically aggregates available tools using Zod unions:

```typescript
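// Wrap each registered tool as a single-key object schema: { [toolName]: input }
const actionSchemas = Array.from(tools.entries()).map(([toolName, tool]) => {
  return z.object({ [toolName]: tool.inputSchema }).describe(tool.description);
});
// z.union expects a tuple type, hence the cast (its type argument is elided in this excerpt)
const actionSchema = z.union(actionSchemas as [...]);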
```

This ensures the LLM can only select from registered, valid actions while maintaining type safety.
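In JSON Schema terms, which is roughly what a model provider receives once a Zod union has been converted, the result is one single-key object per tool under `anyOf`. The sketch below builds that structure by hand from a hypothetical registry; the tool names, their input schemas, and the `buildActionSchema` helper are illustrative assumptions.

```typescript
interface ToolSpec {
  description: string;
  inputSchema: object; // JSON Schema for the tool's arguments
}

// Hypothetical helper: one { [toolName]: input } variant per registered tool.
function buildActionSchema(tools: { [name: string]: ToolSpec }) {
  const variants = Object.keys(tools).map((name) => {
    const tool = tools[name] as ToolSpec;
    const properties: { [key: string]: object } = {};
    properties[name] = tool.inputSchema;
    return {
      type: 'object' as const,
      description: tool.description,
      properties,
      required: [name],
      additionalProperties: false,
    };
  });
  return { anyOf: variants };
}

const actionSchema = buildActionSchema({
  click: {
    description: 'Click an element by index',
    inputSchema: { type: 'object', properties: { index: { type: 'integer' } }, required: ['index'] },
  },
  done: {
    description: 'Finish the task',
    inputSchema: { type: 'object', properties: { text: { type: 'string' } }, required: ['text'] },
  },
});
console.log(JSON.stringify(actionSchema, null, 2));
```

Registering another tool adds another variant without touching the surrounding schema, which is the property the dynamic union provides.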

## Tool Execution and Event Emission

When the MacroTool's `execute` function runs, it performs the actual browser automation. The process (lines **[84‑90](https://github.com/alibaba/page-agent/blob/main/packages/core/src/PageAgentCore.ts#L84-L90)**) involves:

1. **Tool Lookup** – Retrieving the concrete implementation from the `tools` registry
2. **Context Binding** – Executing the tool with `tool.execute.bind(this)`, granting access to `this.pageController`
3. **Activity Emission** – Firing `executing` and `executed` events for observability (lines **[98‑106](https://github.com/alibaba/page-agent/blob/main/packages/core/src/PageAgentCore.ts#L98-L106)**)

The tool's output is captured and returned as part of the `MacroToolResult`, which populates `action.output` in the history entry for the next iteration's context.
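The lookup, binding, and event steps above can be sketched as follows. The `AgentContext`, `Tool`, and `Activity` types and the `runTool` function are hypothetical, and the tool is synchronous here for brevity, whereas real browser tools would be asynchronous.

```typescript
interface AgentContext {
  pageController: { click(index: number): string };
}

interface Tool {
  description: string;
  // `this` is the agent, so a tool can reach the page controller.
  execute(this: AgentContext, input: { index: number }): string;
}

type Activity = { type: 'executing' | 'executed'; tool: string; output?: string };

function runTool(
  agent: AgentContext,
  tools: { [name: string]: Tool },
  name: string,
  input: { index: number },
  emit: (activity: Activity) => void
): string {
  const tool = tools[name];                       // tool lookup
  if (tool === undefined) throw new Error(`Unknown tool: ${name}`);
  emit({ type: 'executing', tool: name });        // event before execution
  const output = tool.execute.bind(agent)(input); // context binding
  emit({ type: 'executed', tool: name, output }); // event after execution
  return output;
}

const tools: { [name: string]: Tool } = {
  click: {
    description: 'Click an element by index',
    execute(this: AgentContext, input: { index: number }): string {
      return this.pageController.click(input.index);
    },
  },
};

const events: Activity[] = [];
const agent: AgentContext = {
  pageController: { click: (index: number) => `clicked element ${index}` },
};
const result = runTool(agent, tools, 'click', { index: 7 }, (activity) => {
  events.push(activity);
});
console.log(result, events.map((e) => e.type).join(' -> '));
```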

## State Management and History Tracking

All observations collected during `#handleObservations` are flushed into `this.history` after each step (lines **[38‑45](https://github.com/alibaba/page-agent/blob/main/packages/core/src/PageAgentCore.ts#L38-L45)**). This history array serves as the agent's long-term memory, providing the LLM with a complete record of previous reflections, actions, and outputs when assembling subsequent prompts. The system also tracks remaining steps to prevent infinite loops, emitting warnings as the agent approaches its configured limit.
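A minimal sketch of flushing pending observations into history, together with a remaining-steps warning, is shown below. The `HistoryEntry` shape, both helper functions, the warning wording, and the threshold of five steps are assumptions chosen for illustration; the repository's actual values are not shown in this article.

```typescript
// Hypothetical history entry types.
type HistoryEntry =
  | { type: 'observation'; content: string }
  | { type: 'step'; next_goal: string; action: string; output: string };

// Returns a warning once few steps remain; the threshold of 5 is arbitrary.
function remainingStepsWarning(step: number, maxSteps: number): string | null {
  const remaining = maxSteps - step;
  if (remaining > 0 && remaining <= 5) {
    return `Only ${remaining} step(s) left. Wrap up the task.`;
  }
  return null;
}

// Move every pending observation into history, then reset the buffer.
function flushIntoHistory(history: HistoryEntry[], pending: string[]): void {
  for (let i = 0; i < pending.length; i++) {
    history.push({ type: 'observation', content: pending[i] as string });
  }
  pending.length = 0;
}

const history: HistoryEntry[] = [];
const pending: string[] = ['Page navigated to /pricing'];
const warning = remainingStepsWarning(27, 30);
if (warning !== null) pending.push(warning);
flushIntoHistory(history, pending);
console.log(history.length, pending.length);
```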

## Implementation Example

To use the reflection-before-action loop in your own project:

```typescript
import { PageAgentCore } from '@page-agent/core';
import { PageController } from '@page-agent/page-controller';

// Initialize the browser controller
const controller = new PageController({ 
  headless: false,
  viewport: { width: 1280, height: 720 }
});

// Configure the agent
const agent = new PageAgentCore({
  pageController: controller,
  maxSteps: 30,           // Safety limit for the loop
  language: 'en',         // Prompt localization
});

// Execute with a natural language goal
agent.execute('Find the pricing page and extract the Enterprise plan cost')
  .then((result) => {
    console.log(`Success: ${result.success}`);
    console.log(`Steps taken: ${result.history.length}`);
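    result.history.forEach((step, i) => {
      // Skip history entries that are not agent steps (e.g. flushed observations)
      if (!('reflection' in step)) return;
      console.log(`${i + 1}. ${step.reflection.next_goal} → ${step.action.name}`);
    });
  })
  .catch(console.error);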
```

The agent is bound to a `PageController` and runs the reflection-before-action cycle until the task completes or the step limit is reached.

## Summary

- **PageAgentCore** implements a ReAct-style loop in which reflection precedes every browser action
- The **MacroTool** schema enforces structured output containing `evaluation_previous_goal`, `memory`, `next_goal`, and the selected `action`
- Observations are collected via `#handleObservations` and persisted in a history array that provides long-term context
- Actions are dispatched through a bound tool registry with full access to the `PageController` instance
- The loop terminates when the LLM selects the `done` tool or when the `maxSteps` limit is exceeded

## Frequently Asked Questions

### What is the reflection-before-action pattern in PageAgentCore?

The reflection-before-action pattern requires the LLM to output a structured reflection containing an evaluation of the previous step, short-term memory notes, and the next goal before selecting any browser action. This is enforced by the MacroTool schema in [`packages/core/src/PageAgentCore.ts`](https://github.com/alibaba/page-agent/blob/main/packages/core/src/PageAgentCore.ts), ensuring the agent reasons about its progress rather than reacting blindly to the DOM.

### How does the MacroTool enforce structured output from the LLM?

The MacroTool uses a Zod schema defined in `#packMacroTool` (lines **[62‑77](https://github.com/alibaba/page-agent/blob/main/packages/core/src/PageAgentCore.ts#L62-L77)**) that combines reflection fields with a union of single-key action objects, one per available tool. When invoking the LLM, PageAgentCore passes this tool definition with `tool_choice: 'required'`, forcing the model to return a JSON object matching the schema rather than freeform text.

### What happens when the agent reaches the maximum step limit?

If the internal step counter exceeds the `maxSteps` configuration (default or user-provided), the agent halts execution and returns an `ExecutionResult` indicating failure due to step exhaustion. During execution, the system emits observations warning that the remaining steps are low, allowing the LLM to prioritize completing the task within the constraint.

### How does PageAgentCore maintain context across multiple steps?

Context persistence is achieved through the `this.history` array, which stores every step's reflection data, action details, and output (lines **[38‑45](https://github.com/alibaba/page-agent/blob/main/packages/core/src/PageAgentCore.ts#L38-L45)**). When assembling the user prompt for subsequent iterations, the agent includes this history, giving the LLM full visibility into previous evaluations and actions taken.
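How recorded steps might be rendered back into the user prompt can be sketched as below. The tag layout, the `StepRecord` type, and the `renderHistory` function are hypothetical; the repository's actual prompt template is not reproduced in this article.

```typescript
interface StepRecord {
  reflection: { evaluation_previous_goal?: string; memory?: string; next_goal?: string };
  action: { name: string; input: unknown; output: string };
}

// Hypothetical formatter: one tagged block per recorded step.
function renderHistory(history: StepRecord[]): string {
  return history
    .map((step, i) => {
      const r = step.reflection;
      return [
        `<step_${i + 1}>`,
        `Evaluation: ${r.evaluation_previous_goal ?? 'n/a'}`,
        `Memory: ${r.memory ?? 'n/a'}`,
        `Next goal: ${r.next_goal ?? 'n/a'}`,
        `Action: ${step.action.name}(${JSON.stringify(step.action.input)})`,
        `Result: ${step.action.output}`,
        `</step_${i + 1}>`,
      ].join('\n');
    })
    .join('\n');
}

const rendered = renderHistory([
  {
    reflection: { next_goal: 'Open the pricing page' },
    action: { name: 'click', input: { index: 3 }, output: 'Clicked element 3' },
  },
]);
console.log(rendered);
```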