# How the extract_content Tool Handles Dynamic Content and Infinite Scroll Pages

> Learn how the extract_content tool efficiently handles dynamic content and infinite scroll pages by rebuilding a flat DOM snapshot on agent interaction commands for precise data capture.

- Repository: [Alibaba/page-agent](https://github.com/alibaba/page-agent)
- Tags: how-to-guide
- Published: 2026-03-09

---

**The extract_content workflow rebuilds a flat DOM snapshot via `PageController.updateTree()` whenever the agent issues scroll or interaction commands, requiring explicit re-extraction to capture dynamic or infinitely-scrolled content rather than using automatic detection.**

The `extract_content` capability in the alibaba/page-agent repository enables LLM agents to perceive web pages by serializing the DOM into a simplified representation. Unlike automatic monitoring systems, this tool employs a snapshot-based architecture that handles dynamic JavaScript updates and infinite scroll patterns through explicit agent commands. Understanding how extract_content manages these scenarios requires examining the `PageController` implementation and its DOM tree construction mechanics.

## The extract_content Architecture

The extract_content functionality is not a standalone function but rather a **DOM extraction workflow** orchestrated by `PageController.updateTree()`. According to the source code in [`packages/page-controller/src/PageController.ts`](https://github.com/alibaba/page-agent/blob/main/packages/page-controller/src/PageController.ts) (lines 71-78), this method constructs a flat DOM tree that powers the agent's page perception.

When the LLM requests the browser state, `PageController.getBrowserState()` invokes `updateTree()` to perform three critical operations:

1. **Tree Construction**: Builds a flat DOM tree representation of the current page state
2. **HTML Serialization**: Converts the tree into a simplified HTML string stored in `this.simplifiedHTML`
3. **Element Mapping**: Creates an `elementTextMap` that indexes elements for interaction targeting
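The three operations above can be sketched in a few lines. This is a minimal illustration rather than the repository's code: the `SketchNode` shape, the `buildSnapshot` helper, and the `[index]<tag>` output format are assumptions made for the example, and only the names `simplifiedHTML` and `elementTextMap` come from the description above.

```typescript
// Hypothetical node shape for illustration; the real flat tree in page-agent may differ.
interface SketchNode {
  tag: string;
  text: string;
  interactive: boolean;
  children: SketchNode[];
}

interface Snapshot {
  simplifiedHTML: string;
  elementTextMap: Map<number, string>;
}

// Walk the tree depth-first, emit one line per visible node,
// and assign an index to every interactive element.
function buildSnapshot(root: SketchNode): Snapshot {
  const lines: string[] = [];
  const elementTextMap = new Map<number, string>();
  let nextIndex = 0;

  const visit = (node: SketchNode, depth: number): void => {
    const indent = "\t".repeat(depth);
    if (node.interactive) {
      const index = nextIndex++;
      elementTextMap.set(index, node.text);
      lines.push(`${indent}[${index}]<${node.tag}>${node.text}</${node.tag}>`);
    } else if (node.text.length > 0) {
      lines.push(`${indent}${node.text}`);
    }
    for (const child of node.children) visit(child, depth + 1);
  };

  visit(root, 0);
  return { simplifiedHTML: lines.join("\n"), elementTextMap };
}

const page: SketchNode = {
  tag: "div",
  text: "",
  interactive: false,
  children: [
    { tag: "h1", text: "Feed", interactive: false, children: [] },
    { tag: "button", text: "Load more", interactive: true, children: [] },
  ],
};
const snapshot = buildSnapshot(page);
```

Here `snapshot.elementTextMap` maps index `0` to the "Load more" button, which is the kind of handle an index-based tool such as `click_element_by_index` needs in order to target an element.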

The extraction scope respects the **`VIEWPORT_EXPANSION`** setting defined in [`packages/page-controller/src/constants.ts`](https://github.com/alibaba/page-agent/blob/main/packages/page-controller/src/constants.ts) (lines 15-17). By default, this value is `-1`, indicating full-page extraction rather than viewport-limited capture.

## Handling Dynamic Content Updates

The framework does **not implement automatic DOM monitoring** for dynamic content. After JavaScript operations modify the page—such as AJAX loads, lazy-loaded components, or reactive framework updates—the previously extracted DOM becomes stale.

To capture these changes, the agent must explicitly trigger re-extraction. The `updateTree()` method runs again when the agent invokes specific tools that internally call it, including:

- **`click_element_by_index`**
- **`scroll`**
- Any other interaction tool that modifies page state

This design requires agents to recognize when content has changed and proactively request a fresh snapshot, ensuring the `simplifiedHTML` accurately reflects the current DOM state.
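To make the staleness problem concrete, the following sketch models a snapshot that falls out of date after a simulated AJAX update. `FakePage` and `SnapshotController` are hypothetical stand-ins written for this example. They mirror only the idea of `updateTree()` storing its result in `simplifiedHTML`, not the actual implementation.

```typescript
// Hypothetical stand-in for a page whose DOM changes after the snapshot is taken.
class FakePage {
  items: string[] = ["item-1", "item-2"];
}

// Minimal snapshot holder; it mirrors the idea of updateTree(), not its real code.
class SnapshotController {
  simplifiedHTML = "";
  private page: FakePage;

  constructor(page: FakePage) {
    this.page = page;
  }

  // Rebuild the serialized view from whatever the page contains right now.
  updateTree(): string {
    this.simplifiedHTML = this.page.items
      .map((item, i) => `[${i}]<li>${item}</li>`)
      .join("\n");
    return this.simplifiedHTML;
  }
}

const fakePage = new FakePage();
const controller = new SnapshotController(fakePage);
const before = controller.updateTree();

// Simulate an AJAX response appending a node after the snapshot was built.
fakePage.items.push("item-3");
const stale = controller.simplifiedHTML; // unchanged: still the old snapshot

// Only an explicit refresh picks up the new node.
const fresh = controller.updateTree();
```

Until the second `updateTree()` call, `stale` is identical to `before` and contains no trace of `item-3`, which is exactly the gap an agent has to close by re-extracting.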

## Managing Infinite Scroll Pages

For infinite scroll implementations, the extractor only processes nodes present in the DOM at the moment of extraction. alibaba/page-agent therefore handles infinite scroll through a **scroll-then-re-extract** pattern rather than automatic detection.

### The Scroll Tool Mechanism

The scroll functionality is defined in [`packages/core/src/tools/index.ts`](https://github.com/alibaba/page-agent/blob/main/packages/core/src/tools/index.ts) (lines 31-41) and forwards requests to `PageController.scroll()`. This method ultimately delegates to **`scrollVertically`** or **`scrollHorizontally`** depending on the scroll direction parameters.
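The delegation step might look roughly like the sketch below. The helper names `scrollVertically` and `scrollHorizontally` come from the description above, but their signatures, the `ScrollRequest` shape, and the `dispatchScroll` function are assumptions made for illustration and may not match the repository.

```typescript
type ScrollDirection = "up" | "down" | "left" | "right";

// Hypothetical request shape; the real tool parameters may differ.
interface ScrollRequest {
  direction: ScrollDirection;
  amount: number; // pixels, non-negative in this sketch
}

// Hypothetical backend exposing the two axis-specific helpers.
interface ScrollBackend {
  scrollVertically(pixels: number): void;
  scrollHorizontally(pixels: number): void;
}

// Map a direction to an axis and a signed pixel delta, then delegate.
function dispatchScroll(request: ScrollRequest, backend: ScrollBackend): void {
  const { direction, amount } = request;
  switch (direction) {
    case "down":
      backend.scrollVertically(amount);
      break;
    case "up":
      backend.scrollVertically(-amount);
      break;
    case "right":
      backend.scrollHorizontally(amount);
      break;
    case "left":
      backend.scrollHorizontally(-amount);
      break;
  }
}

// Recording backend used to observe which helper was chosen.
const calls: string[] = [];
const recorder: ScrollBackend = {
  scrollVertically: (pixels) => {
    calls.push(`vertical:${pixels}`);
  },
  scrollHorizontally: (pixels) => {
    calls.push(`horizontal:${pixels}`);
  },
};

dispatchScroll({ direction: "down", amount: 800 }, recorder);
dispatchScroll({ direction: "left", amount: 200 }, recorder);
```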

### Scroll-Then-Re-extract Workflow

To handle infinite scroll pages effectively:

1. **Initial Extraction**: Capture the first batch of content via `updateTree()`
2. **Scroll Execution**: Use the `scroll` tool to scroll the page or specific scrollable element, triggering the infinite scroll loader
3. **Re-extraction**: Invoke `updateTree()` again (or request browser state) to capture newly appended DOM nodes in the updated `simplifiedHTML`

This manual loop ensures the agent only processes actually rendered content, preventing hallucinations about elements that exist in the page logic but not yet in the DOM.
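In practice the three steps are usually repeated until a pass yields nothing new. The sketch below shows one way to bound such a loop. The `ScrollableSnapshotSource` interface and the in-memory feed are hypothetical and stand in for the controller's scroll and extraction calls, so the stopping rule ("no new items after a scroll") is an assumption that suits simple feeds and may need a wait step on slow-loading pages.

```typescript
// Hypothetical minimal surface; the real PageController methods and argument shapes may differ.
interface ScrollableSnapshotSource {
  scrollDown(): Promise<void>;
  snapshot(): Promise<string[]>; // identifiers of items currently present in the DOM
}

// Scroll and re-extract until a pass yields no new items or the round budget runs out.
async function collectAllItems(
  source: ScrollableSnapshotSource,
  maxRounds: number,
): Promise<string[]> {
  const seen = new Set<string>(await source.snapshot());
  for (let round = 0; round < maxRounds; round++) {
    await source.scrollDown();
    const sizeBefore = seen.size;
    for (const id of await source.snapshot()) seen.add(id);
    if (seen.size === sizeBefore) break; // nothing new was appended: assume end of feed
  }
  return Array.from(seen);
}

// In-memory feed that starts with two items and reveals two more per scroll.
function makeFakeFeed(total: number): ScrollableSnapshotSource {
  let loaded = Math.min(total, 2);
  return {
    async scrollDown() {
      loaded = Math.min(total, loaded + 2);
    },
    async snapshot() {
      return Array.from({ length: loaded }, (_, i) => `item-${i + 1}`);
    },
  };
}
```

The `maxRounds` budget matters because a truly endless feed would otherwise keep the agent scrolling indefinitely.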

## Technical Considerations

### Mask Overlay Handling

During the extraction process, `updateTree()` temporarily disables the visual **mask overlay** if enabled (see lines 80-84 in [`PageController.ts`](https://github.com/alibaba/page-agent/blob/main/PageController.ts)). This prevents pointer-event restrictions from blocking the `elementFromPoint` calls used during tree construction, ensuring accurate element indexing and coordinate mapping.
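A common way to implement a temporary bypass like this is to set the overlay's `pointer-events` to `none` for the duration of the extraction and restore it afterwards. The sketch below assumes that approach. `OverlayLike` and `withMaskBypassed` are hypothetical names, and whether the repository restores the value in the same way is not confirmed by the description above.

```typescript
// Hypothetical overlay shape; only the style field used here is modeled.
interface OverlayLike {
  style: { pointerEvents: string };
}

// Run `extract` with the overlay made transparent to hit-testing, then restore it.
function withMaskBypassed<T>(mask: OverlayLike | null, extract: () => T): T {
  if (mask === null) return extract();
  const previous = mask.style.pointerEvents;
  mask.style.pointerEvents = "none";
  try {
    return extract();
  } finally {
    mask.style.pointerEvents = previous; // restore even if extraction throws
  }
}
```

Wrapping the extraction in `try`/`finally` keeps the overlay from being left disabled if tree construction fails midway.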

### Viewport Configuration

While the default `VIEWPORT_EXPANSION` of `-1` captures the full page, agents can configure this setting to limit extraction to specific viewport segments. However, for infinite scroll scenarios, full-page extraction is typically required to capture all loaded content batches.
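The effect of the setting can be illustrated with a small filter. This sketch assumes that `-1` disables filtering and that a non-negative value acts as a pixel margin around the viewport. That interpretation, the `Box` type, and `isInExpandedViewport` are assumptions for the example rather than details confirmed by `constants.ts`.

```typescript
// Vertical extent of an element relative to the top of the viewport, in pixels.
interface Box {
  top: number;
  bottom: number;
}

// Assumption: -1 means "include everything"; otherwise `expansion` is a pixel
// margin added above and below the visible viewport.
function isInExpandedViewport(box: Box, viewportHeight: number, expansion: number): boolean {
  if (expansion === -1) return true;
  return box.bottom >= -expansion && box.top <= viewportHeight + expansion;
}

// An element 2000px below the top of an 800px-tall viewport.
const farBelow: Box = { top: 2000, bottom: 2100 };
const includedByDefault = isInExpandedViewport(farBelow, 800, -1);
const includedWithNoMargin = isInExpandedViewport(farBelow, 800, 0);
const includedWithLargeMargin = isInExpandedViewport(farBelow, 800, 1500);
```

Under these assumptions, the element is kept with the default `-1` and with a 1500px margin, but dropped when the margin is `0`, which is why a restricted setting can hide batches that infinite scroll has already loaded.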

## Practical Implementation Example

The following pattern demonstrates the scroll-then-extract loop for infinite scroll pages:

```typescript
// Initial extraction
const initialState = await pageController.getBrowserState();
// Process first batch of items...

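// Scroll to trigger infinite scroll loading
// (argument shape is illustrative; check PageController.scroll() for the exact options)
await pageController.scroll({ direction: 'down', amount: 800 });

// Re-extract to capture newly loaded content
await pageController.updateTree();
// updateTree() stores the serialized result on the controller,
// so process additional items from pageController.simplifiedHTML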
```

## Summary

- **extract_content is a snapshot workflow**, not a continuous stream, implemented via `PageController.updateTree()` in [`PageController.ts`](https://github.com/alibaba/page-agent/blob/main/PageController.ts)
- **Dynamic content requires explicit refresh**; the framework does not auto-update the DOM representation after JavaScript changes
- **Infinite scroll uses scroll-then-re-extract loops**, utilizing the `scroll` tool defined in [`tools/index.ts`](https://github.com/alibaba/page-agent/blob/main/tools/index.ts) followed by `updateTree()` calls
- **Full-page extraction is default** via `VIEWPORT_EXPANSION: -1` in [`constants.ts`](https://github.com/alibaba/page-agent/blob/main/constants.ts)
- **Mask overlays are temporarily disabled** during extraction (lines 80-84) to ensure accurate `elementFromPoint` execution

## Frequently Asked Questions

### Does extract_content automatically detect when new content loads via AJAX?

No, extract_content does not implement automatic detection for AJAX or dynamic content updates. The `PageController` maintains a static snapshot created by `updateTree()`. After JavaScript modifies the DOM, the agent must explicitly invoke `updateTree()` again or call `getBrowserState()` to refresh the simplified HTML representation.

### How do I handle infinite scroll pages with the extract_content tool?

Handle infinite scroll by executing a **scroll-then-re-extract** sequence. First, use the `scroll` tool (which calls `PageController.scroll()` and delegates to `scrollVertically`) to load more content. Then, call `updateTree()` or request the browser state again to capture the newly appended DOM nodes in the updated extraction.

### What is the VIEWPORT_EXPANSION setting and how does it affect extraction?

**`VIEWPORT_EXPANSION`** is a configuration constant defined in [`packages/page-controller/src/constants.ts`](https://github.com/alibaba/page-agent/blob/main/packages/page-controller/src/constants.ts) (lines 15-17) that controls the extraction scope. The default value of `-1` enables full-page extraction, while other values can limit the snapshot to specific viewport areas. For infinite scroll pages, maintaining the default full-page setting ensures all loaded content is captured.

### Why does the extraction process disable mask overlays temporarily?

During `updateTree()` execution (lines 80-84 in [`PageController.ts`](https://github.com/alibaba/page-agent/blob/main/PageController.ts)), the controller temporarily disables visual mask overlays to prevent interference with `elementFromPoint` calls. This ensures accurate element identification and coordinate mapping during DOM tree construction, particularly when interactive elements are layered under visual overlays.