How the Page-Agent LLM Client Handles Retries and Error Recovery

March 9, 2026 alibaba/page-agent

The LLM client in the @page-agent/llms package implements a resilient retry mechanism through the withRetry helper, which respects abort signals, classifies errors via the InvokeError class, and emits DOM events to notify UI components of retry attempts and final failures.

The alibaba/page-agent repository provides a modular LLM integration layer where the LLM class serves as a thin wrapper around concrete clients like OpenAIClient. Understanding how this package handles transient failures is critical for building robust agent applications that can recover from network issues, rate limits, and model-level errors without manual intervention.

The Three-Component Retry Architecture

The retry strategy relies on three cooperating systems defined across the package:

Configuration (constants.ts): Supplies the default maxRetries value (2) and model parameters, which parseLLMConfig in packages/llms/src/index.ts applies to the user-supplied configuration
Error Classification (errors.ts): The InvokeError class determines retry eligibility via its isRetryable method (lines 24-57)
Retry Driver (index.ts): The withRetry async helper executes the request loop and manages backoff timing (lines 76-112)
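As a rough illustration of how a default retry count can be merged into user configuration, the sketch below assumes a constant named LLM_MAX_RETRIES (the name this article attributes to constants.ts) with the value 2. The RetryConfigInput shape and the resolveMaxRetries and totalAttempts helpers are hypothetical; the real parseLLMConfig may validate or shape its input differently.

```typescript
// Default value as stated in this article; the real constant lives in constants.ts.
const LLM_MAX_RETRIES = 2;

// Hypothetical config shape: only the field relevant to retries is modeled.
interface RetryConfigInput {
  maxRetries?: number;
}

// Hypothetical helper, not part of @page-agent/llms.
function resolveMaxRetries(input: RetryConfigInput): number {
  // `??` keeps an explicit 0 (meaning "no retries"), which `||` would discard.
  return input.maxRetries ?? LLM_MAX_RETRIES;
}

// Total attempts = the initial call plus every allowed retry.
function totalAttempts(maxRetries: number): number {
  return maxRetries + 1;
}

console.log(resolveMaxRetries({}));                               // 2
console.log(resolveMaxRetries({ maxRetries: 0 }));                // 0
console.log(totalAttempts(resolveMaxRetries({ maxRetries: 3 }))); // 4
```

Using nullish coalescing here is a deliberate choice: a caller who sets maxRetries to 0 to disable retries should not silently get the default back.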

The Request Flow and Retry Loop

When consuming code calls LLM.invoke() in packages/llms/src/index.ts, the method delegates execution to the withRetry helper.

Wrapping the Client Invocation

The invoke method constructs an anonymous async function that checks abort signals before delegating to the underlying client:

async invoke(
  messages: Message[],
  tools: Record<string, Tool>,
  abortSignal: AbortSignal,
  options?: InvokeOptions
): Promise<InvokeResult> {
  return await withRetry(
    async () => {
      if (abortSignal.aborted) throw new Error('AbortError');
      return await this.client.invoke(messages, tools, abortSignal, options);
    },
    {
      maxRetries: this.config.maxRetries,
      onRetry: (attempt) => this.dispatchEvent(
        new CustomEvent('retry', { detail: { attempt, maxAttempts: this.config.maxRetries } })
      ),
      onError: (error) => this.dispatchEvent(
        new CustomEvent('error', { detail: { error } })
      ),
    }
  );
}
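One detail in the listing above is worth a closer look: new Error('AbortError') sets the error's message, not its name. The sketch below contrasts that guard with the standard AbortSignal.throwIfAborted() method, which rethrows the signal's abort reason. It is an observation about the snippet as printed, not a claim about how the package behaves at runtime; thrownName is a hypothetical helper written only for this comparison.

```typescript
// Returns the `name` of whatever a guard throws, or null if it does not throw.
function thrownName(guard: () => void): string | null {
  try {
    guard();
    return null;
  } catch (error) {
    return String((error as { name?: unknown }).name);
  }
}

const controller = new AbortController();
controller.abort(); // no custom reason, so the default abort reason is used

// Guard written as in the listing above: 'AbortError' is the message.
const fromPlainError = thrownName(() => {
  if (controller.signal.aborted) throw new Error('AbortError');
});

// Standard guard: rethrows signal.reason, a DOMException named 'AbortError'.
const fromSignal = thrownName(() => controller.signal.throwIfAborted());

console.log(fromPlainError); // 'Error'
console.log(fromSignal);     // 'AbortError'
```

Code that detects aborts by comparing a name property to 'AbortError' will therefore treat the two guards differently.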

The withRetry Implementation

The withRetry function implements a while-loop that attempts the operation up to maxRetries + 1 times:

async function withRetry<T>(fn: () => Promise<T>, settings: { … }): Promise<T> {
  let attempt = 0;
  let lastError: Error | null = null;
  while (attempt <= settings.maxRetries) {
    if (attempt > 0) {
      settings.onRetry(attempt);
      await new Promise(r => setTimeout(r, 100));
    }
    try {
      return await fn();
    } catch (error: unknown) {
      if ((error as any)?.rawError?.name === 'AbortError') throw error;

      console.error(error);
      settings.onError(error as Error);

      if (error instanceof InvokeError && !error.retryable) throw error;

      lastError = error as Error;
      attempt++;
      await new Promise(r => setTimeout(r, 100));
    }
  }
  throw lastError!;
}

Key behaviors include:

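Immediate abort: If the caught error's rawError.name is AbortError, the error is rethrown at once, before onError fires and without consuming a retry attempt
Fixed delay: The listing waits 100ms after a failed attempt and another 100ms before the next one, so consecutive attempts are roughly 200ms apart, and the delay does not grow between attempts
Non-retryable exit: An InvokeError whose retryable flag is false is rethrown after onError fires, skipping any remaining attempts
Final error propagation: After exhausting all retries, the last captured error is re-thrown to the caller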
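To see the attempt counting in isolation, here is a simplified, self-contained loop in the same spirit. It is a sketch, not the package's code: the RetrySettings shape is inferred from how the listing uses its settings argument, delayMs is a hypothetical knob added so the demo runs quickly, and the abort and non-retryable checks are omitted.

```typescript
// Settings shape inferred from the article's listing (an assumption).
interface RetrySettings {
  maxRetries: number;
  onRetry: (attempt: number) => void;
  onError: (error: Error) => void;
  delayMs?: number; // hypothetical; the article's listing hardcodes 100
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function retryingCall<T>(fn: () => Promise<T>, settings: RetrySettings): Promise<T> {
  const delayMs = settings.delayMs ?? 100;
  let lastError: Error | null = null;
  for (let attempt = 0; attempt <= settings.maxRetries; attempt++) {
    if (attempt > 0) {
      settings.onRetry(attempt);
      await sleep(delayMs); // single wait before each retry
    }
    try {
      return await fn();
    } catch (error) {
      settings.onError(error as Error);
      lastError = error as Error;
    }
  }
  throw lastError!;
}

// Demo: a call that fails twice, then succeeds on the third attempt.
async function demo(): Promise<{ result: string; calls: number; retries: number[] }> {
  let calls = 0;
  const retries: number[] = [];
  const result = await retryingCall(
    async () => {
      calls++;
      if (calls < 3) throw new Error(`transient failure ${calls}`);
      return 'ok';
    },
    { maxRetries: 2, delayMs: 1, onRetry: (n) => retries.push(n), onError: () => {} },
  );
  return { result, calls, retries };
}

demo().then((outcome) => console.log(outcome));
```

With maxRetries set to 2, the third call is the last one allowed, so the demo succeeds on its final attempt and onRetry has been invoked with 1 and then 2.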

Error Classification Logic

The InvokeError class in packages/llms/src/errors.ts encapsulates error metadata and determines whether a failure warrants another attempt.

The isRetryable Method

The private isRetryable method returns false for explicit aborts and true for specific transient error types:

private isRetryable(type: InvokeErrorType, rawError?: unknown): boolean {
  const isAbortError = (rawError as any)?.name === 'AbortError';
  if (isAbortError) return false;

  const retryableTypes: InvokeErrorType[] = [
    InvokeErrorType.NETWORK_ERROR,
    InvokeErrorType.RATE_LIMIT,
    InvokeErrorType.SERVER_ERROR,
    InvokeErrorType.NO_TOOL_CALL,
    InvokeErrorType.INVALID_TOOL_ARGS,
    InvokeErrorType.TOOL_EXECUTION_ERROR,
    InvokeErrorType.UNKNOWN,
  ];
  return retryableTypes.includes(type);
}

Retryable errors include network failures, rate limiting (HTTP 429), server errors (5xx), and model-level issues such as missing tool calls, invalid tool arguments, or failed tool execution, any of which may resolve on re-invocation. Errors classified as UNKNOWN are retried as well.
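The classification rule can be exercised on its own with a standalone restatement. In the sketch below the member names come from the listing above, while the string values and the AUTH_ERROR member are assumptions added only to show a negative case; the package's actual enum may differ.

```typescript
// Member names from the article's listing; string values are assumed.
// AUTH_ERROR is hypothetical and exists only to show a non-retryable type.
const InvokeErrorType = {
  NETWORK_ERROR: 'network_error',
  RATE_LIMIT: 'rate_limit',
  SERVER_ERROR: 'server_error',
  NO_TOOL_CALL: 'no_tool_call',
  INVALID_TOOL_ARGS: 'invalid_tool_args',
  TOOL_EXECUTION_ERROR: 'tool_execution_error',
  UNKNOWN: 'unknown',
  AUTH_ERROR: 'auth_error',
} as const;
type InvokeErrorType = (typeof InvokeErrorType)[keyof typeof InvokeErrorType];

const RETRYABLE_TYPES = new Set<InvokeErrorType>([
  InvokeErrorType.NETWORK_ERROR,
  InvokeErrorType.RATE_LIMIT,
  InvokeErrorType.SERVER_ERROR,
  InvokeErrorType.NO_TOOL_CALL,
  InvokeErrorType.INVALID_TOOL_ARGS,
  InvokeErrorType.TOOL_EXECUTION_ERROR,
  InvokeErrorType.UNKNOWN,
]);

function isRetryable(type: InvokeErrorType, rawError?: unknown): boolean {
  // An abort always wins, whatever the nominal error type is.
  const name = (rawError as { name?: unknown } | undefined)?.name;
  if (name === 'AbortError') return false;
  return RETRYABLE_TYPES.has(type);
}

console.log(isRetryable(InvokeErrorType.RATE_LIMIT));                             // true
console.log(isRetryable(InvokeErrorType.AUTH_ERROR));                             // false
console.log(isRetryable(InvokeErrorType.NETWORK_ERROR, { name: 'AbortError' })); // false
```

The last call shows why the abort check comes first: a network-level failure caused by a user abort would otherwise be retried.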

UI Integration and Event Dispatch

The LLM class extends EventTarget, enabling the retry mechanism to communicate state changes to the user interface.

Event Types

Two custom events fire during the retry lifecycle:

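'retry': Emitted before each retry attempt, with attempt and maxAttempts in the detail object (maxAttempts carries the configured maxRetries value)
'error': Emitted for every failed attempt, including non-retryable failures; aborts are rethrown before this event fires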

The panel implementation in packages/ui/src/panel/Panel.ts (lines 621-623) subscribes to these events to render progress indicators and error notifications.

Subscribing to Retry Events

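const llm = new LLM(config);

// EventTarget listeners receive a plain Event, so narrow to CustomEvent to read detail.
llm.addEventListener('retry', (e) => {
  const { attempt, maxAttempts } = (e as CustomEvent).detail;
  console.log(`Retrying LLM request (${attempt}/${maxAttempts})`);
});
llm.addEventListener('error', (e) => {
  // Fires for each failed attempt, not only the final one.
  console.error('LLM attempt failed:', (e as CustomEvent).detail.error);
});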
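The subscription pattern can also be tried without the package installed. The sketch below is a stand-in, not the LLM class: FakeClient and RetryEvent are hypothetical names, and a small Event subclass is used in place of CustomEvent so the payload is typed.

```typescript
interface RetryDetail {
  attempt: number;
  maxAttempts: number;
}

// Event subclass carrying a typed payload; stands in for CustomEvent here.
class RetryEvent extends Event {
  readonly detail: RetryDetail;
  constructor(detail: RetryDetail) {
    super('retry');
    this.detail = detail;
  }
}

// Hypothetical stand-in for the LLM class: only the event plumbing is modeled.
class FakeClient extends EventTarget {
  simulateRetry(attempt: number, maxAttempts: number): void {
    this.dispatchEvent(new RetryEvent({ attempt, maxAttempts }));
  }
}

const client = new FakeClient();
const seen: string[] = [];

client.addEventListener('retry', (e) => {
  const { attempt, maxAttempts } = (e as RetryEvent).detail;
  seen.push(`${attempt}/${maxAttempts}`);
});

// dispatchEvent runs listeners synchronously, so `seen` is filled right away.
client.simulateRetry(1, 2);
client.simulateRetry(2, 2);
console.log(seen); // ['1/2', '2/2']
```

Because dispatch is synchronous, a slow listener delays the retry loop itself, which is a reason to keep UI handlers lightweight.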

Complete Implementation Example

The following example demonstrates configuring custom retry limits and handling the InvokeError type:

import { LLM, InvokeError } from '@page-agent/llms';

const llm = new LLM({
  baseURL: 'https://api.openai.com/v1',
  apiKey: 'YOUR_API_KEY',
  model: 'gpt-4o-mini',
  maxRetries: 3,               // Override default of 2
});

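llm.addEventListener('retry', (e) => {
  // Listeners receive a plain Event, so narrow to CustomEvent to read detail.
  const { attempt, maxAttempts } = (e as CustomEvent).detail;
  console.info(`LLM retry ${attempt}/${maxAttempts}`);
});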

(async () => {
  try {
    const result = await llm.invoke(
      [{ role: 'user', content: 'What is the capital of France?' }],
      {},
      new AbortController().signal
    );
    console.log('Success:', result);
  } catch (e) {
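    if (e instanceof InvokeError && !e.retryable) {
      // Non-retryable: withRetry rethrew without using the remaining attempts.
      console.error('Fatal InvokeError:', e.type);
    } else {
      // A retryable error that exhausted every attempt, or a non-InvokeError.
      console.error('Request failed after retries:', e);
    }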
  }
})();

Summary

Default retry limit: The LLM class defaults to 2 retries (configurable via maxRetries), defined in packages/llms/src/constants.ts
Error classification: InvokeError uses the isRetryable method to filter transient failures (network, rate limits, server errors) from fatal ones (abort signals)
Retry implementation: The withRetry helper in packages/llms/src/index.ts manages the attempt loop with 100ms fixed delays and abort signal checking
Event-driven UI: The class dispatches 'retry' and 'error' events, allowing UI components in packages/ui/src/panel/Panel.ts to display real-time status updates
Immediate aborts: Abort signals short-circuit the retry loop immediately without consuming retry budget

Frequently Asked Questions

How many retries does the Page-Agent LLM client attempt by default?

The default configuration specifies 2 retries, meaning the client will attempt the request up to 3 times total (initial attempt plus 2 retries). This value is defined as LLM_MAX_RETRIES in packages/llms/src/constants.ts and can be overridden via the maxRetries parameter in the LLM configuration object.

What types of errors trigger a retry versus immediate failure?

The InvokeError class in packages/llms/src/errors.ts classifies NETWORK_ERROR, RATE_LIMIT, SERVER_ERROR, NO_TOOL_CALL, INVALID_TOOL_ARGS, TOOL_EXECUTION_ERROR, and UNKNOWN as retryable. Conversely, errors whose rawError is an AbortError and any InvokeError with the retryable flag set to false cause withRetry in packages/llms/src/index.ts to exit the loop immediately and propagate the error to the caller.

How does the retry mechanism communicate with the UI layer?

The LLM class extends EventTarget and dispatches two custom DOM events: 'retry' (containing the current attempt number and maximum attempts) and 'error' (containing the error details). The panel component in packages/ui/src/panel/Panel.ts listens to these events to render retry progress indicators and error notifications to the user.

Can the delay between retry attempts be configured?

No. The withRetry function hardcodes a 100ms delay via setTimeout in packages/llms/src/index.ts, and the settings object that invoke passes to it carries only maxRetries, onRetry, and onError. Supporting exponential backoff or a configurable delay would require changing withRetry itself.
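For readers who want to experiment with a different policy in their own wrapper, a capped exponential backoff could look like the sketch below. The backoffDelay function and its baseMs and capMs parameters are hypothetical and not part of @page-agent/llms.

```typescript
// Hypothetical delay policy; not part of @page-agent/llms.
// `attempt` is 1-based: 1 -> base, 2 -> 2x base, 3 -> 4x base, and so on.
function backoffDelay(attempt: number, baseMs = 100, capMs = 2000): number {
  const exponential = baseMs * 2 ** (attempt - 1);
  // The cap keeps late retries from waiting unboundedly long.
  return Math.min(exponential, capMs);
}

const schedule = [1, 2, 3, 4, 5, 6].map((attempt) => backoffDelay(attempt));
console.log(schedule); // [100, 200, 400, 800, 1600, 2000]
```

With the default maxRetries of 2, only the first two entries of this schedule would ever be used, so the practical difference from a fixed delay is small unless the retry limit is raised.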
