How the Page-Agent LLM Client Handles Retries and Error Recovery
The LLM client in the @page-agent/llms package implements a resilient retry mechanism through the withRetry helper, which respects abort signals, classifies errors via the InvokeError class, and emits DOM events to notify UI components of retry attempts and final failures.
The alibaba/page-agent repository provides a modular LLM integration layer where the LLM class serves as a thin wrapper around concrete clients like OpenAIClient. Understanding how this package handles transient failures is critical for building robust agent applications that can recover from network issues, rate limits, and model-level errors without manual intervention.
The Three-Component Retry Architecture
The retry strategy relies on three cooperating systems defined across the package:
- Configuration (constants.ts): Supplies the default maxRetries value (2) and model parameters through parseLLMConfig in packages/llms/src/index.ts
- Error Classification (errors.ts): The InvokeError class determines retry eligibility via its isRetryable method (lines 24-57)
- Retry Driver (index.ts): The withRetry async helper executes the request loop and manages backoff timing (lines 76-112)
The Request Flow and Retry Loop
When consuming code calls LLM.invoke() in packages/llms/src/index.ts, the method delegates execution to the withRetry helper.
Wrapping the Client Invocation
The invoke method constructs an anonymous async function that checks abort signals before delegating to the underlying client:
async invoke(
  messages: Message[],
  tools: Record<string, Tool>,
  abortSignal: AbortSignal,
  options?: InvokeOptions
): Promise<InvokeResult> {
  return await withRetry(
    async () => {
      if (abortSignal.aborted) throw new Error('AbortError');
      return await this.client.invoke(messages, tools, abortSignal, options);
    },
    {
      maxRetries: this.config.maxRetries,
      onRetry: (attempt) => this.dispatchEvent(
        new CustomEvent('retry', { detail: { attempt, maxAttempts: this.config.maxRetries } })
      ),
      onError: (error) => this.dispatchEvent(
        new CustomEvent('error', { detail: { error } })
      ),
    }
  );
}
The withRetry Implementation
The withRetry function implements a while-loop that attempts the operation up to maxRetries + 1 times:
async function withRetry<T>(fn: () => Promise<T>, settings: { … }): Promise<T> {
  let attempt = 0;
  let lastError: Error | null = null;
  while (attempt <= settings.maxRetries) {
    if (attempt > 0) {
      settings.onRetry(attempt);
      await new Promise(r => setTimeout(r, 100));
    }
    try {
      return await fn();
    } catch (error: unknown) {
      if ((error as any)?.rawError?.name === 'AbortError') throw error;
      console.error(error);
      settings.onError(error as Error);
      if (error instanceof InvokeError && !error.retryable) throw error;
      lastError = error as Error;
      attempt++;
      await new Promise(r => setTimeout(r, 100));
    }
  }
  throw lastError!;
}
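Key behaviors include:
- Immediate abort: If the caught error's rawError.name is AbortError, the loop rethrows it immediately, without calling onError or consuming retry attempts
- Fixed backoff: Each failed attempt waits 100ms in the catch block, and each retry waits another 100ms before re-invoking, so consecutive attempts are roughly 200ms apart
- Non-retryable errors: An InvokeError whose retryable flag is false is rethrown right after onError fires, with no further attempts
- Final error propagation: After exhausting all retries, the last captured error is re-thrown to the caller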
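The attempt counting can be checked in isolation with a small stand-in. The sketch below is a simplified re-creation of the loop, not the package's code: the names retryLoop and makeFlaky and the configurable delayMs are illustrative assumptions, and the abort and InvokeError checks are left out to keep the focus on how attempts and onRetry calls line up.

```typescript
// Simplified stand-in for the retry helper; names and delayMs are illustrative.
type RetrySettings = {
  maxRetries: number;
  delayMs: number; // assumption: the loop shown above hardcodes 100 ms
  onRetry: (attempt: number) => void;
};

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function retryLoop<T>(fn: () => Promise<T>, settings: RetrySettings): Promise<T> {
  let attempt = 0;
  let lastError: unknown = null;
  while (attempt <= settings.maxRetries) {
    if (attempt > 0) {
      settings.onRetry(attempt); // fires once per retry, never for the first try
      await sleep(settings.delayMs);
    }
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      attempt++;
      await sleep(settings.delayMs);
    }
  }
  throw lastError;
}

// A fake operation that fails a fixed number of times before succeeding.
function makeFlaky(failures: number) {
  let calls = 0;
  const fn = async (): Promise<string> => {
    calls++;
    if (calls <= failures) throw new Error(`transient failure #${calls}`);
    return 'ok';
  };
  return { fn, callCount: () => calls };
}

// Two failures followed by a success, with a budget of two retries.
async function demo(): Promise<{ result: string; calls: number; retries: number[] }> {
  const flaky = makeFlaky(2);
  const retries: number[] = [];
  const result = await retryLoop(flaky.fn, {
    maxRetries: 2,
    delayMs: 1,
    onRetry: (attempt) => retries.push(attempt),
  });
  return { result, calls: flaky.callCount(), retries };
}
```

With two failures and maxRetries set to 2, the operation runs three times and onRetry is called with 1 and then 2, which matches the maxRetries + 1 bound described above.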
Error Classification Logic
The InvokeError class in packages/llms/src/errors.ts encapsulates error metadata and determines whether a failure warrants another attempt.
The isRetryable Method
The private isRetryable method returns false for explicit aborts and true for specific transient error types:
private isRetryable(type: InvokeErrorType, rawError?: unknown): boolean {
  const isAbortError = (rawError as any)?.name === 'AbortError';
  if (isAbortError) return false;
  const retryableTypes: InvokeErrorType[] = [
    InvokeErrorType.NETWORK_ERROR,
    InvokeErrorType.RATE_LIMIT,
    InvokeErrorType.SERVER_ERROR,
    InvokeErrorType.NO_TOOL_CALL,
    InvokeErrorType.INVALID_TOOL_ARGS,
    InvokeErrorType.TOOL_EXECUTION_ERROR,
    InvokeErrorType.UNKNOWN,
  ];
  return retryableTypes.includes(type);
}
Retryable errors include network failures, rate limiting (HTTP 429), server errors (5xx), and model-level issues like missing tool calls or invalid arguments that might resolve on re-invocation.
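One way to picture the classification step is a two-stage check: map the failure to a category, then look the category up in a retryable set. The sketch below is a hypothetical illustration. ErrorKind, classifyStatus, shouldRetry, and the client_error category are assumptions made for the example and are not the package's enum or API; only the 429 and 5xx mappings and the abort rule come from the description above.

```typescript
// Illustrative categories; these string values are assumptions, not the package's enum.
type ErrorKind = 'network_error' | 'rate_limit' | 'server_error' | 'client_error' | 'unknown';

// Categories treated as transient in this sketch.
const RETRYABLE_KINDS: ReadonlySet<ErrorKind> = new Set<ErrorKind>([
  'network_error',
  'rate_limit',
  'server_error',
  'unknown',
]);

// Hypothetical mapping from an HTTP status code to a category.
function classifyStatus(status: number): ErrorKind {
  if (status === 429) return 'rate_limit';
  if (status >= 500 && status <= 599) return 'server_error';
  if (status >= 400 && status <= 499) return 'client_error';
  return 'unknown';
}

// Mirrors the rule described above: aborts are never retried,
// everything else depends on the category.
function shouldRetry(kind: ErrorKind, rawError?: unknown): boolean {
  const name = (rawError as { name?: string } | undefined)?.name;
  if (name === 'AbortError') return false;
  return RETRYABLE_KINDS.has(kind);
}
```

Checking the abort condition before the category lookup means a user cancellation wins even when the underlying failure would otherwise look transient.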
UI Integration and Event Dispatch
The LLM class extends EventTarget, enabling the retry mechanism to communicate state changes to the user interface.
Event Types
Two custom events fire during the retry lifecycle:
- 'retry': Emitted on each retry attempt with attempt and maxAttempts in the detail object (maxAttempts carries the configured maxRetries value)
- 'error': Emitted for each caught error other than aborts, including non-retryable failures
The panel implementation in packages/ui/src/panel/Panel.ts (lines 621-623) subscribes to these events to render progress indicators and error notifications.
Subscribing to Retry Events
const llm = new LLM(config);

llm.addEventListener('retry', (e) => {
  // Listeners receive a plain Event type, so cast to reach the detail payload
  const { attempt, maxAttempts } = (e as CustomEvent).detail;
  console.log(`Retrying LLM request (${attempt}/${maxAttempts})`);
});

llm.addEventListener('error', (e) => {
  console.error('LLM failed:', (e as CustomEvent).detail.error);
});
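The same subscription pattern can be exercised without the real client. The following sketch uses hypothetical stand-ins: FakeLLM plays the role of the LLM class, and DetailEvent is a small Event subclass used in place of CustomEvent so the example does not depend on a global CustomEvent being available. trackStatus is likewise an illustrative name for the kind of bookkeeping a UI panel might do.

```typescript
// Minimal stand-in for CustomEvent: an Event carrying a typed detail payload.
class DetailEvent<T> extends Event {
  readonly detail: T;
  constructor(type: string, detail: T) {
    super(type);
    this.detail = detail;
  }
}

type RetryDetail = { attempt: number; maxAttempts: number };
type ErrorDetail = { error: Error };

// Hypothetical emitter standing in for the LLM class.
class FakeLLM extends EventTarget {
  emitRetry(attempt: number, maxAttempts: number): void {
    this.dispatchEvent(new DetailEvent<RetryDetail>('retry', { attempt, maxAttempts }));
  }
  emitError(error: Error): void {
    this.dispatchEvent(new DetailEvent<ErrorDetail>('error', { error }));
  }
}

// Collects human-readable status lines in the order events arrive.
function trackStatus(target: EventTarget): string[] {
  const lines: string[] = [];
  target.addEventListener('retry', (e) => {
    const { attempt, maxAttempts } = (e as DetailEvent<RetryDetail>).detail;
    lines.push(`retry ${attempt}/${maxAttempts}`);
  });
  target.addEventListener('error', (e) => {
    const { error } = (e as DetailEvent<ErrorDetail>).detail;
    lines.push(`error: ${error.message}`);
  });
  return lines;
}
```

Because dispatchEvent on an EventTarget runs listeners synchronously, the collected lines reflect the exact order in which errors and retries occurred.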
Complete Implementation Example
The following example demonstrates configuring custom retry limits and handling the InvokeError type:
import { LLM, InvokeError } from '@page-agent/llms';

const llm = new LLM({
  baseURL: 'https://api.openai.com/v1',
  apiKey: 'YOUR_API_KEY',
  model: 'gpt-4o-mini',
  maxRetries: 3, // Override default of 2
});

llm.addEventListener('retry', (e) => {
  const { attempt, maxAttempts } = (e as CustomEvent).detail;
  console.info(`LLM retry ${attempt}/${maxAttempts}`);
});

(async () => {
  try {
    const result = await llm.invoke(
      [{ role: 'user', content: 'What is the capital of France?' }],
      {}, // no tools
      new AbortController().signal
    );
    console.log('Success:', result);
  } catch (e) {
    if (e instanceof InvokeError && !e.retryable) {
      // Non-retryable: the loop gave up after the first failure
      console.error('Fatal InvokeError:', e.type);
    } else {
      // A retryable error that exhausted all attempts, or a non-InvokeError
      console.error('Retries exhausted or unexpected failure:', e);
    }
  }
})();
Summary
- Default retry limit: The LLM class defaults to 2 retries (configurable via maxRetries), defined in packages/llms/src/constants.ts
- Error classification: InvokeError uses the isRetryable method to filter transient failures (network, rate limits, server errors) from fatal ones (abort signals)
- Retry implementation: The withRetry helper in packages/llms/src/index.ts manages the attempt loop with 100ms fixed delays and abort signal checking
- Event-driven UI: The class dispatches 'retry' and 'error' events, allowing UI components in packages/ui/src/panel/Panel.ts to display real-time status updates
- Immediate aborts: Abort signals short-circuit the retry loop immediately without consuming retry budget
Frequently Asked Questions
How many retries does the Page-Agent LLM client attempt by default?
The default configuration specifies 2 retries, meaning the client will attempt the request up to 3 times total (initial attempt plus 2 retries). This value is defined as LLM_MAX_RETRIES in packages/llms/src/constants.ts and can be overridden via the maxRetries parameter in the LLM configuration object.
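The attempt count and the time spent sleeping follow directly from the loop shown earlier. The helper names below are illustrative, and the arithmetic assumes the structure of that loop: one 100ms sleep after every failed attempt and one more before every retry. Time spent inside the request itself is not included.

```typescript
// Total number of invocations when every attempt fails.
function totalAttempts(maxRetries: number): number {
  return maxRetries + 1; // initial attempt plus each retry
}

// Worst-case time spent sleeping, assuming every attempt fails.
function worstCaseSleepMs(maxRetries: number, delayMs = 100): number {
  const failures = maxRetries + 1; // one sleep in the catch block per failure
  const retries = maxRetries; // one sleep before each retry
  return (failures + retries) * delayMs;
}
```

For the default of 2 retries this gives 3 attempts and (3 + 2) × 100ms = 500ms of sleeping in the worst case.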
What types of errors trigger a retry versus immediate failure?
The InvokeError class classifies NETWORK_ERROR, RATE_LIMIT, SERVER_ERROR, NO_TOOL_CALL, INVALID_TOOL_ARGS, TOOL_EXECUTION_ERROR, and UNKNOWN as retryable. Conversely, explicit AbortError signals and any InvokeError with the retryable flag set to false will immediately exit the retry loop and propagate to the caller according to the logic in packages/llms/src/errors.ts.
How does the retry mechanism communicate with the UI layer?
The LLM class extends EventTarget and dispatches two custom DOM events: 'retry' (containing the current attempt number and maximum attempts) and 'error' (containing the error details). The panel component in packages/ui/src/panel/Panel.ts listens to these events to render retry progress indicators and error notifications to the user.
Can the delay between retry attempts be configured?
Not currently. The withRetry function hardcodes a 100ms delay via setTimeout in packages/llms/src/index.ts, and the settings that invoke passes to it are limited to maxRetries, onRetry, and onError. Supporting exponential backoff or a configurable delay strategy would require extending that settings parameter.
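For readers who want to experiment with a different strategy in their own wrapper, a capped exponential schedule is a common choice. The function below is a hypothetical sketch: backoffDelayMs, baseMs, and capMs are assumed names and are not part of the package's settings.

```typescript
// Hypothetical delay strategy; not part of the package's current settings.
function backoffDelayMs(attempt: number, baseMs = 100, capMs = 5000): number {
  // attempt is 1-based: 1 -> base, 2 -> 2x base, 3 -> 4x base, ...
  const exponent = Math.max(0, attempt - 1);
  // Cap the result so late retries do not wait unboundedly long.
  return Math.min(capMs, baseMs * 2 ** exponent);
}
```

With the defaults, the first three retries would wait 100ms, 200ms, and 400ms, and any retry whose uncapped delay exceeds 5000ms waits 5000ms instead.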