How to Handle Context Window Overflow Errors in pi-ai: Detection and Recovery Guide
Handle context window overflow errors in pi-ai by using the isContextOverflow utility to detect provider-specific overflow errors, then trigger auto-compaction to summarize the conversation history and retry the request.
When working with large language models in the badlogic/pi-mono repository, you will inevitably encounter context window overflow errors as conversations grow beyond a model's token capacity. The pi-ai package provides a robust, provider-agnostic mechanism to detect these overflows and automatically recover through intelligent conversation compaction.
Understanding Context Window Overflow Detection in pi-ai
The detection system resides in packages/ai/src/utils/overflow.ts and examines assistant responses for explicit error signals or silent token overruns.
How the isContextOverflow Function Works
The isContextOverflow(message, contextWindow?) function analyzes an AssistantMessage object to determine if a context window limit was breached. It returns true when either explicit error patterns match or when usage statistics exceed the known window size.
// Simplified logic from packages/ai/src/utils/overflow.ts
export function isContextOverflow(message: AssistantMessage, contextWindow?: number): boolean {
  // Check for explicit error-based overflow
  if (message.stopReason === "error" && message.errorMessage) {
    if (OVERFLOW_PATTERNS.some(p => p.test(message.errorMessage!))) return true;
    if (/^4(00|13)\s*(status code)?\s*\(no body\)/i.test(message.errorMessage)) return true;
  }
  // Check for silent overflow using usage data
  if (contextWindow && message.stopReason === "stop") {
    const inputTokens = message.usage.input + message.usage.cacheRead;
    if (inputTokens > contextWindow) return true;
  }
  return false;
}
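To see what the status-code fallback in the function above accepts, the short check below runs that same regular expression against a few sample strings. The sample strings and the names NO_BODY_PATTERN, samples and fallbackMatches are illustrative assumptions, not output captured from any provider.

```typescript
// The fallback regex quoted in the simplified isContextOverflow above.
const NO_BODY_PATTERN = /^4(00|13)\s*(status code)?\s*\(no body\)/i;

// Hypothetical error strings, made up for illustration.
const samples: string[] = [
  "400 (no body)",
  "413 status code (no body)",
  "429 (no body)",
  "Error: 400 (no body)",
];

// true where the fallback would treat the string as an overflow.
const fallbackMatches: boolean[] = samples.map((s) => NO_BODY_PATTERN.test(s));

console.log(fallbackMatches); // [ true, true, false, false ]
```

Only 400 and 413 qualify, and the leading ^ anchor means the status code must start the message, so a prefixed string such as the last sample is not matched.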
Provider-Specific Error Pattern Catalog
The OVERFLOW_PATTERNS array in overflow.ts contains regular expressions for major providers including Anthropic, OpenAI, Google Gemini, xAI, Groq, OpenRouter, llama.cpp, LM Studio, MiniMax, Kimi, and Cerebras. With this catalog, overflow errors from any of these providers are recognized through a single check, whichever API you are calling.
Silent Overflow Detection
Some providers (such as z.ai) return stopReason === "stop" rather than an error when truncating input. By passing the contextWindow parameter to isContextOverflow, you can detect these silent overflows by comparing usage.input + usage.cacheRead against the model's known limit.
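The comparison described above can be worked through with concrete numbers. This is a minimal sketch: UsageSketch and exceedsWindow are hypothetical names, and the token counts are invented for illustration. It reproduces only the arithmetic, not the stopReason check.

```typescript
// Minimal stand-in for the usage fields referenced above (assumed shape).
interface UsageSketch {
  input: number;
  cacheRead: number;
}

// Returns true when the reported prompt tokens exceed the context window.
function exceedsWindow(usage: UsageSketch, contextWindow: number): boolean {
  return usage.input + usage.cacheRead > contextWindow;
}

// Hypothetical numbers against a 128000-token window.
// 120000 + 15000 = 135000, which is above the window.
const overLimit = exceedsWindow({ input: 120000, cacheRead: 15000 }, 128000);
// 90000 + 15000 = 105000, which fits.
const underLimit = exceedsWindow({ input: 90000, cacheRead: 15000 }, 128000);
```

Here overLimit is true and underLimit is false. Counting cacheRead alongside input follows the source's formula, presumably because tokens read from the cache still occupy the window.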
Recovering from Context Window Overflow Errors
Once an overflow is detected, the coding agent's session management system recovers from it automatically.
Auto-Compaction in Agent Sessions
The packages/coding-agent/src/core/agent-session.ts file implements recovery logic in the _checkCompaction method. When isContextOverflow returns true, the system:
- Validates model consistency – Ensures the error originated from the current model to avoid false positives when switching between models with different window sizes.
- Strips the error message – Removes the offending assistant message from session history using replaceMessages(messages.slice(0, -1)) to prevent resending the problematic payload.
- Triggers compaction – Calls _runAutoCompaction("overflow", true) to summarize conversation history and reduce token count.
- Retries automatically – The willRetry: true flag causes the agent to re-issue the original prompt with the compacted context.
// Logic from agent-session.ts _checkCompaction method
if (sameModel && !errorIsFromBeforeCompaction && isContextOverflow(assistantMessage, contextWindow)) {
  // Drop the overflow error from history
  const messages = this.agent.state.messages;
  if (messages.length && messages.at(-1)!.role === "assistant") {
    this.agent.replaceMessages(messages.slice(0, -1));
  }
  await this._runAutoCompaction("overflow", true);
  return;
}
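The four recovery steps can also be exercised end to end with in-memory stand-ins. The sketch below is not the agent's implementation: SketchMessage, summarizeAllButLast and recoverFromOverflow are hypothetical names, and the summarizer is a placeholder where the real system would ask a model for a summary.

```typescript
type Role = "user" | "assistant";

interface SketchMessage {
  role: Role;
  text: string;
  model?: string; // which model produced an assistant message
}

interface RecoveryResult {
  messages: SketchMessage[];
  willRetry: boolean;
}

// Hypothetical summarizer: collapses everything except the latest message
// into a single placeholder summary message.
function summarizeAllButLast(msgs: SketchMessage[]): SketchMessage[] {
  if (msgs.length <= 1) return msgs;
  const summary: SketchMessage = {
    role: "user",
    text: `[summary of ${msgs.length - 1} earlier messages]`,
  };
  return [summary, msgs[msgs.length - 1]];
}

function recoverFromOverflow(
  messages: SketchMessage[],
  currentModel: string,
): RecoveryResult {
  const last = messages[messages.length - 1];
  // Step 1: only react to an assistant error produced by the current model.
  if (!last || last.role !== "assistant" || last.model !== currentModel) {
    return { messages, willRetry: false };
  }
  // Step 2: drop the failed assistant message so it is not resent.
  const withoutError = messages.slice(0, -1);
  // Step 3: shrink the remaining history.
  const compacted = summarizeAllButLast(withoutError);
  // Step 4: signal the caller to re-issue the prompt.
  return { messages: compacted, willRetry: true };
}

const sessionHistory: SketchMessage[] = [
  { role: "user", text: "first question" },
  { role: "assistant", text: "first answer", model: "model-a" },
  { role: "user", text: "second question" },
  { role: "assistant", text: "prompt is too long", model: "model-a" },
];

const recovered = recoverFromOverflow(sessionHistory, "model-a");
const skipped = recoverFromOverflow(sessionHistory, "model-b");
```

With the matching model, recovered holds two messages (the summary plus the pending user prompt) and willRetry is true. With a different current model, skipped leaves the history untouched and does not retry, which is the false-positive guard described in the first step.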
Manual Overflow Handling
If you are not using the coding agent, you can implement manual recovery using the compaction utilities:
import { isContextOverflow, compact, prepareCompaction } from "@mariozechner/pi-ai";

async function handlePotentialOverflow(session, model, apiKey) {
  const lastMessage = session.getLastAssistantMessage();
  // Detect overflow
  if (isContextOverflow(lastMessage, model.contextWindow)) {
    console.warn("Context overflow detected – initiating compaction");
    // Prepare compaction
    const pathEntries = session.getBranch();
    const prep = prepareCompaction(pathEntries, { sizeThreshold: 0.9 });
    if (prep) {
      // Execute compaction
      const result = await compact(prep, model, apiKey);
      session.applyCompaction(result);
      // Retry your request here with compacted history
    }
  }
}
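The retry step left open in the handler above is worth bounding, so that a request that still overflows after compaction does not loop forever. The following is a minimal sketch with hypothetical names (sendWithOverflowRecovery and its callback parameters). In real use, the overflowed callback would wrap isContextOverflow and compactHistory would wrap the compaction calls shown above.

```typescript
// Runs `send`, and while the result looks like an overflow, compacts and
// retries, at most `maxCompactions` times.
async function sendWithOverflowRecovery<T>(
  send: () => Promise<T>,
  overflowed: (result: T) => boolean,
  compactHistory: () => Promise<boolean>, // resolves false if nothing could be compacted
  maxCompactions = 1,
): Promise<T> {
  let result = await send();
  for (let i = 0; i < maxCompactions && overflowed(result); i++) {
    const compacted = await compactHistory();
    if (!compacted) break; // nothing left to shrink, so retrying cannot help
    result = await send();
  }
  return result;
}

// In-memory stand-ins: the "request" overflows while the history is above 100 tokens.
let historyTokens = 150;
let sendCount = 0;

const demo = sendWithOverflowRecovery(
  async () => {
    sendCount++;
    return { overflow: historyTokens > 100 };
  },
  (r) => r.overflow,
  async () => {
    historyTokens = 60; // pretend compaction shrank the history
    return true;
  },
);
```

In this run the first send overflows, one compaction brings the stand-in history under the limit, and the second send succeeds, so the request is sent exactly twice. If the result still overflows after the allowed compactions, it is returned as is for the caller to surface.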
Key Files and Functions for Overflow Handling
| File | Role | Key Exports |
|---|---|---|
| packages/ai/src/utils/overflow.ts | Detection logic | isContextOverflow, OVERFLOW_PATTERNS |
| packages/coding-agent/src/core/agent-session.ts | Auto-recovery | _checkCompaction, _runAutoCompaction |
| packages/ai/test/context-overflow.test.ts | Validation | Test suite for provider-specific patterns |
| packages/ai/src/types.ts | Type definitions | AssistantMessage, Model, usage fields |
Summary
- Detection: Use isContextOverflow from packages/ai/src/utils/overflow.ts to identify context window breaches via provider-specific regex patterns or silent usage-based detection.
- Automatic Recovery: The coding agent in agent-session.ts automatically strips error messages, triggers compaction via _runAutoCompaction, and retries requests when overflow is detected.
- Manual Implementation: Import detection and compaction utilities to build custom overflow handling workflows without the full agent framework.
- Extensibility: Add new provider patterns to OVERFLOW_PATTERNS in overflow.ts to support emerging LLM APIs.
Frequently Asked Questions
How does pi-ai detect context window overflow from different providers?
pi-ai maintains a comprehensive regex catalog called OVERFLOW_PATTERNS in packages/ai/src/utils/overflow.ts that includes patterns for Anthropic, OpenAI, Google Gemini, xAI, Groq, OpenRouter, and others. The isContextOverflow function tests error messages against these patterns while also checking for silent overflows by comparing usage statistics against the model's known context window.
What happens when the coding agent detects a context window overflow?
When the coding agent detects overflow in packages/coding-agent/src/core/agent-session.ts, it first removes the offending assistant message from the session history to prevent resending the problematic payload. Then it calls _runAutoCompaction with the "overflow" reason and willRetry: true, which summarizes the conversation history to reduce token count and automatically retries the original request with the compacted context.
Can I handle context window overflow manually without using the coding agent?
Yes, you can import isContextOverflow, prepareCompaction, and compact from the pi-ai package to build custom overflow handling. First detect overflow using isContextOverflow(message, model.contextWindow), then use prepareCompaction to identify which messages to summarize, and finally call compact with your model and API key to generate a condensed conversation history before retrying your request.
How do I add support for a new LLM provider that isn't in the default pattern list?
To support a new provider, locate the OVERFLOW_PATTERNS array in packages/ai/src/utils/overflow.ts and add a new regular expression that matches the provider's specific context window error message. For example, if the new provider returns "Token limit of 8192 exceeded", add /Token limit of \d+ exceeded/i to the array. The isContextOverflow function will automatically test against your new pattern on the next run.
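Before adding a pattern, it helps to check it against sample messages in isolation. The sketch below uses a local stand-in array (sketchPatterns, a hypothetical name) rather than the real OVERFLOW_PATTERNS, and the sample messages are invented.

```typescript
// Local stand-in for a pattern list; NOT the real OVERFLOW_PATTERNS contents.
const sketchPatterns: RegExp[] = [
  /Token limit of \d+ exceeded/i, // the example pattern from the answer above
];

// Mirrors the "some pattern matches" check used for error messages.
function matchesAnyPattern(errorMessage: string): boolean {
  return sketchPatterns.some((p) => p.test(errorMessage));
}

const hit = matchesAnyPattern("Token limit of 8192 exceeded");
const caseInsensitiveHit = matchesAnyPattern("token limit of 32768 exceeded");
const miss = matchesAnyPattern("Rate limit exceeded");
```

The first two messages match (the i flag makes the check case-insensitive, and \d+ accepts any limit), while the rate-limit message does not. Including a near-miss sample like this is a cheap guard against a pattern that is too broad.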