# How to Handle Context Window Overflow Errors in pi-ai: Detection and Recovery Guide

> Learn to handle context window overflow errors in pi-ai. Detect token limits using isContextOverflow and trigger auto-compaction to summarize history and retry requests.

- Repository: [Mario Zechner/pi-mono](https://github.com/badlogic/pi-mono)
- Tags: how-to-guide
- Published: 2026-02-16

---

**Handle context window overflow errors in pi-ai by using the `isContextOverflow` utility to detect provider-specific token limits, then trigger auto-compaction to summarize conversation history and retry the request.**

When you call large language models through the **pi-ai** package in the **badlogic/pi-mono** repository, long conversations eventually exceed a model's token capacity and produce **context window overflow errors**. The package provides a provider-agnostic check for these overflows, and the coding agent built on it recovers by compacting the conversation and retrying.

## Understanding Context Window Overflow Detection in pi-ai

The detection system resides in [`packages/ai/src/utils/overflow.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/ai/src/utils/overflow.ts) and examines assistant responses for explicit error signals or silent token overruns.

### How the `isContextOverflow` Function Works

The `isContextOverflow(message, contextWindow?)` function analyzes an `AssistantMessage` object to determine if a context window limit was breached. It returns `true` when either explicit error patterns match or when usage statistics exceed the known window size.

```typescript
// Simplified logic from packages/ai/src/utils/overflow.ts
export function isContextOverflow(message: AssistantMessage, contextWindow?: number): boolean {
  // Check for explicit error-based overflow
  if (message.stopReason === "error" && message.errorMessage) {
    if (OVERFLOW_PATTERNS.some(p => p.test(message.errorMessage!))) return true;
    if (/^4(00|13)\s*(status code)?\s*\(no body\)/i.test(message.errorMessage)) return true;
  }
  
  // Check for silent overflow using usage data
  if (contextWindow && message.stopReason === "stop") {
    const inputTokens = message.usage.input + message.usage.cacheRead;
    if (inputTokens > contextWindow) return true;
  }
  return false;
}

```
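To see what the second, status-only check catches, the regular expression can be exercised on its own. This is a minimal sketch: the regex is copied from the simplified snippet above, while the sample strings are invented inputs for illustration, not captured provider responses.

```typescript
// Regex copied from the simplified isContextOverflow logic above.
const NO_BODY_STATUS = /^4(00|13)\s*(status code)?\s*\(no body\)/i;

// Hypothetical error strings, made up for this example.
const samples: string[] = [
  "400 (no body)",
  "413 status code (no body)",
  "401 (no body)",        // different status code
  "Error: 400 (no body)", // prefix defeats the ^ anchor
];

// The regex has no global flag, so repeated test() calls are stateless.
const matches: boolean[] = samples.map((s) => NO_BODY_STATUS.test(s));
console.log(matches); // [ true, true, false, false ]
```

Because the pattern is anchored at the start of the string, it only fires when the whole error message is a bare 400 or 413 status with no body, which keeps unrelated errors that merely mention those codes from being treated as overflows.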

### Provider-Specific Error Pattern Catalog

The `OVERFLOW_PATTERNS` array in [`overflow.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/ai/src/utils/overflow.ts) contains regular expressions for the error messages of providers including Anthropic, OpenAI, Google Gemini, xAI, Groq, OpenRouter, llama.cpp, LM Studio, MiniMax, Kimi, and Cerebras. Because detection runs against this shared catalog, **context window overflow errors** from any listed provider are recognized through the same `isContextOverflow` call.

### Silent Overflow Detection

Some providers (such as z.ai) return `stopReason === "stop"` rather than an error when truncating input. By passing the `contextWindow` parameter to `isContextOverflow`, you can detect these silent overflows by comparing `usage.input + usage.cacheRead` against the model's known limit.
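The usage-based comparison can be sketched in isolation as follows. The `UsageSketch` and `MessageSketch` types and the `isSilentOverflow` name are stand-ins invented for this example (the real `AssistantMessage` type in `packages/ai/src/types.ts` carries more fields), and the token counts are made up.

```typescript
// Minimal local types for illustration only; not the pi-ai type definitions.
interface UsageSketch { input: number; cacheRead: number; }
interface MessageSketch { stopReason: string; usage: UsageSketch; }

// Mirrors the silent-overflow branch: only applies when a window size is
// known and the provider reported a normal "stop".
function isSilentOverflow(message: MessageSketch, contextWindow?: number): boolean {
  if (!contextWindow || message.stopReason !== "stop") return false;
  const inputTokens = message.usage.input + message.usage.cacheRead;
  return inputTokens > contextWindow;
}

const truncated: MessageSketch = { stopReason: "stop", usage: { input: 120000, cacheRead: 15000 } };
const normal: MessageSketch = { stopReason: "stop", usage: { input: 40000, cacheRead: 2000 } };

console.log(isSilentOverflow(truncated, 128000)); // true: 135000 > 128000
console.log(isSilentOverflow(normal, 128000));    // false: 42000 <= 128000
console.log(isSilentOverflow(truncated));         // false: no window given, check skipped
```

Note that cached tokens count toward the total: a request can stay under the limit on `usage.input` alone and still exceed the window once `usage.cacheRead` is added.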

## Recovering from Context Window Overflow Errors

Once an overflow is detected, the coding agent's session management system recovers from it automatically.

### Auto-Compaction in Agent Sessions

The [`packages/coding-agent/src/core/agent-session.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/coding-agent/src/core/agent-session.ts) file implements recovery logic in the `_checkCompaction` method. When `isContextOverflow` returns `true`, the system:

1. **Validates model consistency** – Ensures the error originated from the current model to avoid false positives when switching between models with different window sizes.
2. **Strips the error message** – Removes the offending assistant message from session history using `replaceMessages(messages.slice(0, -1))` to prevent resending the problematic payload.
3. **Triggers compaction** – Calls `_runAutoCompaction("overflow", true)` to summarize conversation history and reduce token count.
4. **Retries automatically** – Passing `true` as the second argument (the `willRetry` flag) causes the agent to re-issue the original prompt with the compacted context.

```typescript
// Logic from agent-session.ts _checkCompaction method
if (sameModel && !errorIsFromBeforeCompaction && isContextOverflow(assistantMessage, contextWindow)) {
  // Drop the overflow error from history
  const messages = this.agent.state.messages;
  if (messages.length && messages.at(-1)!.role === "assistant") {
    this.agent.replaceMessages(messages.slice(0, -1));
  }
  await this._runAutoCompaction("overflow", true);
  return;
}

```

### Manual Overflow Handling

If you are not using the coding agent, you can implement manual recovery using the compaction utilities:

```typescript
// Import path as given in this guide; confirm it against your installed package version.
import { isContextOverflow, compact, prepareCompaction } from "@mariozechner/pi-ai";

// `session` is assumed to expose getLastAssistantMessage, getBranch, and applyCompaction.
async function handlePotentialOverflow(session, model, apiKey) {
  const lastMessage = session.getLastAssistantMessage();
  
  // Detect overflow
  if (isContextOverflow(lastMessage, model.contextWindow)) {
    console.warn("Context overflow detected – initiating compaction");
    
    // Prepare compaction
    const pathEntries = session.getBranch();
    const prep = prepareCompaction(pathEntries, { sizeThreshold: 0.9 });
    
    if (prep) {
      // Execute compaction
      const result = await compact(prep, model, apiKey);
      session.applyCompaction(result);
      
      // Retry your request here with compacted history
    }
  }
}

```
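The "retry your request here" step can be made explicit with a small wrapper that sends, checks for overflow, compacts, and sends again. This is a sketch under stated assumptions: `Reply`, `OverflowHooks`, and `sendWithOverflowRetry` are hypothetical names, not pi-ai exports, and the hooks are fakes that only simulate a provider. In real code, `isOverflow` would wrap `isContextOverflow` and `compactHistory` would wrap your compaction step.

```typescript
// All names in this block are hypothetical and exist only for illustration.
interface Reply { stopReason: string; errorMessage?: string; }

interface OverflowHooks<M> {
  send: (history: M[]) => Promise<Reply>;          // issues the request
  isOverflow: (reply: Reply) => boolean;           // overflow detection
  compactHistory: (history: M[]) => Promise<M[]>;  // returns a shorter history
}

// Sends the history, compacting and retrying at most `maxCompactions` times.
async function sendWithOverflowRetry<M>(
  history: M[],
  hooks: OverflowHooks<M>,
  maxCompactions = 1,
): Promise<{ reply: Reply; history: M[] }> {
  let current = history;
  for (let attempt = 0; ; attempt++) {
    const reply = await hooks.send(current);
    if (!hooks.isOverflow(reply) || attempt >= maxCompactions) {
      return { reply, history: current };
    }
    current = await hooks.compactHistory(current);
  }
}

// Fake hooks: the simulated provider rejects histories longer than 3 entries.
const fakeHooks: OverflowHooks<string> = {
  send: async (h) =>
    h.length > 3
      ? { stopReason: "error", errorMessage: "prompt is too long" }
      : { stopReason: "stop" },
  isOverflow: (r) => r.stopReason === "error" && /too long/i.test(r.errorMessage ?? ""),
  compactHistory: async (h) => ["summary of earlier turns", ...h.slice(-2)],
};

const demo = sendWithOverflowRetry(["a", "b", "c", "d", "e"], fakeHooks);
demo.then(({ reply, history }) => console.log(reply.stopReason, history.length)); // stop 3
```

Capping the number of compaction attempts is a deliberate choice in this sketch: if compaction cannot shrink the history enough, the wrapper returns the overflow reply to the caller instead of looping indefinitely.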

## Key Files and Functions for Overflow Handling

| File | Role | Key Symbols |
|------|------|-------------|
| [`packages/ai/src/utils/overflow.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/ai/src/utils/overflow.ts) | Detection logic | `isContextOverflow`, `OVERFLOW_PATTERNS` |
| [`packages/coding-agent/src/core/agent-session.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/coding-agent/src/core/agent-session.ts) | Auto-recovery | `_checkCompaction`, `_runAutoCompaction` |
| [`packages/ai/test/context-overflow.test.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/ai/test/context-overflow.test.ts) | Validation | Test suite for provider-specific patterns |
| [`packages/ai/src/types.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/ai/src/types.ts) | Type definitions | `AssistantMessage`, `Model`, usage fields |

## Summary

- **Detection**: Use `isContextOverflow` from [`packages/ai/src/utils/overflow.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/ai/src/utils/overflow.ts) to identify context window breaches via provider-specific regex patterns or silent usage-based detection.
- **Automatic Recovery**: The coding agent in [`agent-session.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/coding-agent/src/core/agent-session.ts) automatically strips error messages, triggers compaction via `_runAutoCompaction`, and retries requests when overflow is detected.
- **Manual Implementation**: Import detection and compaction utilities to build custom overflow handling workflows without the full agent framework.
- **Extensibility**: Add new provider patterns to `OVERFLOW_PATTERNS` in [`overflow.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/ai/src/utils/overflow.ts) to support emerging LLM APIs.

## Frequently Asked Questions

### How does pi-ai detect context window overflow from different providers?

**pi-ai** maintains a comprehensive regex catalog called `OVERFLOW_PATTERNS` in [`packages/ai/src/utils/overflow.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/ai/src/utils/overflow.ts) that includes patterns for Anthropic, OpenAI, Google Gemini, xAI, Groq, OpenRouter, and others. The `isContextOverflow` function tests error messages against these patterns while also checking for silent overflows by comparing usage statistics against the model's known context window.

### What happens when the coding agent detects a context window overflow?

When the coding agent detects overflow in [`packages/coding-agent/src/core/agent-session.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/coding-agent/src/core/agent-session.ts), it first removes the offending assistant message from the session history to prevent resending the problematic payload. Then it calls `_runAutoCompaction` with the `"overflow"` reason and `willRetry: true`, which summarizes the conversation history to reduce token count and automatically retries the original request with the compacted context.

### Can I handle context window overflow manually without using the coding agent?

Yes, you can import `isContextOverflow`, `prepareCompaction`, and `compact` from the **pi-ai** package to build custom overflow handling. First detect overflow using `isContextOverflow(message, model.contextWindow)`, then use `prepareCompaction` to identify which messages to summarize, and finally call `compact` with your model and API key to generate a condensed conversation history before retrying your request.

### How do I add support for a new LLM provider that isn't in the default pattern list?

To support a new provider, locate the `OVERFLOW_PATTERNS` array in [`packages/ai/src/utils/overflow.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/ai/src/utils/overflow.ts) and add a new regular expression that matches the provider's specific context window error message. For example, if the new provider returns "Token limit of 8192 exceeded", add `/Token limit of \d+ exceeded/i` to the array. The `isContextOverflow` function will automatically test against your new pattern on the next run.
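Before adding a pattern to the catalog, it can help to check it against a few sample messages. The sketch below uses the candidate regex from the answer above; `patternsSketch` and `matchesOverflow` are hypothetical stand-ins for the real catalog and lookup, and the sample messages are invented.

```typescript
// Candidate pattern from the example above.
const candidate = /Token limit of \d+ exceeded/i;

// Stand-in for the real catalog in overflow.ts; contains only the candidate.
const patternsSketch: RegExp[] = [candidate];

// Same matching style as the catalog check: any pattern matching counts.
function matchesOverflow(errorMessage: string): boolean {
  return patternsSketch.some((p) => p.test(errorMessage));
}

console.log(matchesOverflow("Token limit of 8192 exceeded"));              // true
console.log(matchesOverflow("token limit of 32768 exceeded for model x")); // true
console.log(matchesOverflow("Rate limit exceeded"));                       // false
```

The second sample matches because the pattern is case-insensitive and unanchored. The third is the important negative case: a pattern that is too loose (for example, one matching any "limit exceeded" text) would misclassify rate-limit errors as context overflows and trigger unnecessary compaction.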