# How Dexter Manages Context to Prevent Token Overflow in LLM Agents

> Dexter prevents token overflow in LLM agents by using a character heuristic to estimate token usage and pruning old tool results to stay under a 100,000-token threshold.

- Repository: [Virat Singh/dexter](https://github.com/virattt/dexter)
- Tags: internals
- Published: 2026-02-16

---

**Dexter prevents token overflow by estimating token usage with a character-based heuristic and pruning the oldest tool results when the context exceeds a 100,000-token threshold, retaining only the 5 most recent tool outputs.**

The `virattt/dexter` repository implements an AI agent that executes multiple tool calls across long-running conversations. To manage context and prevent token overflow, Dexter employs a lightweight estimation strategy combined with selective pruning of historical tool results, ensuring the LLM prompt stays within the model's context window while preserving critical recent information.

## Token Estimation Strategy

Dexter estimates token consumption without calling an external tokenizer. In [`src/utils/tokens.ts`](https://github.com/virattt/dexter/blob/main/src/utils/tokens.ts), the `estimateTokens()` function applies a conservative heuristic based on average character-to-token ratios.

```typescript
export function estimateTokens(text: string): number {
  return Math.ceil(text.length / 3.5);   // ≈ 3.5 characters per token
}
```

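This approach provides a fast, synchronous calculation. Dividing by 3.5, rather than the commonly cited rule of thumb of about 4 characters per token for English text, biases the estimate upward, so the agent errs on the side of pruning early. `estimateTokens()` itself takes a single string; before each LLM call, the caller concatenates the **system prompt**, **user query**, and **tool results** and passes the combined text to the function to get a total count.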
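The heuristic can be exercised in isolation. The sketch below re-declares `estimateTokens()` as shown above and applies it to a concatenated prompt; the sample strings and the variable names `systemPrompt`, `userQuery`, and `toolOutput` are illustrative assumptions, not values taken from the repository.

```typescript
// Same heuristic as above: about 3.5 characters per token, rounded up.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 3.5);
}

// Hypothetical prompt parts, for illustration only.
const systemPrompt = 'You are a research agent.';
const userQuery = 'Summarize the latest filings.';
const toolOutput = 'x'.repeat(7_000); // stand-in for 7,000 characters of tool output

// The caller concatenates the parts and estimates the combined string.
const combined = systemPrompt + userQuery + toolOutput;
console.log(estimateTokens(combined));

// A divisor of 4 would give a lower count for the same text.
console.log(Math.ceil(combined.length / 4));
```

Assuming text that really tokenizes at about 4 characters per token, dividing by 3.5 inflates the estimate by roughly 14% (4 / 3.5 ≈ 1.14), which is where the safety margin comes from.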

## Context Thresholds and Constants

Dexter defines hard limits in [`src/utils/tokens.ts`](https://github.com/virattt/dexter/blob/main/src/utils/tokens.ts) to determine when intervention is required.

- **`CONTEXT_THRESHOLD = 100_000`**: The maximum estimated token count Dexter allows in the iteration prompt before it prunes. This is a pruning trigger, not a model limit: it sits well below the 200,000-token context window of Claude models such as Claude 3.5 Sonnet and Claude 3 Opus.
- **`KEEP_TOOL_USES = 5`**: The number of most recent tool results to retain after a context clear operation.

These constants keep the prompt size predictable and prevent LLM calls from failing due to context window exhaustion during extended multi-step tasks.
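Because the estimate depends only on character count, the threshold translates directly into a character budget. The sketch below works through that arithmetic; `CHARS_PER_TOKEN`, `maxCharsUnderThreshold()`, and `exceedsThreshold()` are hypothetical names introduced for illustration and are not claimed to exist in the repository.

```typescript
const CONTEXT_THRESHOLD = 100_000; // value quoted above
const CHARS_PER_TOKEN = 3.5;       // ratio used by the estimation heuristic

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Hypothetical helper: the largest character count whose estimate
// does not exceed the threshold.
function maxCharsUnderThreshold(): number {
  return Math.floor(CONTEXT_THRESHOLD * CHARS_PER_TOKEN);
}

// Hypothetical helper: assumes the check is a strict greater-than comparison.
function exceedsThreshold(text: string): boolean {
  return estimateTokens(text) > CONTEXT_THRESHOLD;
}

console.log(maxCharsUnderThreshold());             // 350000
console.log(exceedsThreshold('x'.repeat(350_000))); // false
console.log(exceedsThreshold('x'.repeat(350_001))); // true
```

In other words, under these constants pruning starts once the combined prompt text grows past about 350,000 characters.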

## The Context Management Loop

The core pruning logic resides in [`src/agent/agent.ts`](https://github.com/virattt/dexter/blob/main/src/agent/agent.ts) within the `manageContextThreshold()` method. After every tool execution, the agent evaluates the current context size against the threshold.

```typescript
private *manageContextThreshold(ctx: RunContext): Generator<ContextClearedEvent, void> {
  const fullToolResults = ctx.scratchpad.getToolResults();
  const estimatedContextTokens = estimateTokens(this.systemPrompt + ctx.query + fullToolResults);

  if (estimatedContextTokens > CONTEXT_THRESHOLD) {
    const clearedCount = ctx.scratchpad.clearOldestToolResults(KEEP_TOOL_USES);
    if (clearedCount > 0) {
      yield { type: 'context_cleared', clearedCount, keptCount: KEEP_TOOL_USES };
    }
  }
}
```

When the estimate exceeds `CONTEXT_THRESHOLD`, the method triggers a clear operation and yields a `context_cleared` event containing the number of removed entries and the count retained.
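Since the method is a generator, the caller decides what to do with each event. The following sketch shows one way a run loop might delegate to it with `yield*`. It is a simplified stand-alone version: `FakeScratchpad`, `runSteps()`, the free-function form of `manageContextThreshold()`, and the deliberately tiny constants are all assumptions made so the example runs on its own.

```typescript
type ContextClearedEvent = { type: 'context_cleared'; clearedCount: number; keptCount: number };

const CONTEXT_THRESHOLD = 10; // deliberately tiny so the demo triggers a clear
const KEEP_TOOL_USES = 2;

const estimateTokens = (text: string): number => Math.ceil(text.length / 3.5);

// Hypothetical in-memory stand-in. Unlike the real Scratchpad, it drops entries outright.
class FakeScratchpad {
  private results: string[] = [];
  add(result: string): void { this.results.push(result); }
  getToolResults(): string { return this.results.join('\n'); }
  clearOldestToolResults(keepCount: number): number {
    const toClear = Math.max(0, this.results.length - keepCount);
    this.results = this.results.slice(toClear);
    return toClear;
  }
}

function* manageContextThreshold(pad: FakeScratchpad, prompt: string): Generator<ContextClearedEvent, void> {
  const estimated = estimateTokens(prompt + pad.getToolResults());
  if (estimated > CONTEXT_THRESHOLD) {
    const clearedCount = pad.clearOldestToolResults(KEEP_TOOL_USES);
    if (clearedCount > 0) {
      yield { type: 'context_cleared', clearedCount, keptCount: KEEP_TOOL_USES };
    }
  }
}

// Hypothetical driver: record one tool result per step, then check the context size.
function* runSteps(pad: FakeScratchpad, outputs: string[]): Generator<ContextClearedEvent, void> {
  for (const output of outputs) {
    pad.add(output);
    yield* manageContextThreshold(pad, 'query');
  }
}

const pad = new FakeScratchpad();
const events = Array.from(runSteps(pad, ['aaaaaaaaaa', 'bbbbbbbbbb', 'cccccccccc', 'dddddddddd']));
console.log(events.length, pad.getToolResults());
```

Delegating with `yield*` lets the outer loop surface `context_cleared` events to whatever consumes the agent's event stream (a UI or a logger, for example) without the pruning code knowing about that consumer.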

## Selective Pruning Without Data Loss

Dexter implements non-destructive pruning in [`src/agent/scratchpad.ts`](https://github.com/virattt/dexter/blob/main/src/agent/scratchpad.ts). The `Scratchpad` class maintains an append-only log of tool results on disk while using an in-memory `clearedToolIndices` Set to track which entries to exclude from the LLM prompt.

The `clearOldestToolResults()` method performs the selective removal:

```typescript
clearOldestToolResults(keepCount: number): number {
  const entries = this.readEntries();
  const toolResultIndices: number[] = [];
  // collect indices of non-cleared tool_result entries …
  const toClearCount = Math.max(0, toolResultIndices.length - keepCount);
  for (let i = 0; i < toClearCount; i++) {
    this.clearedToolIndices.add(toolResultIndices[i]);
  }
  return toClearCount;
}
```

This design ensures that **full execution history persists** in the `.dexter/scratchpad/*.jsonl` log files for debugging and final answer generation, while the LLM receives only the most recent, relevant context to prevent token overflow.
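The separation between the stored log and the prompt view can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the repository's implementation: an in-memory array stands in for the JSONL file, and the `Entry` shape, the `SketchScratchpad` class name, `append()`, and the filtering inside `getToolResults()` are all hypothetical.

```typescript
// Hypothetical entry shape; the real JSONL schema may differ.
type Entry = { type: 'tool_result' | 'thinking'; content: string };

class SketchScratchpad {
  // Stand-in for the append-only JSONL file: entries are only ever pushed.
  private log: Entry[] = [];
  // Indices into `log` that should be hidden from the prompt.
  private clearedToolIndices = new Set<number>();

  append(entry: Entry): void { this.log.push(entry); }

  readEntries(): Entry[] { return this.log; }

  clearOldestToolResults(keepCount: number): number {
    // Collect indices of tool_result entries that have not been cleared yet.
    const toolResultIndices: number[] = [];
    this.readEntries().forEach((entry, index) => {
      if (entry.type === 'tool_result' && !this.clearedToolIndices.has(index)) {
        toolResultIndices.push(index);
      }
    });
    // Mark everything except the newest `keepCount` as cleared.
    const toClearCount = Math.max(0, toolResultIndices.length - keepCount);
    for (let i = 0; i < toClearCount; i++) {
      this.clearedToolIndices.add(toolResultIndices[i]);
    }
    return toClearCount;
  }

  // Prompt view: tool results minus the cleared ones.
  getToolResults(): string {
    return this.readEntries()
      .filter((entry, index) => entry.type === 'tool_result' && !this.clearedToolIndices.has(index))
      .map((entry) => entry.content)
      .join('\n');
  }
}

const pad = new SketchScratchpad();
['r1', 'r2', 'r3'].forEach((content) => pad.append({ type: 'tool_result', content }));
const cleared = pad.clearOldestToolResults(2);
console.log(cleared, pad.getToolResults(), pad.readEntries().length);
```

After the call, the prompt view contains only the two newest results while the log still holds all three entries, which is the property that makes the pruning non-destructive.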

## Summary

- Dexter estimates token usage using a fast character-based heuristic (`text.length / 3.5`) in [`src/utils/tokens.ts`](https://github.com/virattt/dexter/blob/main/src/utils/tokens.ts).
- The agent enforces a `CONTEXT_THRESHOLD` of 100,000 tokens and retains only the 5 most recent tool results (`KEEP_TOOL_USES`) when limits are exceeded.
- The `manageContextThreshold()` generator in [`src/agent/agent.ts`](https://github.com/virattt/dexter/blob/main/src/agent/agent.ts) evaluates context size after each tool execution and yields `context_cleared` events when pruning occurs.
- Pruning is non-destructive: [`src/agent/scratchpad.ts`](https://github.com/virattt/dexter/blob/main/src/agent/scratchpad.ts) uses a `clearedToolIndices` Set to filter old entries from the LLM prompt while preserving full history in the JSONL log files.

## Frequently Asked Questions

### How does Dexter estimate token count without using a tokenizer?

Dexter uses a heuristic function `estimateTokens()` in [`src/utils/tokens.ts`](https://github.com/virattt/dexter/blob/main/src/utils/tokens.ts) that divides the total character count of the prompt by 3.5 and rounds up. This provides a fast, conservative estimate that slightly overcounts to ensure the context stays safely below the model's limit without the latency of calling an external tokenizer API.

### What happens to old tool results after Dexter clears context?

Old tool results are excluded from the LLM prompt but remain stored in the append-only JSONL log files located in `.dexter/scratchpad/`. The `Scratchpad` class tracks cleared entries using an in-memory `clearedToolIndices` Set, ensuring the full execution history persists for debugging and final answer generation while keeping the active context window manageable.

### Why does Dexter retain 5 tool results instead of clearing everything?

Dexter retains the 5 most recent tool results (`KEEP_TOOL_USES = 5`) to preserve the immediate conversational context and tool dependencies required for coherent reasoning. Clearing all history would cause the agent to lose track of recent actions and their outcomes, potentially breaking multi-step workflows that rely on sequential tool outputs.

### Which LLM models does the 100,000 token threshold support?

The `CONTEXT_THRESHOLD = 100_000` constant in [`src/utils/tokens.ts`](https://github.com/virattt/dexter/blob/main/src/utils/tokens.ts) is a fixed pruning trigger rather than a per-model setting. It sits well below the 200,000-token context window of Anthropic's Claude models such as Claude 3.5 Sonnet and Claude 3 Opus, so Dexter operates safely within the limits of these widely used models. Because the estimate already includes the system prompt and user query, the remaining headroom absorbs estimation error and leaves room for the model's response.