How Dexter Manages Context to Prevent Token Overflow in LLM Agents

Dexter prevents token overflow by estimating token usage with a character-based heuristic and pruning the oldest tool results when the context exceeds a 100,000 token threshold, retaining only the 5 most recent tool outputs.

The virattt/dexter repository implements an AI agent that executes many tool calls over long-running conversations. Rather than calling a tokenizer, it combines a lightweight size estimate with selective pruning of older tool results, keeping the LLM prompt within the model's context window while preserving the most recent information.

Token Estimation Strategy

Dexter estimates token consumption without calling an external tokenizer. In src/utils/tokens.ts, the estimateTokens() function applies a conservative heuristic based on average character-to-token ratios.

export function estimateTokens(text: string): number {
  return Math.ceil(text.length / 3.5);   // ≈ 3.5 characters per token
}

This approach provides a fast, synchronous calculation that overestimates slightly to ensure safety. The function itself only measures the string it is given; the caller concatenates the system prompt, user query, and tool results and passes the combined text in to get a total count before each LLM call.
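
To make the heuristic concrete, the following sketch repeats the function above and compares it with the common rule of thumb of roughly 4 characters per token for English text. The estimateAtFourChars helper and the sample strings are illustrative assumptions and are not part of Dexter.

```typescript
// Same heuristic as shown above: about 3.5 characters per token, rounded up.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 3.5);
}

// Hypothetical baseline for comparison only (not in Dexter): ~4 chars per token.
function estimateAtFourChars(text: string): number {
  return Math.ceil(text.length / 4);
}

const sample = "a".repeat(1_400); // 1,400 characters of placeholder text

console.log(estimateTokens(sample));        // 400
console.log(estimateAtFourChars(sample));   // 350
console.log(estimateTokens(""));            // 0
console.log(estimateTokens("hello world")); // 4 (11 / 3.5 ≈ 3.14, rounded up)
```

Dividing by a smaller number yields a larger estimate, so the 3.5 divisor errs on the high side relative to a 4-characters-per-token baseline. The true ratio depends on the tokenizer and on the content being measured.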

Context Thresholds and Constants

Dexter defines hard limits in src/utils/tokens.ts to determine when intervention is required.

  • CONTEXT_THRESHOLD = 100_000: The maximum estimated token count Dexter allows in the iteration prompt before it prunes. This sits well below the 200,000-token context window of models such as Claude 3.5 Sonnet, which leaves a margin for estimation error.
  • KEEP_TOOL_USES = 5: The number of most recent tool results to retain after a context clear operation.

These constants enable predictable memory usage and prevent the agent from crashing due to context window exhaustion during extended multi-step tasks.
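
A rough way to reason about these constants is to convert the token threshold back into characters. The sketch below assumes the 3.5 characters-per-token divisor from estimateTokens(); approxCharBudget and exceedsThreshold are hypothetical names used only for illustration.

```typescript
// Values as described in this article.
const CONTEXT_THRESHOLD = 100_000; // estimated tokens
const KEEP_TOOL_USES = 5;          // tool results kept after a clear

const CHARS_PER_TOKEN = 3.5;       // divisor used by the estimateTokens() heuristic

// Hypothetical helper value: how many characters fit under the threshold.
const approxCharBudget = CONTEXT_THRESHOLD * CHARS_PER_TOKEN;

// Hypothetical helper: mirrors the "estimate > threshold" comparison.
function exceedsThreshold(promptChars: number): boolean {
  return Math.ceil(promptChars / CHARS_PER_TOKEN) > CONTEXT_THRESHOLD;
}

console.log(approxCharBudget);                  // 350000
console.log(exceedsThreshold(350_000));         // false: exactly 100,000 estimated tokens
console.log(exceedsThreshold(350_001));         // true: rounds up to 100,001
console.log(approxCharBudget / KEEP_TOOL_USES); // 70000
```

The last figure is worth noting: if the five retained results average about 70,000 characters each, they reach the threshold on their own, so a single clear operation does not guarantee that the next estimate falls below it.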

The Context Management Loop

The core pruning logic resides in src/agent/agent.ts within the manageContextThreshold() method. After every tool execution, the agent evaluates the current context size against the threshold.

private *manageContextThreshold(ctx: RunContext): Generator<ContextClearedEvent, void> {
  const fullToolResults = ctx.scratchpad.getToolResults();
  const estimatedContextTokens = estimateTokens(this.systemPrompt + ctx.query + fullToolResults);

  if (estimatedContextTokens > CONTEXT_THRESHOLD) {
    const clearedCount = ctx.scratchpad.clearOldestToolResults(KEEP_TOOL_USES);
    if (clearedCount > 0) {
      yield { type: 'context_cleared', clearedCount, keptCount: KEEP_TOOL_USES };
    }
  }
}

When the estimate exceeds CONTEXT_THRESHOLD, the method triggers a clear operation and yields a context_cleared event containing the number of removed entries and the count retained.
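
The following standalone sketch shows how a caller might drain such a generator and react to the event. It re-creates the method as a free function and uses FakeScratchpad, a hypothetical in-memory stand-in that actually discards old results; it is not Dexter's Scratchpad class, and the prompt strings are made up.

```typescript
type ContextClearedEvent = {
  type: "context_cleared";
  clearedCount: number;
  keptCount: number;
};

const CONTEXT_THRESHOLD = 100_000;
const KEEP_TOOL_USES = 5;

function estimateTokens(text: string): number {
  return Math.ceil(text.length / 3.5);
}

// Hypothetical stand-in for the scratchpad (assumption, for illustration only).
class FakeScratchpad {
  private results: string[] = [];

  add(result: string): void {
    this.results.push(result);
  }

  getToolResults(): string {
    return this.results.join("\n");
  }

  // Drops all but the newest `keepCount` results and reports how many were dropped.
  clearOldestToolResults(keepCount: number): number {
    const toClear = Math.max(0, this.results.length - keepCount);
    this.results = this.results.slice(toClear);
    return toClear;
  }
}

function* manageContextThreshold(
  systemPrompt: string,
  query: string,
  scratchpad: FakeScratchpad
): Generator<ContextClearedEvent, void> {
  const estimated = estimateTokens(systemPrompt + query + scratchpad.getToolResults());
  if (estimated > CONTEXT_THRESHOLD) {
    const clearedCount = scratchpad.clearOldestToolResults(KEEP_TOOL_USES);
    if (clearedCount > 0) {
      yield { type: "context_cleared", clearedCount, keptCount: KEEP_TOOL_USES };
    }
  }
}

const pad = new FakeScratchpad();
for (let i = 0; i < 8; i++) {
  pad.add("x".repeat(50_000)); // eight large results, about 400,000 characters in total
}

// Collect every event the generator yields for this check.
const events = Array.from(
  manageContextThreshold("You are a research agent.", "Compare revenue.", pad)
);
for (const event of events) {
  console.log(`cleared ${event.clearedCount}, kept ${event.keptCount}`); // cleared 3, kept 5
}
```

Using a generator lets the check report what it did without knowing how the caller displays or logs it: when nothing is pruned, the generator simply yields no events.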

Selective Pruning Without Data Loss

Dexter implements non-destructive pruning in src/agent/scratchpad.ts. The Scratchpad class maintains an append-only log of tool results on disk while using an in-memory clearedToolIndices Set to track which entries to exclude from the LLM prompt.

The clearOldestToolResults() method performs the selective removal:

clearOldestToolResults(keepCount: number): number {
  const entries = this.readEntries();
  const toolResultIndices: number[] = [];
  // collect indices of non-cleared tool_result entries …
  const toClearCount = Math.max(0, toolResultIndices.length - keepCount);
  for (let i = 0; i < toClearCount; i++) {
    this.clearedToolIndices.add(toolResultIndices[i]);
  }
  return toClearCount;
}

This design ensures that full execution history persists in the .dexter/scratchpad/*.jsonl log files for debugging and final answer generation, while the LLM receives only the most recent, relevant context to prevent token overflow.
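
The index-set idea can be sketched in isolation. The example below is a minimal in-memory model under stated assumptions: a plain array stands in for the JSONL file, and the Entry shape, the "thinking" entry type, and the activeToolResults and fullHistoryLength helpers are hypothetical. Only the clearedToolIndices name and the keep-the-newest logic come from the excerpt above.

```typescript
// Hypothetical entry shape; Dexter's real log entries may carry more fields.
type Entry = { type: "tool_result" | "thinking"; content: string };

class ScratchpadSketch {
  private entries: Entry[] = [];                  // append-only log (stand-in for the file)
  private clearedToolIndices = new Set<number>(); // indices hidden from the prompt

  append(entry: Entry): void {
    this.entries.push(entry);
  }

  clearOldestToolResults(keepCount: number): number {
    // Collect indices of tool_result entries that have not been cleared yet.
    const toolResultIndices: number[] = [];
    this.entries.forEach((entry, index) => {
      if (entry.type === "tool_result" && !this.clearedToolIndices.has(index)) {
        toolResultIndices.push(index);
      }
    });
    const toClearCount = Math.max(0, toolResultIndices.length - keepCount);
    // Mark the oldest ones as cleared; the entries themselves are untouched.
    for (const index of toolResultIndices.slice(0, toClearCount)) {
      this.clearedToolIndices.add(index);
    }
    return toClearCount;
  }

  // What the prompt would see: tool results minus the cleared ones.
  activeToolResults(): string[] {
    return this.entries
      .filter((e, i) => e.type === "tool_result" && !this.clearedToolIndices.has(i))
      .map((e) => e.content);
  }

  // What stays available for debugging: every entry ever appended.
  fullHistoryLength(): number {
    return this.entries.length;
  }
}

const pad = new ScratchpadSketch();
for (let i = 1; i <= 7; i++) {
  pad.append({ type: "tool_result", content: `result ${i}` });
}

console.log(pad.clearOldestToolResults(5)); // 2
console.log(pad.activeToolResults());       // "result 3" through "result 7"
console.log(pad.fullHistoryLength());       // 7, nothing was deleted
console.log(pad.clearOldestToolResults(5)); // 0, already at five active results
```

Because clearing only adds indices to a Set, the operation is cheap and safe to repeat. If the Set lives only in memory, as the text above describes, the cleared state does not survive a restart, while the log itself does.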

Summary

  • Dexter estimates token usage using a fast character-based heuristic (text.length / 3.5) in src/utils/tokens.ts.
  • The agent enforces a CONTEXT_THRESHOLD of 100,000 tokens and retains only the 5 most recent tool results (KEEP_TOOL_USES) when limits are exceeded.
  • The manageContextThreshold() generator in src/agent/agent.ts evaluates context size after each tool execution and yields context_cleared events when pruning occurs.
  • Pruning is non-destructive: src/agent/scratchpad.ts uses a clearedToolIndices Set to filter old entries from the LLM prompt while preserving full history in the JSONL log files.

Frequently Asked Questions

How does Dexter estimate token count without using a tokenizer?

Dexter uses a heuristic function estimateTokens() in src/utils/tokens.ts that divides the total character count of the prompt by 3.5 and rounds up. This provides a fast, conservative estimate that slightly overcounts to ensure the context stays safely below the model's limit without the latency of calling an external tokenizer API.

What happens to old tool results after Dexter clears context?

Old tool results are excluded from the LLM prompt but remain stored in the append-only JSONL log files located in .dexter/scratchpad/. The Scratchpad class tracks cleared entries using an in-memory clearedToolIndices Set, ensuring the full execution history persists for debugging and final answer generation while keeping the active context window manageable.

Why does Dexter retain 5 tool results instead of clearing everything?

Dexter retains the 5 most recent tool results (KEEP_TOOL_USES = 5) to preserve the immediate conversational context and tool dependencies required for coherent reasoning. Clearing all history would cause the agent to lose track of recent actions and their outcomes, potentially breaking multi-step workflows that rely on sequential tool outputs.

Which LLM models does the 100,000 token threshold support?

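The CONTEXT_THRESHOLD = 100_000 constant in src/utils/tokens.ts is a fixed value rather than a per-model setting, so it suits any model whose context window comfortably exceeds 100,000 tokens. Anthropic's Claude 3 Opus and Claude 3.5 Sonnet, for example, accept 200,000 tokens. Because the estimate already includes the system prompt and user query, the headroom comes from the gap between the threshold and the model's actual limit, which absorbs estimation error and the tokens added by the next tool result.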
