How Dexter's Agent Core Handles Iterative Tool Execution
Dexter implements an "Anthropic-style" iterative loop that repeatedly prompts an LLM with accumulated tool results, executes requested tools asynchronously while streaming progress, and manages token budgets by pruning old scratchpad entries until a final answer is generated.
Dexter, an open-source financial data agent by virattt, implements an iterative tool execution architecture that enables multi-step reasoning over external data sources. The agent core orchestrates a repeating cycle of LLM prompting, tool execution, and context management. This design lets the agent break a complex question into discrete tool calls while keeping a transparent audit trail of every operation.
The Main Execution Loop
The iterative tool execution cycle runs in Agent.run() within src/agent/agent.ts. Before that, the static factory Agent.create() initializes the appropriate LLM and registers the available tools from src/tools/registry.ts.
// src/agent/agent.ts – start of the loop
while (ctx.iteration < this.maxIterations) {
  ctx.iteration++;
  // ① Call the LLM with the current prompt
  const { response, usage } = await this.callModel(currentPrompt);
  ctx.tokenCounter.add(usage);
  const responseText = typeof response === 'string' ? response : extractTextContent(response);
  // (loop body continues; the excerpt stops here)
Each run creates a RunContext via createRunContext(query) that encapsulates the original query, a Scratchpad instance for state persistence, a TokenCounter for budget tracking, and the iteration counter. The same context is reused across iterations, which is why the loop can increment ctx.iteration. When the LLM response contains tool calls (hasToolCalls(response)), the agent yields a thinking event and delegates execution to the tool executor:
// src/agent/agent.ts – tool‑call handling
yield* this.toolExecutor.executeAll(response, ctx);
yield* this.manageContextThreshold(ctx);
If the response contains no tool calls, the agent either returns a direct answer or proceeds to the final-answer stage via generateFinalAnswer(ctx).
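The control flow described above can be sketched as a small async generator. This is a simplified illustration, not Dexter's code: the types ToolCall, ModelReply, and LoopEvent, and the injected callModel and runTool functions, are hypothetical stand-ins, and a plain array takes the place of the scratchpad.

```typescript
// Hypothetical types: a sketch of the loop shape, not Dexter's real interfaces.
type ToolCall = { name: string; args: Record<string, unknown> };
type ModelReply = { text: string; toolCalls: ToolCall[] };
type LoopEvent =
  | { type: 'thinking'; message: string }
  | { type: 'tool_end'; tool: string; result: string }
  | { type: 'done'; answer: string; iterations: number };

type CallModel = (prompt: string) => Promise<ModelReply>;
type RunTool = (call: ToolCall) => Promise<string>;

async function* runLoop(
  query: string,
  callModel: CallModel,
  runTool: RunTool,
  maxIterations = 5,
): AsyncGenerator<LoopEvent> {
  const results: string[] = []; // stands in for the scratchpad
  let iteration = 0;
  while (iteration < maxIterations) {
    iteration++;
    // Re-prompt with the query plus everything gathered so far.
    const reply = await callModel([query, ...results].join('\n'));
    if (reply.toolCalls.length === 0) {
      // No tool calls: treat the reply text as the answer.
      yield { type: 'done', answer: reply.text, iterations: iteration };
      return;
    }
    if (reply.text) yield { type: 'thinking', message: reply.text };
    for (const call of reply.toolCalls) {
      const result = await runTool(call);
      results.push(`${call.name}: ${result}`);
      yield { type: 'tool_end', tool: call.name, result };
    }
  }
  // Iteration budget exhausted: answer from whatever was gathered.
  yield { type: 'done', answer: results.join('\n'), iterations: iteration };
}
```

Because the loop is a generator, the caller decides how to consume events (log them, render them, or ignore everything but the final one) without the loop knowing about the UI.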
Tool Execution and Streaming
The AgentToolExecutor class in src/agent/tool-executor.ts handles the execution of all requested tools. The executeAll() method iterates over the tool_calls array, skips skill invocations that have already run, and streams progress for each operation. As the excerpt shows, each call is delegated with yield*, so one call finishes before the next begins.
// src/agent/tool-executor.ts – iterating over tool calls
for (const toolCall of response.tool_calls!) {
  const toolName = toolCall.name;
  const toolArgs = toolCall.args as Record<string, unknown>;
  // Skip duplicate skill executions
  if (toolName === 'skill' && ctx.scratchpad.hasExecutedSkill(toolArgs.skill as string)) continue;
  yield* this.executeSingle(toolName, toolArgs, ctx);
}
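To make the deduplication rule concrete, here is a minimal sketch of the same check as a pure function. SkillLedger and filterDuplicateSkills are hypothetical names invented for this illustration; in Dexter the bookkeeping lives on the Scratchpad and its internals may differ.

```typescript
// Hypothetical shapes; Dexter's real tool-call and Scratchpad types may differ.
type ToolCall = { name: string; args: Record<string, unknown> };

class SkillLedger {
  private executed = new Set<string>();

  hasExecutedSkill(skill: string): boolean {
    return this.executed.has(skill);
  }

  markExecuted(skill: string): void {
    this.executed.add(skill);
  }
}

function filterDuplicateSkills(calls: ToolCall[], ledger: SkillLedger): ToolCall[] {
  const kept: ToolCall[] = [];
  for (const call of calls) {
    if (call.name === 'skill') {
      const skill = String(call.args.skill);
      if (ledger.hasExecutedSkill(skill)) continue; // already ran for this query
      ledger.markExecuted(skill);
    }
    kept.push(call); // non-skill tools are never filtered by this check
  }
  return kept;
}
```

Only the 'skill' tool is deduplicated here, mirroring the excerpt; repeated calls to ordinary tools are left to the separate rate-limiting check.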
The executeSingle() method implements several critical safeguards:
- Rate limiting: Checks Scratchpad.canCallTool() to enforce soft limits on repetitive calls
- Progress streaming: Uses a ProgressChannel to emit real-time updates for long-running operations
- Result persistence: Records successful outputs and formatted errors via Scratchpad.addToolResult()
// src/agent/tool-executor.ts – core execution flow
// (tool, config, channel, duration, and toolQuery are defined elsewhere in the method; not shown in this excerpt)
yield { type: 'tool_start', tool: toolName, args: toolArgs };
// Close the channel on success or failure so the progress loop below always ends
const toolPromise = tool.invoke(toolArgs, config).then(
  raw => { channel.close(); return raw; },
  err => { channel.close(); throw err; }
);
// Stream progress events
for await (const message of channel) {
  yield { type: 'tool_progress', tool: toolName, message };
}
const rawResult = await toolPromise;
const result = typeof rawResult === 'string' ? rawResult : JSON.stringify(rawResult);
yield { type: 'tool_end', tool: toolName, args: toolArgs, result, duration };
ctx.scratchpad.recordToolCall(toolName, toolQuery);
ctx.scratchpad.addToolResult(toolName, toolArgs, result);
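The excerpt relies on a channel that can be consumed with for await and that ends once close() is called. A minimal sketch of such a channel follows; this ProgressChannel is a hypothetical reimplementation for illustration (Dexter's own class may be built differently), and runWithProgress is an invented helper that reproduces the pattern from the excerpt.

```typescript
// Hypothetical minimal channel; Dexter's ProgressChannel may be implemented differently.
class ProgressChannel implements AsyncIterable<string> {
  private queue: string[] = [];
  private closed = false;
  private wake: (() => void) | null = null;

  emit(message: string): void {
    if (this.closed) return;
    this.queue.push(message);
    this.notify();
  }

  close(): void {
    this.closed = true;
    this.notify();
  }

  private notify(): void {
    const wake = this.wake;
    this.wake = null;
    if (wake) wake();
  }

  async *[Symbol.asyncIterator](): AsyncGenerator<string> {
    while (true) {
      // Drain everything that was buffered while the consumer was busy.
      while (this.queue.length > 0) yield this.queue.shift()!;
      if (this.closed) return;
      // Sleep until emit() or close() wakes the iterator.
      await new Promise<void>(resolve => { this.wake = resolve; });
    }
  }
}

type ToolEvent =
  | { type: 'tool_progress'; message: string }
  | { type: 'tool_end'; result: string };

async function* runWithProgress(
  tool: (channel: ProgressChannel) => Promise<string>,
): AsyncGenerator<ToolEvent> {
  const channel = new ProgressChannel();
  // Close the channel on success or failure so the loop below always terminates.
  const toolPromise = tool(channel).then(
    raw => { channel.close(); return raw; },
    err => { channel.close(); throw err; },
  );
  for await (const message of channel) yield { type: 'tool_progress', message };
  yield { type: 'tool_end', result: await toolPromise };
}
```

The key design point is that the tool starts running before the progress loop begins, so progress messages and the tool's own work overlap, while the final result is only awaited after the channel has closed.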
Persistent State with the Scratchpad
All intermediate results, reasoning steps, and tool outputs are appended to a JSONL (JSON Lines) log managed by the Scratchpad class in src/agent/scratchpad.ts. This append-only structure serves as the single source of truth for the agent's working memory.
// src/agent/scratchpad.ts – adding a tool result
addToolResult(toolName, args, result) {
  this.append({
    type: 'tool_result',
    timestamp: new Date().toISOString(),
    toolName,
    args,
    result: this.parseResultSafely(result),
  });
}
The scratchpad tracks call counts and query similarity so that canCallTool() can flag redundant operations. It also provides formatted context for subsequent LLM prompts via getToolResults(). When the context grows too large, clearOldestToolResults() drops the oldest results from the in-memory context and keeps the most recent N. The on-disk log is not rewritten, so it remains a complete audit trail.
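The split between an append-only log and a prunable working set can be sketched in memory as follows. MiniScratchpad is a hypothetical model written for this article: an array of strings stands in for the on-disk file, call counting is folded into addToolResult() for brevity, the soft limit of 3 is an assumed value, and query-similarity tracking is omitted.

```typescript
// Hypothetical in-memory model; file I/O is replaced by an array of JSON lines.
type ToolResultEntry = { toolName: string; result: string; cleared: boolean };

class MiniScratchpad {
  readonly logLines: string[] = []; // stands in for the on-disk log, never pruned
  private entries: ToolResultEntry[] = []; // working set used to build prompts
  private callCounts = new Map<string, number>();
  private softLimit: number;

  constructor(softLimit = 3) { // assumed default, not Dexter's actual value
    this.softLimit = softLimit;
  }

  canCallTool(toolName: string): { allowed: boolean; warning?: string } {
    const count = this.callCounts.get(toolName) ?? 0;
    // Soft limit: the call is still allowed, but the caller receives a warning.
    return count >= this.softLimit
      ? { allowed: true, warning: `${toolName} has already been called ${count} times` }
      : { allowed: true };
  }

  addToolResult(toolName: string, result: string): void {
    this.callCounts.set(toolName, (this.callCounts.get(toolName) ?? 0) + 1);
    this.entries.push({ toolName, result, cleared: false });
    // One JSON object per line, appended and never modified afterwards.
    this.logLines.push(JSON.stringify({ type: 'tool_result', toolName, result }));
  }

  clearOldestToolResults(keepCount: number): number {
    const active = this.entries.filter(e => !e.cleared);
    const toClear = active.slice(0, Math.max(0, active.length - keepCount));
    for (const entry of toClear) entry.cleared = true; // the log is left untouched
    return toClear.length;
  }

  getToolResults(): string {
    return this.entries
      .filter(e => !e.cleared)
      .map(e => `${e.toolName}: ${e.result}`)
      .join('\n');
  }
}
```

Marking entries as cleared instead of deleting log lines is what lets the prompt shrink while the audit trail stays complete.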
Token Budget and Context Management
After each batch of tool executions, Agent.manageContextThreshold() in src/agent/agent.ts estimates the total token count of the system prompt, user query, and concatenated tool results using utilities from src/utils/tokens.ts.
// src/agent/agent.ts – context‑size guard
const estimatedContextTokens = estimateTokens(this.systemPrompt + ctx.query + fullToolResults);
if (estimatedContextTokens > CONTEXT_THRESHOLD) {
  const clearedCount = ctx.scratchpad.clearOldestToolResults(KEEP_TOOL_USES);
  if (clearedCount > 0) yield { type: 'context_cleared', clearedCount, keptCount: KEEP_TOOL_USES };
}
When the estimated tokens exceed CONTEXT_THRESHOLD, the system discards the oldest tool results while retaining the most recent KEEP_TOOL_USES entries. This pruning prevents context window overflow while maintaining enough recent history for coherent reasoning, emitting a context_cleared event to notify downstream consumers.
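As a worked illustration of the threshold check, the sketch below uses a rough "about four characters per token" heuristic. That heuristic, the constant values, and the pruneIfOverBudget helper are all assumptions made for this example; the actual estimator in src/utils/tokens.ts and Dexter's real constants may differ.

```typescript
// Assumed heuristic: roughly 4 characters per token. Dexter's estimateTokens() may use another rule.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Hypothetical values chosen to keep the example small.
const CONTEXT_THRESHOLD = 100;
const KEEP_TOOL_USES = 2;

function pruneIfOverBudget(
  systemPrompt: string,
  query: string,
  toolResults: string[],
): { kept: string[]; clearedCount: number } {
  const estimated = estimateTokens(systemPrompt + query + toolResults.join(''));
  if (estimated <= CONTEXT_THRESHOLD) return { kept: toolResults, clearedCount: 0 };
  // Over budget: drop the oldest results, keep the newest KEEP_TOOL_USES.
  const clearedCount = Math.max(0, toolResults.length - KEEP_TOOL_USES);
  return { kept: toolResults.slice(clearedCount), clearedCount };
}
```

With three 200-character results, a 3-character system prompt, and a 1-character query, the estimate is ceil(604 / 4) = 151 tokens, which exceeds the assumed threshold of 100, so the oldest result is dropped and two are kept.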
Final Answer Synthesis
The loop terminates either when the LLM produces no tool calls or when the maxIterations limit is reached. At that point the agent constructs a final answer: generateFinalAnswer() builds a context object from the complete scratchpad history using buildFinalAnswerContext(), then prompts the LLM one final time without tool bindings:
// src/agent/agent.ts – final answer step
const fullContext = buildFinalAnswerContext(ctx.scratchpad);
const finalPrompt = buildFinalAnswerPrompt(ctx.query, fullContext);
const { response, usage } = await this.callModel(finalPrompt, false);
This final stage synthesizes all gathered evidence into a coherent response, returning the answer text along with complete metadata including all tool call records, iteration count, and token usage statistics.
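One way to picture the final prompt is as a plain string assembled from the recorded tool calls. The helpers below, sketchFinalContext and sketchFinalPrompt, are hypothetical and deliberately named differently from Dexter's buildFinalAnswerContext() and buildFinalAnswerPrompt(), whose real signatures and wording are not shown in this article.

```typescript
// Hypothetical record shape and prompt wording, for illustration only.
type ToolRecord = { toolName: string; args: Record<string, unknown>; result: string };

function sketchFinalContext(records: ToolRecord[]): string {
  // Number each record so the model can refer back to specific evidence.
  return records
    .map((r, i) => `[${i + 1}] ${r.toolName}(${JSON.stringify(r.args)})\n${r.result}`)
    .join('\n\n');
}

function sketchFinalPrompt(query: string, context: string): string {
  return [
    `Question: ${query}`,
    context ? `Data gathered:\n${context}` : 'No tool data was gathered.',
    'Answer the question using only the data above.',
  ].join('\n\n');
}
```

Since this last call carries no tool bindings, the model can only answer from the text it is given, which is why the context is built from the full history.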
Practical Implementation Example
The following example demonstrates how to instantiate and run Dexter's agent while consuming the full event stream:
import { Agent } from './src/agent/agent.js';

// Create an agent (auto‑detects model & tools)
const agent = Agent.create({ maxIterations: 8 });

// Run a query and log the streamed events
(async () => {
  for await (const ev of agent.run('What are the latest earnings for AAPL?')) {
    switch (ev.type) {
      case 'thinking':
        console.log('🤔', ev.message);
        break;
      case 'tool_start':
        console.log(`🔧 Starting ${ev.tool}`);
        break;
      case 'tool_progress':
        console.log(`⏳ ${ev.tool}: ${ev.message}`);
        break;
      case 'tool_end':
        console.log(`✅ ${ev.tool} completed in ${ev.duration} ms`);
        break;
      case 'context_cleared':
        console.log(`🧹 Cleared ${ev.clearedCount} old tool results`);
        break;
      case 'answer_start':
        console.log('\n--- Answer ---');
        break;
      case 'done':
        console.log(ev.answer);
        console.log('\nTool calls used:', ev.toolCalls);
        console.log('Iterations:', ev.iterations);
        console.log('Tokens used:', ev.tokenUsage);
        break;
    }
  }
})();
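The switch in the example above works well when the events form a discriminated union, because TypeScript can then narrow each case and flag any event type that is left unhandled. The AgentEvent type below is a sketch inferred from the fields used in that example; the exact shapes in Dexter's source are an assumption, and describeEvent is a hypothetical helper.

```typescript
// Field names mirror the example above; the exact event shapes are assumptions.
type AgentEvent =
  | { type: 'thinking'; message: string }
  | { type: 'tool_start'; tool: string }
  | { type: 'tool_progress'; tool: string; message: string }
  | { type: 'tool_end'; tool: string; duration: number }
  | { type: 'context_cleared'; clearedCount: number }
  | { type: 'answer_start' }
  | { type: 'done'; answer: string; iterations: number };

function describeEvent(ev: AgentEvent): string {
  switch (ev.type) {
    case 'thinking':
      return `thinking: ${ev.message}`;
    case 'tool_start':
      return `starting ${ev.tool}`;
    case 'tool_progress':
      return `${ev.tool}: ${ev.message}`;
    case 'tool_end':
      return `${ev.tool} finished in ${ev.duration} ms`;
    case 'context_cleared':
      return `cleared ${ev.clearedCount} old tool results`;
    case 'answer_start':
      return 'answer starting';
    case 'done':
      return `done after ${ev.iterations} iterations: ${ev.answer}`;
    default: {
      // Compile-time exhaustiveness check: adding a new event type breaks the build here.
      const unreachable: never = ev;
      return unreachable;
    }
  }
}
```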
Summary
- Anthropic-style loop: Dexter implements an iterative cycle in which the LLM alternates between reasoning and requesting tool calls until it produces a final answer or hits the iteration limit.
- Async tool execution: The AgentToolExecutor runs each requested tool while streaming progress events through ProgressChannel, with deduplication logic to prevent redundant skill invocations.
- Persistent scratchpad: All tool results are appended to a JSONL log by the Scratchpad class in src/agent/scratchpad.ts, providing a complete audit trail and formatted context for subsequent prompts.
- Dynamic context management: The agent monitors token budgets via manageContextThreshold() and prunes the oldest results when the estimate exceeds CONTEXT_THRESHOLD, keeping only the most recent KEEP_TOOL_USES entries.
- Structured event streaming: The entire execution lifecycle emits typed events (tool_start, tool_progress, context_cleared, etc.), enabling real-time UI updates and observability.
Frequently Asked Questions
How does Dexter prevent infinite tool execution loops?
Dexter enforces a hard limit through the maxIterations parameter passed to Agent.create(), which defaults to a conservative value. The main loop in src/agent/agent.ts increments ctx.iteration each cycle and terminates when this threshold is reached. Additionally, the Scratchpad class tracks per-tool call counts and query similarity, enabling soft warnings via canCallTool() to prevent repetitive identical calls.
What happens when the context window exceeds the token limit?
When manageContextThreshold() detects that the estimated tokens (system prompt + query + tool results) exceed CONTEXT_THRESHOLD, it invokes clearOldestToolResults(KEEP_TOOL_USES) on the scratchpad. This method removes the oldest tool results from the active context while preserving the most recent N entries defined by KEEP_TOOL_USES, emitting a context_cleared event to signal the pruning operation.
How does Dexter handle errors during tool execution?
The executeSingle() method in src/agent/tool-executor.ts wraps tool invocations in try-catch logic. When a tool throws an error, the catch block still records the attempt via Scratchpad.addToolResult() and stores a formatted error string as the result. This ensures that failed tool calls appear in the scratchpad history and final context, allowing the LLM to potentially recover or report the failure in the final answer.
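The record-on-failure behavior can be sketched as a small wrapper. invokeAndRecord and the Recorder callback are hypothetical names for this illustration, and the "Error: ..." string format is an assumption; Dexter's actual error formatting is not shown in this article.

```typescript
// Hypothetical helper: records either the tool output or a formatted error string.
type Recorder = (toolName: string, result: string) => void;

async function invokeAndRecord(
  toolName: string,
  invoke: () => Promise<unknown>,
  record: Recorder,
): Promise<string> {
  let result: string;
  try {
    const raw = await invoke();
    result = typeof raw === 'string' ? raw : JSON.stringify(raw);
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    result = `Error: ${message}`; // assumed format
  }
  record(toolName, result); // success and failure both land in the history
  return result;
}
```

Because the failure is stored as an ordinary result, the next prompt shows the model what went wrong, and it can retry with different arguments or mention the failure in its answer.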
Can Dexter execute multiple tools in parallel?
Not in the loop as excerpted in this article. The executeAll() method iterates over all tool_calls in the LLM response and delegates to executeSingle() with yield*, which runs each generator to completion before the next one starts. The tool calls in a single response therefore execute one after another, in the order the LLM listed them. The asynchrony is within a single call: tool.invoke() runs in the background while progress messages are streamed from the ProgressChannel. Sequential delegation also keeps the ordering of entries in the scratchpad log deterministic.