Understanding the Agent Loop Implementation from Scratch: A Minimal ReAct Architecture
The agent loop implementation in the rohitg00/ai-engineering-from-scratch repository is a dependency-free Python realization of the ReAct (Reason + Act) pattern, cycling through thought generation, tool execution, and observation recording until a termination condition is satisfied.
The agent loop implementation found in Phase 14, Lesson 01 of this curriculum demonstrates how modern AI agents function without relying on external frameworks. Built entirely with the Python standard library, this educational codebase reveals the fundamental control flow underlying production systems such as the Claude Agent SDK and the OpenAI Agents SDK.
The ReAct Control Flow
At its core, the agent loop implementation follows the ReAct pattern—a cyclic interplay between reasoning and acting. The loop maintains a complete transcript of every interaction in a history buffer, allowing the LLM to reference previous thoughts and observations when planning its next move.
The cycle proceeds as follows: the user provides a query, the LLM generates a thought explaining its strategy, emits an action (a tool call), receives an observation (the tool output), and then reasons again based on that new information. This continues until the LLM determines it has sufficient information to provide a final answer.
Initialization and Turn Management
The AgentLoop class initializes by appending the user query as a Turn(kind="user") to its history attribute—a list[Turn] that serves as the message buffer. Each subsequent iteration appends new turns representing thoughts, actions, and final answers, ensuring the LLM driver always has full context.
The Execution Cycle
The AgentLoop.run method implements the core control logic with a hard limit on iterations:
- Iterate up to max_turns times
- Call llm.respond(history) to get the next move
- If the response contains kind="finish", record the final turn and return
- Otherwise, record the thought, parse the action into a ToolCall, and dispatch it through the ToolRegistry
- Record the observation returned by the tool as part of the action turn
- Repeat until termination
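The steps above can be sketched in a few lines of standard-library Python. This is a simplified illustration, not the repository's code: the run_loop function, the Turn fields, and the {"tool": ..., "args": ...} shape of the action dict are assumptions made for the example, and a plain dict of callables stands in for the ToolRegistry.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional, Tuple


@dataclass
class Turn:
    kind: str                          # "user", "thought", "action", or "final"
    content: str = ""
    observation: Optional[str] = None  # only set on action turns


def run_loop(
    respond: Callable[[List[Turn]], Dict[str, Any]],
    tools: Dict[str, Callable[..., Any]],
    query: str,
    max_turns: int = 8,
) -> Tuple[Optional[str], List[Turn]]:
    """Hypothetical, simplified version of the loop described above."""
    history = [Turn(kind="user", content=query)]
    for _ in range(max_turns):
        response = respond(history)
        if response.get("kind") == "finish":
            history.append(Turn(kind="final", content=response["answer"]))
            return response["answer"], history
        history.append(Turn(kind="thought", content=response["thought"]))
        action = response["action"]  # assumed shape: {"tool": str, "args": dict}
        observation = str(tools[action["tool"]](**action["args"]))
        history.append(Turn("action", f'{action["tool"]}({action["args"]})', observation))
    return None, history  # budget used up without a finish response


# Scripted stand-in for the LLM: replays two canned responses in order.
script = iter([
    {"thought": "add the numbers", "action": {"tool": "add", "args": {"a": 2, "b": 3}}},
    {"kind": "finish", "answer": "the sum is 5"},
])
answer, history = run_loop(lambda h: next(script), {"add": lambda a, b: a + b}, "What is 2 + 3?")
print(answer)
print([t.kind for t in history])
```

The history ends up with one user turn, one thought, one action carrying its observation, and one final turn, which mirrors the turn sequence in the article's description.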
Core Components in main.py
The implementation lives in phases/14-agent-engineering/01-the-agent-loop/code/main.py, which contains four primary components working in concert.
AgentLoop Class
The AgentLoop class acts as the orchestrator. It maintains the history list and enforces the max_turns safety limit to prevent infinite loops. The run method contains the bounded loop that drives the ReAct cycle, checking for the finish condition after each LLM response.
ToolRegistry and Dispatch
The ToolRegistry provides a mapping between tool names (strings the LLM outputs) and Python callables (actual functions). Its dispatch method executes the tool with provided arguments and returns an observation string. This design isolates tool execution from the control flow, enabling easy addition of new capabilities.
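A name-to-callable registry of this kind might look like the sketch below. The dispatch method comes from the description above; the register method, the argument layout, and the two key-value tools are assumptions for illustration and may differ from the repository's version.

```python
from typing import Any, Callable, Dict


class ToolRegistry:
    """Illustrative registry: maps tool names to plain Python callables."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        # 'register' is a hypothetical method name chosen for this sketch.
        self._tools[name] = fn

    def dispatch(self, name: str, args: Dict[str, Any]) -> str:
        # Look up the callable and return its result as an observation string.
        return str(self._tools[name](**args))


store: Dict[str, str] = {}


def kv_set(key: str, value: str) -> str:
    store[key] = value
    return f"stored {key}"


def kv_get(key: str) -> str:
    return store.get(key, "<missing>")


registry = ToolRegistry()
registry.register("kv_set", kv_set)
registry.register("kv_get", kv_get)

first = registry.dispatch("kv_set", {"key": "base", "value": "120"})
second = registry.dispatch("kv_get", {"key": "base"})
print(first)
print(second)
```

Because the loop only ever sees a name and an argument dict, adding a capability means registering one more function; nothing in the control flow changes.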
ToyLLM Driver
ToyLLM is a deterministic script that simulates an LLM for testing purposes. It implements the critical respond(history) interface contract, returning a dictionary with either:
- kind="finish" and an answer, or
- A thought string and an action dict containing the tool name and arguments
Because the driver is swappable, you can replace ToyLLM with production APIs (Anthropic, OpenAI) without modifying the AgentLoop logic.
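One way to make that contract explicit is a typing.Protocol plus a small shape check. In the sketch below, LLMDriver, check_response, and CannedLLM are hypothetical names introduced for illustration, and the keys inside the action dict are assumed rather than taken from the repository.

```python
from typing import Any, Dict, List, Protocol


class LLMDriver(Protocol):
    """Anything with a matching respond() method satisfies the contract."""

    def respond(self, history: List[Any]) -> Dict[str, Any]: ...


def check_response(response: Dict[str, Any]) -> str:
    """Return "finish" or "step", raising ValueError on a malformed response."""
    if response.get("kind") == "finish":
        if "answer" not in response:
            raise ValueError("finish response is missing 'answer'")
        return "finish"
    if not isinstance(response.get("thought"), str):
        raise ValueError("step response needs a string 'thought'")
    if not isinstance(response.get("action"), dict):
        raise ValueError("step response needs a dict 'action'")
    return "step"


class CannedLLM:
    """Hypothetical driver that replays a fixed list of responses."""

    def __init__(self, responses: List[Dict[str, Any]]) -> None:
        self._responses = list(responses)
        self._cursor = 0

    def respond(self, history: List[Any]) -> Dict[str, Any]:
        response = self._responses[self._cursor]
        self._cursor += 1
        return response


driver: LLMDriver = CannedLLM([
    {"thought": "compute 15% tax",
     "action": {"tool": "calculator", "args": {"expr": "120 * 0.15"}}},
    {"kind": "finish", "answer": "the total including 15% tax is 138.0"},
])
shapes = [check_response(driver.respond([])) for _ in range(2)]
print(shapes)
```

A wrapper around a hosted model would need to translate that provider's output into one of these two shapes before handing it back to the loop.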
Turn Data Structure
Each step in the conversation is captured as a Turn object with distinct kinds: user, thought, action, and final. The action turn specifically encapsulates both the tool call and its resulting observation, creating a complete audit trail.
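As a rough sketch, such a record could be a dataclass in which only action turns populate the tool-related fields. The field names and the format_turn helper below are assumptions for illustration; the bracketed line format is modeled on the sample trace in this article, not copied from the repository's pretty_trace.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Turn:
    kind: str                                   # "user", "thought", "action", or "final"
    content: str = ""
    tool: Optional[str] = None                  # set on action turns only
    args: Dict[str, Any] = field(default_factory=dict)
    observation: Optional[str] = None           # tool output, action turns only


def format_turn(index: int, turn: Turn) -> str:
    """Render one turn as a single trace line (hypothetical helper)."""
    if turn.kind == "action":
        body = f"{turn.tool}({turn.args}) -> {turn.observation}"
    else:
        body = turn.content
    return f"[{index:02d} {turn.kind}] {body}"


history = [
    Turn("user", "What is 120 plus 15% tax, stored in kv?"),
    Turn("thought", "store the base price"),
    Turn("action", tool="kv_set", args={"key": "base", "value": "120"},
         observation="stored base"),
]
lines = [format_turn(i, t) for i, t in enumerate(history)]
print("\n".join(lines))
```

Keeping the call and its observation on the same record means each action line in a trace is self-explanatory without cross-referencing a separate result turn.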
Running the Implementation
You can exercise the agent loop implementation offline using the provided demo builder:
import sys

# The directory names contain hyphens, so main.py cannot be imported through a
# dotted package path; add its folder to sys.path instead (run from the repo root).
sys.path.insert(0, "phases/14-agent-engineering/01-the-agent-loop/code")
from main import build_demo_agent, pretty_trace

# Build the demo agent (registers calculator and KV store tools)
agent = build_demo_agent()

# Ask a question that triggers a sequence of tool calls
final_answer = agent.run("What is 120 plus 15% tax, stored in kv?")

# Pretty-print the trace
pretty_trace(agent.history)
print("Final answer:", final_answer)
Expected trace output:
[00 user] What is 120 plus 15% tax, stored in kv?
[01 thought] store the base price
[02 action] kv_set({'key': 'base', 'value': '120'}) -> stored base
[03 thought] compute 15% tax
[04 action] calculator({'expr': '120 * 0.15'}) -> 18.0
[05 thought] store the tax
[06 action] kv_set({'key': 'tax', 'value': '18.0'}) -> stored tax
[07 thought] compute total
[08 action] calculator({'expr': '120 + 18.0'}) -> 138.0
[09 thought] confirm stored values
[10 action] kv_get({'key': 'base'}) -> 120
[11 final] the total including 15% tax is 138.0
Extending the Architecture
The minimalist design of this agent loop implementation enables several extension paths:
- LLM Swapping: Replace ToyLLM with any client implementing the respond method to connect to Anthropic's Messages API, OpenAI's Chat Completions API, or local models.
- Tool Addition: Register new functions with ToolRegistry to give the agent capabilities like web search, file I/O, or database queries.
- Parallel Execution: Modify AgentLoop.run to handle multiple tool calls per turn by dispatching actions concurrently and aggregating observations.
- Observability: The history list serves as a complete, replayable transcript that can be exported to tracing systems such as OpenTelemetry, Langfuse, or Phoenix.
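For the parallel execution path, one possible approach uses a thread pool from the standard library. The dispatch_parallel helper and the {"tool": ..., "args": ...} call shape are hypothetical, and a plain dict stands in for the ToolRegistry; threads suit I/O-bound tools, while CPU-bound tools would need a different strategy.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List


def dispatch_parallel(
    tools: Dict[str, Callable[..., Any]],
    calls: List[Dict[str, Any]],
) -> List[str]:
    """Run several tool calls concurrently and return observations in call order."""
    if not calls:
        return []
    with ThreadPoolExecutor(max_workers=len(calls)) as pool:
        futures = [pool.submit(tools[c["tool"]], **c["args"]) for c in calls]
        # Collect in submission order so each observation lines up with its call.
        return [str(f.result()) for f in futures]


tools = {
    "word_count": lambda text: len(text.split()),
    "upper": lambda text: text.upper(),
}
calls = [
    {"tool": "word_count", "args": {"text": "reason then act"}},
    {"tool": "upper", "args": {"text": "react"}},
]
observations = dispatch_parallel(tools, calls)
print(observations)
```

Returning observations in the order the calls were issued keeps the transcript deterministic even when the tools finish at different times.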
Summary
- The agent loop implementation uses the ReAct pattern (Reason + Act) to cycle between LLM reasoning and tool execution until a stop condition is met.
- Located in phases/14-agent-engineering/01-the-agent-loop/code/main.py, the AgentLoop class manages state through a history list of Turn objects and enforces safety via max_turns.
- The ToolRegistry provides dependency injection for capabilities, while ToyLLM demonstrates the minimal interface contract required for the LLM driver.
- This pure-Python design offers complete transparency into agent control flow, making it well suited to debugging and education, and a useful starting point for understanding production SDKs.
Frequently Asked Questions
What is the ReAct pattern in the context of this agent loop?
The ReAct pattern (Reason + Act) structures agent behavior as an alternating sequence of internal reasoning and external tool usage. In this implementation, the LLM generates a thought turn explaining its strategy, followed by an action turn containing a tool call. After observing the tool's output, the loop feeds that observation back to the LLM as context for the next reasoning step, creating a closed feedback cycle that continues until the problem is solved.
How does the implementation prevent infinite execution?
The AgentLoop class enforces a turn budget through its max_turns parameter. Each iteration of the run method increments a counter; if the LLM fails to emit a finish turn before reaching this limit, the loop terminates with a "budget exhausted" status rather than running indefinitely. This safety mechanism is essential for production deployments where unbounded loops could consume excessive tokens or compute resources.
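The budget check can be illustrated with a stripped-down loop that skips tool execution entirely. The run_with_budget function and the dictionary it returns are assumptions for this sketch; the repository may surface the "budget exhausted" status differently.

```python
from typing import Any, Callable, Dict, List


def run_with_budget(
    respond: Callable[[List[Dict[str, Any]]], Dict[str, Any]],
    max_turns: int,
) -> Dict[str, Any]:
    """Call respond() at most max_turns times (tool execution omitted for brevity)."""
    history: List[Dict[str, Any]] = []
    for turn_index in range(max_turns):
        response = respond(history)
        history.append(response)
        if response.get("kind") == "finish":
            return {"status": "finished", "answer": response["answer"],
                    "turns_used": turn_index + 1}
    return {"status": "budget exhausted", "answer": None, "turns_used": max_turns}


def never_finishes(history: List[Dict[str, Any]]) -> Dict[str, Any]:
    # A misbehaving driver that keeps asking for tools forever.
    return {"thought": "keep looking", "action": {"tool": "noop", "args": {}}}


def finishes_second(history: List[Dict[str, Any]]) -> Dict[str, Any]:
    # Takes one step, then finishes on the following call.
    if history:
        return {"kind": "finish", "answer": "done"}
    return {"thought": "one step first", "action": {"tool": "noop", "args": {}}}


stuck = run_with_budget(never_finishes, max_turns=3)
ok = run_with_budget(finishes_second, max_turns=3)
print(stuck)
print(ok)
```

Using a for loop over range(max_turns) makes the upper bound structural: there is no code path that can call the driver more often than the budget allows.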
Can I replace the ToyLLM with a production LLM API?
Yes. The ToyLLM class implements a simple respond(history) interface that returns a dictionary conforming to a specific contract: either {"kind": "finish", "answer": ...} or a structure containing thought and action keys. Any production client—whether for Anthropic, OpenAI, or local inference—can replace ToyLLM by implementing this same interface, allowing the AgentLoop control flow to remain completely unchanged when scaling from educational demos to production systems.
Where are the tool definitions stored and how are they executed?
Tools are registered in the ToolRegistry class, which maintains a mapping from string names to Python callables. When the LLM outputs an action (e.g., calculator), the AgentLoop constructs a ToolCall object and passes it to ToolRegistry.dispatch, which looks up the function, executes it with the provided arguments, and returns the result as an observation string. This dispatch mechanism isolates tool execution logic from the agent's control flow, which makes dispatch the natural place to add sandboxing and error handling.
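Because every tool call passes through one dispatch point, failures can be converted into observations there instead of crashing the loop. The safe_dispatch function below is a hypothetical sketch of that idea, not the repository's dispatch method, and it uses a plain dict in place of the ToolRegistry.

```python
from typing import Any, Callable, Dict


def safe_dispatch(
    tools: Dict[str, Callable[..., Any]],
    name: str,
    args: Dict[str, Any],
) -> str:
    """Run a tool and always return a string, even when the call fails."""
    fn = tools.get(name)
    if fn is None:
        return f"error: unknown tool '{name}'"
    try:
        return str(fn(**args))
    except Exception as exc:  # surface the failure to the LLM as an observation
        return f"error: {type(exc).__name__}: {exc}"


tools = {"divide": lambda a, b: a / b}

good = safe_dispatch(tools, "divide", {"a": 6, "b": 3})
unknown = safe_dispatch(tools, "search", {"query": "react"})
failed = safe_dispatch(tools, "divide", {"a": 1, "b": 0})
bad_args = safe_dispatch(tools, "divide", {"a": 1})
print(good, unknown, failed, bad_args, sep="\n")
```

Returning the error text as an observation gives the model a chance to correct a misspelled tool name or a bad argument on its next turn.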