# Understanding the Agent Loop Implementation from Scratch: A Minimal ReAct Architecture

> Learn the agent loop implementation from scratch with this minimal ReAct architecture. Explore thought generation, tool execution, and observation in this dependency-free Python ReAct pattern.

- Repository: [Rohit Ghumare/ai-engineering-from-scratch](https://github.com/rohitg00/ai-engineering-from-scratch)
- Tags: deep-dive
- Published: 2026-06-08

---

**The agent loop implementation in the `rohitg00/ai-engineering-from-scratch` repository is a dependency-free Python realization of the ReAct (Reason + Act) pattern, cycling through thought generation, tool execution, and observation recording until a termination condition is satisfied.**

The agent loop implementation found in **Phase 14, Lesson 01** of this curriculum demonstrates how modern AI agents function without relying on external frameworks. Built entirely with the Python standard library, this educational codebase exposes the same basic control flow that production toolkits such as the Claude Agent SDK and the OpenAI Agents SDK build on.

## The ReAct Control Flow

At its core, the agent loop implementation follows the **ReAct pattern**—a cyclic interplay between reasoning and acting. The loop maintains a complete transcript of every interaction in a `history` buffer, allowing the LLM to reference previous thoughts and observations when planning its next move.

The cycle proceeds as follows: the user provides a query, the LLM generates a **thought** explaining its strategy, emits an **action** (a tool call), receives an **observation** (the tool output), and then **reasons** again based on that new information. This continues until the LLM determines it has sufficient information to provide a final answer.

### Initialization and Turn Management

At the start of a run, `AgentLoop` appends the user query as a `Turn(kind="user")` to its `history` attribute, a `list[Turn]` that serves as the message buffer. Each subsequent iteration appends new turns representing thoughts, actions, and the final answer, so the LLM driver always sees the full context.
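To make the role of the buffer concrete, here is a minimal sketch of a growing `history` list being flattened into context for the driver. The two-field `Turn` and the `render_context` helper are illustrative assumptions, not the repository's actual definitions.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    kind: str      # "user", "thought", "action", or "final"
    content: str   # simplified: the real Turn may carry more fields


def render_context(history: list[Turn]) -> str:
    """Hypothetical helper: flatten the transcript, oldest turn first."""
    return "\n".join(f"{turn.kind}: {turn.content}" for turn in history)


# The buffer starts with the user query and only ever grows.
history: list[Turn] = [Turn(kind="user", content="What is 2 + 3?")]
history.append(Turn(kind="thought", content="add the two numbers"))
history.append(Turn(kind="action", content="calculator(2 + 3) -> 5"))

prompt = render_context(history)
print(prompt)
```

Because nothing is ever removed from the list, whatever the driver sees on turn N is a superset of what it saw on turn N-1.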

### The Execution Cycle

The `AgentLoop.run` method implements the core control logic with a hard limit on iterations:

1. **Iterate** up to `max_turns` times
2. Call `llm.respond(history)` to get the next move
3. If the response contains `kind="finish"`, record the final turn and return
4. Otherwise, record the thought, parse the action into a `ToolCall`, and dispatch it through the `ToolRegistry`
5. Record the observation returned by the tool as part of the action turn
6. Repeat until termination
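The steps above can be sketched as a single function. This is a simplified illustration under stated assumptions, not the repository's code: the `run_loop` name, the `tool`/`args` keys inside the action dict, and the plain-dict tool table are all hypothetical stand-ins for `AgentLoop.run`, `ToolCall`, and `ToolRegistry`.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Turn:
    kind: str                          # "user", "thought", "action", or "final"
    content: Any                       # text, or the action dict for tool calls
    observation: Optional[str] = None  # filled in only on action turns


def run_loop(query: str, respond: Callable[[list], dict],
             tools: dict, max_turns: int = 8) -> Optional[str]:
    """Drive thought -> action -> observation until finish or budget."""
    history = [Turn("user", query)]
    for _ in range(max_turns):                        # step 1: hard cap
        move = respond(history)                       # step 2: next move
        if move.get("kind") == "finish":              # step 3: stop condition
            history.append(Turn("final", move["answer"]))
            return move["answer"]
        history.append(Turn("thought", move["thought"]))            # step 4
        action = move["action"]
        observation = str(tools[action["tool"]](**action["args"]))  # dispatch
        history.append(Turn("action", action, observation))         # step 5
    return None                                       # budget exhausted


# Hypothetical scripted driver: one tool call, then finish.
def scripted(history: list) -> dict:
    if not any(turn.kind == "action" for turn in history):
        return {"thought": "add the numbers",
                "action": {"tool": "add", "args": {"a": 2, "b": 3}}}
    return {"kind": "finish", "answer": f"the sum is {history[-1].observation}"}


answer = run_loop("What is 2 + 3?", scripted, {"add": lambda a, b: a + b})
print(answer)
```

Returning `None` when the budget runs out is one possible convention; raising an exception or returning a status object would work equally well.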

## Core Components in [`main.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/14-agent-engineering/01-the-agent-loop/code/main.py)

The implementation lives in [`phases/14-agent-engineering/01-the-agent-loop/code/main.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/14-agent-engineering/01-the-agent-loop/code/main.py), which contains four primary components working in concert.

### AgentLoop Class

The `AgentLoop` class acts as the orchestrator. It maintains the `history` list and enforces the `max_turns` safety limit to prevent infinite loops. The `run` method contains the bounded loop that drives the ReAct cycle, checking for the `finish` stop condition after each LLM response.

### ToolRegistry and Dispatch

The `ToolRegistry` provides a mapping between tool names (strings the LLM outputs) and Python callables (actual functions). Its `dispatch` method executes the tool with provided arguments and returns an observation string. This design isolates tool execution from the control flow, enabling easy addition of new capabilities.
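A name-to-callable registry of this kind might look roughly like the sketch below. The `register` method name and the `dispatch(name, args)` signature are assumptions made for illustration; the repository's `ToolRegistry` may differ in both.

```python
from typing import Callable


class ToolRegistry:
    """Illustrative registry: tool name -> Python callable."""

    def __init__(self) -> None:
        self._tools: dict[str, Callable[..., object]] = {}

    def register(self, name: str, fn: Callable[..., object]) -> None:
        self._tools[name] = fn

    def dispatch(self, name: str, args: dict) -> str:
        # Look up the callable and always hand back a string observation.
        return str(self._tools[name](**args))


store: dict[str, str] = {}


def kv_set(key: str, value: str) -> str:
    store[key] = value
    return f"stored {key}"


registry = ToolRegistry()
registry.register("kv_set", kv_set)
registry.register("kv_get", lambda key: store.get(key, "<missing>"))

first = registry.dispatch("kv_set", {"key": "base", "value": "120"})
second = registry.dispatch("kv_get", {"key": "base"})
print(first, "|", second)
```

Adding a capability is then a single `register` call; the control loop never needs to know which tools exist.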

### ToyLLM Driver

`ToyLLM` is a deterministic script that simulates an LLM for testing purposes. It implements the critical `respond(history)` interface contract, returning a dictionary with either:
- `kind="finish"` and an answer, or
- A `thought` string and an `action` dict containing the tool name and arguments

Because the driver is swappable, you can replace `ToyLLM` with production APIs (Anthropic, OpenAI) without modifying the `AgentLoop` logic.
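As a rough illustration of that contract, a scripted driver can be as small as the class below. `ScriptedLLM` is a hypothetical stand-in rather than the repository's `ToyLLM`, and the `tool`/`args` keys inside the action dict are assumed.

```python
class ScriptedLLM:
    """Replays a fixed list of moves, one per call, then finishes."""

    def __init__(self, script: list[dict], answer: str) -> None:
        self._script = list(script)
        self._answer = answer
        self._cursor = 0

    def respond(self, history: list) -> dict:
        # Ignores history entirely: the output depends only on call count.
        if self._cursor < len(self._script):
            move = self._script[self._cursor]
            self._cursor += 1
            return move
        return {"kind": "finish", "answer": self._answer}


llm = ScriptedLLM(
    script=[{"thought": "compute 15% tax",
             "action": {"tool": "calculator", "args": {"expr": "120 * 0.15"}}}],
    answer="the tax is 18.0",
)
moves = [llm.respond([]), llm.respond([])]
print(moves)
```

Anything that exposes `respond(history)` and returns dictionaries of these two shapes can take its place.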

### Turn Data Structure

Each step in the conversation is captured as a `Turn` object with distinct kinds: `user`, `thought`, `action`, and `final`. The action turn specifically encapsulates both the tool call and its resulting observation, creating a complete audit trail.
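One plausible shape for such a record is sketched below. The field names (`text`, `call`, `observation`) and the separate `ToolCall` dataclass layout are assumptions for illustration; only the four `kind` values come from the lesson.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ToolCall:
    name: str
    args: dict


@dataclass
class Turn:
    kind: str                          # "user" | "thought" | "action" | "final"
    text: str = ""
    call: Optional[ToolCall] = None    # set only on action turns
    observation: Optional[str] = None  # tool output paired with the call


history = [
    Turn(kind="user", text="What is 15% of 120?"),
    Turn(kind="thought", text="compute 15% tax"),
    Turn(kind="action",
         call=ToolCall("calculator", {"expr": "120 * 0.15"}),
         observation="18.0"),
    Turn(kind="final", text="the tax is 18.0"),
]

# The audit trail: every tool call alongside what it returned.
audit = [(t.call.name, t.observation) for t in history if t.kind == "action"]

# Dataclasses serialize cleanly, so the transcript can be stored or replayed.
transcript_json = json.dumps([asdict(t) for t in history])
print(audit)
```

Keeping the call and its observation on the same turn means a reader never has to pair them up by position.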

## Running the Implementation

You can exercise the agent loop implementation offline using the provided demo builder:

```python
import sys

# The lesson directory contains hyphens and a leading digit, so it cannot be
# imported as a dotted package. Run this from the repository root and put the
# directory on sys.path instead.
sys.path.insert(0, "phases/14-agent-engineering/01-the-agent-loop/code")

from main import build_demo_agent, pretty_trace

# Build the demo agent (registers calculator and KV store tools)
agent = build_demo_agent()

# Ask a question that triggers a sequence of tool calls
final_answer = agent.run("What is 120 plus 15% tax, stored in kv?")

# Pretty-print the trace
pretty_trace(agent.history)

print("Final answer:", final_answer)
```

**Expected trace output:**

```
[00   user] What is 120 plus 15% tax, stored in kv?
[01  thought] store the base price
[02   action] kv_set({'key': 'base', 'value': '120'}) -> stored base
[03  thought] compute 15% tax
[04   action] calculator({'expr': '120 * 0.15'}) -> 18.0
[05  thought] store the tax
[06   action] kv_set({'key': 'tax', 'value': '18.0'}) -> stored tax
[07  thought] compute total
[08   action] calculator({'expr': '120 + 18.0'}) -> 138.0
[09  thought] confirm stored values
[10   action] kv_get({'key': 'base'}) -> 120
[11   final] the total including 15% tax is 138.0
```

## Extending the Architecture

The minimalist design of this agent loop implementation enables several extension paths:

- **LLM Swapping**: Replace `ToyLLM` with any client implementing the `respond` method to connect to Anthropic's Messages API, OpenAI's Chat Completions or Responses API, or local models.
- **Tool Addition**: Register new functions with `ToolRegistry` to give the agent capabilities like web search, file I/O, or database queries.
- **Parallel Execution**: Modify `AgentLoop.run` to handle multiple tool calls per turn by dispatching actions concurrently and aggregating observations.
- **Observability**: The `history` list is a complete, replayable transcript that can be serialized and exported to tracing systems such as OpenTelemetry, Langfuse, or Phoenix.
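The parallel execution path listed above could be approached along the following lines. This is a sketch only: `dispatch_parallel`, the plain-dict tool table, and the `tool`/`args` call shape are hypothetical, and the lesson's `AgentLoop.run` handles one action per turn.

```python
from concurrent.futures import ThreadPoolExecutor


def dispatch_parallel(tools: dict, calls: list[dict]) -> list[str]:
    """Run several tool calls concurrently; observations keep call order."""

    def run_one(call: dict) -> str:
        return str(tools[call["tool"]](**call["args"]))

    # max_workers must be positive, so guard against an empty call list.
    with ThreadPoolExecutor(max_workers=max(1, len(calls))) as pool:
        return list(pool.map(run_one, calls))  # map preserves input order


tools = {
    "add": lambda a, b: a + b,
    "upper": lambda text: text.upper(),
}
observations = dispatch_parallel(tools, [
    {"tool": "add", "args": {"a": 120, "b": 18}},
    {"tool": "upper", "args": {"text": "done"}},
])
print(observations)
```

Threads suit I/O-bound tools such as HTTP calls. Because `Executor.map` returns results in submission order, each observation can be matched back to its call by index.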

## Summary

- The agent loop implementation uses the **ReAct pattern** (Reason + Act) to cycle between LLM reasoning and tool execution until a stop condition is met.
- Located in [`phases/14-agent-engineering/01-the-agent-loop/code/main.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/14-agent-engineering/01-the-agent-loop/code/main.py), the `AgentLoop` class manages state through a `history` list of `Turn` objects and enforces safety via `max_turns`.
- The `ToolRegistry` maps tool names to Python callables, so capabilities can be added without touching the loop, while `ToyLLM` demonstrates the minimal interface contract required for the LLM driver.
- This pure-Python design keeps the agent control flow fully visible, which makes it well suited to debugging and education, and a useful reference point when reading production SDKs.

## Frequently Asked Questions

### What is the ReAct pattern in the context of this agent loop?

The **ReAct pattern** (Reason + Act) structures agent behavior as an alternating sequence of internal reasoning and external tool usage. In this implementation, the LLM generates a `thought` turn explaining its strategy, followed by an `action` turn containing a tool call. After observing the tool's output, the loop feeds that observation back to the LLM as context for the next reasoning step, creating a closed feedback cycle that continues until the problem is solved.

### How does the implementation prevent infinite execution?

The `AgentLoop` class enforces a **turn budget** through its `max_turns` parameter. Each iteration of the `run` method increments a counter; if the LLM fails to emit a `finish` turn before reaching this limit, the loop terminates with a "budget exhausted" status rather than running indefinitely. This safety mechanism is essential for production deployments where unbounded loops could consume excessive tokens or compute resources.

### Can I replace the ToyLLM with a production LLM API?

Yes. The `ToyLLM` class implements a simple `respond(history)` interface that returns a dictionary conforming to a specific contract: either `{"kind": "finish", "answer": ...}` or a structure containing `thought` and `action` keys. Any production client—whether for Anthropic, OpenAI, or local inference—can replace `ToyLLM` by implementing this same interface, allowing the `AgentLoop` control flow to remain completely unchanged when scaling from educational demos to production systems.
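One way to meet that contract with a real model is a thin adapter that asks for JSON and normalizes whatever comes back. The sketch below assumes a hypothetical `complete(prompt) -> str` callable in place of any specific vendor client, and the fallback policy (treat unparseable output as a final answer) is a design choice, not something the lesson prescribes.

```python
import json
from typing import Callable


class JsonLLMAdapter:
    """Wraps any `complete(prompt) -> str` callable behind `respond(history)`."""

    def __init__(self, complete: Callable[[str], str]) -> None:
        self._complete = complete

    def respond(self, history: list) -> dict:
        prompt = "\n".join(str(turn) for turn in history)
        raw = self._complete(prompt)
        try:
            move = json.loads(raw)
        except json.JSONDecodeError:
            move = None
        # Accept only the two shapes the loop understands.
        if isinstance(move, dict) and (
            move.get("kind") == "finish" or ("thought" in move and "action" in move)
        ):
            return move
        # Assumed fallback: anything else becomes the final answer.
        return {"kind": "finish", "answer": raw.strip()}


# Stand-in for a real model call, used here only to exercise the adapter.
fake_model = lambda prompt: (
    '{"thought": "compute tax", '
    '"action": {"tool": "calculator", "args": {"expr": "120 * 0.15"}}}'
)
move = JsonLLMAdapter(fake_model).respond(["user: What is 15% of 120?"])
fallback = JsonLLMAdapter(lambda prompt: "  I am done ").respond([])
print(move["thought"], "|", fallback)
```

In practice the `complete` callable would wrap a vendor SDK call. Validating the response shape at this boundary keeps malformed model output from reaching the loop.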

### Where are the tool definitions stored and how are they executed?

Tools are registered in the **`ToolRegistry`** class, which maintains a mapping from string names to Python callables. When the LLM outputs an action (e.g., `calculator`), the `AgentLoop` constructs a `ToolCall` object and passes it to `ToolRegistry.dispatch`, which looks up the function, executes it with the provided arguments, and returns the result as an observation string. This dispatch mechanism isolates tool execution logic from the agent's control flow, which gives a single place to add sandboxing and error handling.
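As an illustration of why a single dispatch point is convenient, the sketch below turns unknown tools and tool exceptions into observation strings the model can react to. `safe_dispatch`, the `ToolCall` field names, and the `error:` prefix are assumptions made for this example and are not taken from the repository.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    name: str
    args: dict = field(default_factory=dict)


def safe_dispatch(tools: dict, call: ToolCall) -> str:
    """Look up and run a tool, turning failures into observation strings."""
    fn = tools.get(call.name)
    if fn is None:
        return f"error: unknown tool '{call.name}'"
    try:
        return str(fn(**call.args))
    except Exception as exc:  # report tool failures instead of crashing the loop
        return f"error: {type(exc).__name__}: {exc}"


tools = {"divide": lambda a, b: a / b}
ok = safe_dispatch(tools, ToolCall("divide", {"a": 10, "b": 4}))
bad_tool = safe_dispatch(tools, ToolCall("search", {"q": "x"}))
bad_args = safe_dispatch(tools, ToolCall("divide", {"a": 1, "b": 0}))
print(ok, "|", bad_tool, "|", bad_args)
```

Feeding the error text back as an observation lets the model retry with different arguments or pick another tool, rather than ending the run on the first failure.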