# How to Build an Agent Loop from Scratch in Python: A Complete ReAct Implementation

> Build a production-ready ReAct agent loop from scratch in Python. Combine message history, tool registry, and turn-based control for robust AI applications.

- Repository: [Rohit Ghumare/ai-engineering-from-scratch](https://github.com/rohitg00/ai-engineering-from-scratch)
- Tags: how-to-guide
- Published: 2026-05-21

---

**You can build a production-ready ReAct agent loop in pure Python by combining a message history buffer, a tool registry, and a turn-based control flow that iterates until a stop condition is met.**

The repository `rohitg00/ai-engineering-from-scratch` provides a minimal, self-contained implementation that shows how to build an agent loop from scratch in Python using only the standard library. This ReAct-style architecture separates dialogue management, tool execution, and LLM integration into modular components that you can extend for production use.

## Core Components of the Agent Loop

The complete implementation lives in [`phases/14-agent-engineering/01-the-agent-loop/code/main.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/14-agent-engineering/01-the-agent-loop/code/main.py) and consists of four essential building blocks: a message buffer to store dialogue history, a tool registry to manage external capabilities, a stop condition to terminate execution, and a turn budget to prevent infinite loops.

### Message Buffer and Turn Tracking

The agent maintains state using a `history` list that stores instances of the `Turn` dataclass. Each `Turn` records the kind of interaction (`user`, `thought`, `action`, or `final`), the text content, and optional metadata including a `ToolCall` object and observation results. This structure, defined in the `AgentLoop` class (lines 97-103), enables the loop to maintain context across multiple reasoning steps.
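The history records described above can be sketched as two small dataclasses. This is an illustrative reconstruction, not the repository's exact code: the field names (`kind`, `content`, `tool_call`, `observation`, `name`, `args`) match how `Turn` and `ToolCall` are used in `run`, while the default values and type annotations are assumptions.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ToolCall:
    """A request to run one named tool with keyword arguments."""
    name: str
    args: dict[str, Any] = field(default_factory=dict)  # assumed default


@dataclass
class Turn:
    """One entry in the agent's history buffer."""
    kind: str                              # "user", "thought", "action", or "final"
    content: str
    tool_call: Optional[ToolCall] = None   # set only on "action" turns
    observation: Optional[str] = None      # tool result, set only on "action" turns


# A short history covering every kind of turn
history: list[Turn] = [
    Turn(kind="user", content="What is 2 + 2?"),
    Turn(kind="thought", content="I should use the calculator."),
    Turn(kind="action", content="calc",
         tool_call=ToolCall(name="calc", args={"expr": "2 + 2"}),
         observation="4"),
    Turn(kind="final", content="4"),
]
kinds = [turn.kind for turn in history]
```

Keeping the observation on the same record as the tool call means a single `action` turn holds both the request and its result.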

### Tool Registry for External Capabilities

The `ToolRegistry` class (lines 34-44) maps string names to callable functions using a dictionary:

```python
from typing import Callable


class ToolRegistry:
    def __init__(self) -> None:
        self._tools: dict[str, Callable[..., str]] = {}

    def register(self, name: str, fn: Callable[..., str]) -> None:
        # Registering an existing name overwrites the previous callable
        self._tools[name] = fn
```

Any callable registered via `register()` must accept keyword arguments and return a string. The `dispatch` method handles execution and normalizes errors to strings prefixed with `"error:"`, ensuring the loop remains stable even when tools fail.
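The error-normalizing behavior of `dispatch` might look roughly like the sketch below. The article does not show the method body, so the exact error messages, the handling of unknown tools, and the `names` helper are assumptions; only the `"error:"` prefix and the kwargs-in, string-out contract come from the description above.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class ToolCall:
    name: str
    args: dict[str, Any] = field(default_factory=dict)


class ToolRegistry:
    def __init__(self) -> None:
        self._tools: dict[str, Callable[..., str]] = {}

    def register(self, name: str, fn: Callable[..., str]) -> None:
        self._tools[name] = fn

    def names(self) -> list[str]:
        return sorted(self._tools)

    def dispatch(self, call: ToolCall) -> str:
        # Assumed behavior: unknown tools and raised exceptions both
        # become "error: ..." strings instead of propagating to the loop.
        fn = self._tools.get(call.name)
        if fn is None:
            return f"error: unknown tool {call.name!r}"
        try:
            return str(fn(**call.args))
        except Exception as exc:
            return f"error: {exc}"


registry = ToolRegistry()
registry.register("divide", lambda a, b: str(a / b))

ok = registry.dispatch(ToolCall("divide", {"a": 6, "b": 3}))
failed = registry.dispatch(ToolCall("divide", {"a": 1, "b": 0}))
missing = registry.dispatch(ToolCall("nope"))
```

Returning the error as an observation lets the model see the failure on its next turn and choose a different action.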

### Turn Budget and Termination Logic

The `AgentLoop` class accepts a `max_turns` parameter (lines 99-103) that caps the number of iterations. The loop terminates when either the turn budget is exhausted or the underlying LLM returns a response with `kind="finish"`. This dual protection prevents infinite execution while allowing the model to signal completion naturally.

## Implementing the ReAct Loop Core

The `AgentLoop.run` method (lines 104-121 in [`main.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/14-agent-engineering/01-the-agent-loop/code/main.py)) implements the actual control flow:

```python
def run(self, user_message: str) -> str:
    self.history.append(Turn(kind="user", content=user_message))
    for step in range(self.max_turns):
        reply = self.llm.respond(self.history)
        if reply["kind"] == "finish":
            self.history.append(Turn(kind="final", content=reply["content"]))
            return reply["content"]
        # record the model's thought
        self.history.append(Turn(kind="thought", content=reply.get("thought", "")))
        # invoke the requested tool
        call = ToolCall(name=reply["action"], args=reply.get("args", {}))
        observation = self.tools.dispatch(call)
        self.history.append(
            Turn(kind="action", content=call.name,
                 tool_call=call, observation=observation)
        )
    self.history.append(Turn(kind="final", content="budget exhausted"))
    return "budget exhausted"
```

Each iteration follows the ReAct pattern: the LLM generates a **thought** (reasoning) and an **action** (tool call), the loop executes the tool and appends the **observation** (result) to history, then repeats. The `ToyLLM` class provides a deterministic mock for testing, but you can substitute any real LLM API that returns the same dictionary structure.
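To make the expected dictionary structure concrete, here is a minimal scripted mock in the spirit of `ToyLLM`. The class name `ScriptedLLM` is hypothetical, and so is the `"act"` value for `kind`: the `run` method only checks for `"finish"`, so any other value is treated as a tool step.

```python
from typing import Any


class ScriptedLLM:
    """Hypothetical stand-in for ToyLLM: replays a fixed list of replies."""

    def __init__(self, script: list[dict[str, Any]]) -> None:
        self._script = list(script)
        self._cursor = 0

    def respond(self, history: list[Any]) -> dict[str, Any]:
        # Ignores history; falls back to "finish" once the script runs out
        if self._cursor >= len(self._script):
            return {"kind": "finish", "content": "script exhausted"}
        reply = self._script[self._cursor]
        self._cursor += 1
        return reply


llm = ScriptedLLM([
    # Tool step: thought plus the action name and its keyword arguments
    {"kind": "act", "thought": "Compute the total with tax.",
     "action": "calc", "args": {"expr": "120 * 1.15"}},
    # Final answer: only "kind" and "content" are read
    {"kind": "finish", "content": "138.0"},
])
first = llm.respond([])
second = llm.respond([])
third = llm.respond([])
```

A scripted model like this keeps tests of the loop deterministic, since the same replies arrive in the same order on every run.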

## Running the Complete Example

The `build_demo_agent()` function wires together a working agent with a calculator and key-value store. Here is how to execute a multi-step reasoning task:

```python
# Assumes this script runs from the code/ directory next to main.py;
# the hyphenated folder names in the repo path are not importable as packages.
from main import build_demo_agent, pretty_trace

# Build the agent
agent = build_demo_agent()

# Issue a user query
final_answer = agent.run("What is 120 plus 15% tax, stored in kv?")

# Print a readable trace of the interaction
pretty_trace(agent.history)

print("\nFinal answer:", final_answer)
print("Tool calls:", sum(1 for t in agent.history if t.kind == "action"))
print("Tools used:", agent.tools.names())
```

The `pretty_trace` helper (lines 124-137) formats the execution history into a readable log showing each thought, action, and observation, which is invaluable for debugging agent behavior.
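A trace formatter of this kind can be sketched in a few lines. The function below is a hypothetical `format_trace`, not the repository's `pretty_trace`; its output layout is an assumption, and the `Turn` stand-in only mirrors the fields used elsewhere in this article.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Turn:
    kind: str
    content: str
    tool_call: Optional[Any] = None
    observation: Optional[str] = None


def format_trace(history: list[Turn]) -> str:
    """Render one line per turn, plus an indented observation line for actions."""
    lines = []
    for index, turn in enumerate(history):
        lines.append(f"[{index}] {turn.kind:<8}{turn.content}")
        if turn.kind == "action" and turn.observation is not None:
            lines.append(f"    observation: {turn.observation}")
    return "\n".join(lines)


trace = format_trace([
    Turn("user", "What is 2 + 2?"),
    Turn("action", "calc", observation="4"),
    Turn("final", "4"),
])
print(trace)
```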

## Extending for Production Use

Once you understand the base implementation in [`phases/14-agent-engineering/01-the-agent-loop/code/main.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/14-agent-engineering/01-the-agent-loop/code/main.py), you can adapt it for production scenarios:

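- **Swap in a real LLM**: Replace `ToyLLM.respond` with a method that calls a hosted model, for example `client.chat.completions.create` in the OpenAI Python SDK (v1 and later; the older `openai.ChatCompletion.create` call was removed in v1.0). The `AgentLoop.run` method remains unchanged as long as each reply is either a tool step, `{"kind": "...", "thought": "...", "action": "...", "args": {...}}`, or a final answer, `{"kind": "finish", "content": "..."}`.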
- **Add domain-specific tools**: Register new capabilities using `tools.register("my_tool", my_callable)`. The registry handles arbitrary callables that accept kwargs and return strings.
- **Implement persistent memory**: Replace the in-memory `KVStore` (lines 66-75) with a database-backed implementation or extend the history buffer to serialize to disk between sessions.
- **Customize stop criteria**: Modify the `run` method to check for additional termination signals such as token limits, user interrupts, or external flags.
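As one way to approach the persistent-memory option, the history buffer can be round-tripped through JSON. This is a sketch under assumptions: `dump_history` and `load_history` are hypothetical helpers, the `Turn` and `ToolCall` stand-ins mirror the fields used in `run`, and tool arguments are assumed to be JSON-serializable.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Optional


@dataclass
class ToolCall:
    name: str
    args: dict[str, Any] = field(default_factory=dict)


@dataclass
class Turn:
    kind: str
    content: str
    tool_call: Optional[ToolCall] = None
    observation: Optional[str] = None


def dump_history(history: list[Turn]) -> str:
    # asdict() recurses into the nested ToolCall dataclass
    return json.dumps([asdict(turn) for turn in history])


def load_history(payload: str) -> list[Turn]:
    turns = []
    for raw in json.loads(payload):
        call = raw.pop("tool_call")
        turns.append(Turn(**raw, tool_call=ToolCall(**call) if call else None))
    return turns


original = [
    Turn("user", "Store 138 under 'total'."),
    Turn("action", "kv_set",
         tool_call=ToolCall("kv_set", {"key": "total", "value": "138"}),
         observation="ok"),
    Turn("final", "Stored."),
]
restored = load_history(dump_history(original))
```

Writing the returned string to a file or database row between sessions is then a separate concern from the loop itself.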

## Summary

- The agent loop in `rohitg00/ai-engineering-from-scratch` implements a ReAct architecture using only Python standard library components.
- **Four core components** manage the loop: the `Turn` history buffer, the `ToolRegistry`, the `max_turns` budget, and the `finish` stop condition.
- The `AgentLoop.run` method orchestrates the cycle of thought → action → observation until the task completes or the budget is exhausted.
- You can extend the toy implementation into a production system by swapping `ToyLLM` for a real API provider and registering additional domain-specific tools.

## Frequently Asked Questions

### What is the ReAct pattern in AI agents?

The ReAct (Reasoning + Acting) pattern is an agent architecture where the language model alternates between generating reasoning traces ("thoughts") and executing actions (tool calls). Observations from tool execution are fed back into the model's context, enabling multi-step problem solving. According to the source code in [`phases/14-agent-engineering/01-the-agent-loop/code/main.py`](https://github.com/rohitg00/ai-engineering-from-scratch/blob/main/phases/14-agent-engineering/01-the-agent-loop/code/main.py), this pattern is implemented by appending a `thought` `Turn` and then an `action` `Turn` to the history buffer on each iteration, with the tool's result stored in the `observation` field of the `action` turn.

### How does the turn budget prevent infinite loops?

The `AgentLoop` class accepts a `max_turns` parameter that limits the `for` loop in the `run` method (lines 104-121). If the LLM fails to emit a `finish` signal, the loop automatically terminates after `max_turns` iterations and returns `"budget exhausted"`. This safety mechanism ensures that agent execution remains bounded even when the model enters repetitive or confused reasoning patterns.
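The bounding behavior can be checked with a model that never finishes. The sketch below uses a stripped-down loop (`run_bounded`) and a hypothetical `StuckLLM` instead of the repository's classes; tool dispatch is omitted so that only the budget logic remains.

```python
from typing import Any


class StuckLLM:
    """Hypothetical model that always asks for another tool call."""

    def __init__(self) -> None:
        self.calls = 0

    def respond(self, history: list[Any]) -> dict[str, Any]:
        self.calls += 1
        return {"kind": "act", "thought": "still thinking",
                "action": "noop", "args": {}}


def run_bounded(llm: StuckLLM, max_turns: int) -> str:
    history: list[dict[str, Any]] = []
    for _ in range(max_turns):
        reply = llm.respond(history)
        if reply["kind"] == "finish":
            return reply["content"]
        history.append(reply)  # tool dispatch omitted in this sketch
    return "budget exhausted"


llm = StuckLLM()
result = run_bounded(llm, max_turns=5)
```

Because the `for` loop runs over a fixed `range`, the model is queried at most `max_turns` times regardless of what it returns.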

### Can I replace the ToyLLM with OpenAI or other providers?

Yes. The `ToyLLM` is a deterministic mock that returns scripted responses for demonstration purposes. To use a real provider, create a class with a `respond(self, history)` method that calls your LLM API, parses the response into the dictionary format that `run` reads (`kind`, `thought`, `action`, and `args` for a tool step; `kind` and `content` for a `finish` reply), and returns it. The `AgentLoop.run` method will handle the rest of the orchestration without modification.
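One possible shape for such an adapter is sketched below. `JSONReplyLLM` and its `complete` callable are hypothetical: `complete` stands for whatever function sends a prompt to your provider and returns raw text, and the prompt format and fallback rules are assumptions chosen to keep the loop from crashing on malformed output.

```python
import json
from types import SimpleNamespace
from typing import Any, Callable


class JSONReplyLLM:
    """Hypothetical adapter around any text-completion callable."""

    def __init__(self, complete: Callable[[str], str]) -> None:
        self._complete = complete  # e.g. a thin wrapper around a provider SDK

    def respond(self, history: list[Any]) -> dict[str, Any]:
        # Flatten the history into a plain-text prompt (assumed format)
        prompt = "\n".join(f"{t.kind}: {t.content}" for t in history)
        raw = self._complete(prompt)
        try:
            reply = json.loads(raw)
        except json.JSONDecodeError:
            # Treat unparseable output as a final answer rather than crashing
            return {"kind": "finish", "content": raw}
        if not isinstance(reply, dict) or "kind" not in reply:
            return {"kind": "finish", "content": raw}
        if reply["kind"] != "finish" and "action" not in reply:
            return {"kind": "finish", "content": raw}
        return reply


def fake_complete(prompt: str) -> str:
    # Stand-in for a real API call, so the sketch runs offline
    return '{"kind": "act", "thought": "add", "action": "calc", "args": {"expr": "1+1"}}'


reply = JSONReplyLLM(fake_complete).respond(
    [SimpleNamespace(kind="user", content="What is 1 + 1?")]
)
fallback = JSONReplyLLM(lambda prompt: "plain text").respond([])
```

Validating the reply inside the adapter keeps `run` free of provider-specific parsing.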

### How do I add custom tools to the registry?

Register new tools using the `ToolRegistry.register` method. Your callable must accept keyword arguments and return a string. For example: `agent.tools.register("search", lambda query: "results...")`. The `dispatch` method automatically validates that the tool exists, executes it with the provided arguments, and formats any exceptions as error strings, ensuring the agent loop remains robust against tool failures.