How the ReAct Agent Loop Works: Understanding the Reason-Act Pattern
The ReAct agent loop alternates between three stages—Thought, Action, and Observation—where the LLM reasons about the task, calls a tool with JSON arguments, and processes the result, repeating until a stop condition terminates the cycle.
The ReAct pattern (Reason + Act) is the canonical design for modern autonomous agents, implemented in the rohitg00/ai-engineering-from-scratch repository. Understanding how the ReAct agent loop functions is essential for building reliable AI systems that can plan, execute tools, and learn from observations. This article examines the minimal yet complete implementation found in the source code to reveal the loop's core mechanics.
The Three Stages of the ReAct Agent Loop
Each iteration of the ReAct agent loop follows a strict alternating sequence:
- Thought – The LLM generates a reasoning string describing its intent and strategy for the current step.
- Action – The model emits a tool call specifying a registered function name and JSON-encoded arguments.
- Observation – The runtime executes the tool, captures the output, and feeds it back as plain text for the next iteration.
The cycle continues until a stop condition triggers, such as the model emitting a Finish signal, reaching a maximum turn budget, or encountering a guardrail violation.
The Five Essential Ingredients
According to the source documentation in phases/14-agent-engineering/01-the-agent-loop/docs/en.md, every successful ReAct agent implementation requires five core components:
- Message buffer – Stores the chronological transcript of user messages, thoughts, actions, and observations.
- Tool registry – Maps tool names to callable Python functions available to the agent.
- Stop condition – Detects termination signals like
finish, empty tool calls, or budget exhaustion. - Turn budget – A hard cap on iterations to prevent infinite loops and runaway computation.
- Observation formatter – Converts raw tool results into LLM-readable strings for the next Thought phase.
Core Architecture and Implementation
The reference implementation in phases/14-agent-engineering/01-the-agent-loop/code/main.py uses only Python standard libraries and structures the code around the five ingredients.
ToolRegistry
The ToolRegistry class manages available tools and safely dispatches calls. It validates tool names and handles exceptions by converting them into safe error observations. The implementation resides in phases/14-agent-engineering/01-the-agent-loop/code/main.py (lines 34-45).
class ToolRegistry:
def __init__(self):
self.tools = {}
def register(self, name, fn):
self.tools[name] = fn
def call(self, name, args):
if name not in self.tools:
return f"error: tool '{name}' not found"
try:
return self.tools[name](**args)
except Exception as e:
return f"error: {e}"
ToyLLM
The ToyLLM class provides a deterministic scripted policy that yields thought, action, and finish records. This allows offline testing of the ReAct agent loop without requiring access to a real LLM API, as implemented in phases/14-agent-engineering/01-the-agent-loop/code/main.py (lines 78-95).
class ToyLLM:
def respond(self, history):
# Returns dict with keys: kind, content, tool/name, tool/arguments
# kind can be "thought", "action", or "finish"
pass
AgentLoop
The AgentLoop class orchestrates the while-loop, managing the history buffer and respecting the turn budget. Implemented in phases/14-agent-engineering/01-the-agent-loop/code/main.py (lines 98-121), it controls the flow between LLM responses and tool execution, appending each turn to the message buffer and checking termination conditions before continuing.
Running the ReAct Agent Loop
The implementation provides a build_demo_agent() function that wires together the registry, LLM, and loop. Here is how to execute a query:
from code.main import build_demo_agent, pretty_trace
# Build the demo agent
agent = build_demo_agent()
# Run the loop with a user message
final_answer = agent.run("What is 120 plus 15% tax, stored in kv?")
# Print the full transcript
pretty_trace(agent.history)
print(f"final answer: {final_answer}")
This produces a trace showing the alternating pattern:
[00 user] What is 120 plus 15% tax, stored in kv?
[01 thought] store the base price
[02 action] kv_set({'key': 'base', 'value': '120'}) -> stored base
[03 thought] compute 15% tax
[04 action] calculator({'expr': '120 * 0.15'}) -> 18.0
[05 thought] store the tax
[06 action] kv_set({'key': 'tax', 'value': '18.0'}) -> stored tax
[07 thought] compute total
[08 action] calculator({'expr': '120 + 18.0'}) -> 138.0
[09 thought] confirm stored values
[10 action] kv_get({'key': 'base'}) -> 120
[11 final] the total including 15% tax is 138.0
Extending the ReAct Agent Loop
Adding New Tools
Register new capabilities by implementing a Python function and adding it to the registry:
def search_web(query: str) -> str:
# Implementation here
return results
agent.tool_registry.register("search_web", search_web)
Modifying Stop Conditions
Replace the default reply["kind"] == "finish" check in the AgentLoop with custom logic, such as token-budget monitoring or content guardrails that inspect observations before continuing.
Swapping the LLM Backend
Substitute ToyLLM with a real provider client (OpenAI, Anthropic, etc.) that emits the same dictionary structure with thought, action, and finish keys. The surrounding control flow in the ReAct agent loop remains unchanged.
Summary
- The ReAct agent loop consists of three alternating stages: Thought, Action, and Observation.
- Five ingredients are required: a message buffer, tool registry, stop condition, turn budget, and observation formatter.
- The implementation in
rohitg00/ai-engineering-from-scratchdemonstrates the pattern using only Python standard libraries. - Key classes include
ToolRegistryfor dispatch,ToyLLMfor policy simulation, andAgentLoopfor orchestration. - The loop terminates based on explicit finish signals, empty actions, or budget exhaustion.
Frequently Asked Questions
What triggers the ReAct agent loop to stop?
The loop terminates when the LLM emits a finish signal, returns an empty tool call, reaches the maximum turn budget, or triggers a guardrail condition. These checks occur in the AgentLoop class before each new iteration begins, ensuring clean shutdown when the task completes or constraints are violated.
How does the ToolRegistry handle errors?
When a tool call fails or references a non-existent tool, the registry catches the exception and returns a formatted error string as the observation. This ensures the ReAct agent loop continues gracefully while informing the LLM of the failure, allowing the model to potentially recover or report the issue in the next Thought phase.
Can I use a real LLM instead of ToyLLM?
Yes. The ToyLLM class serves as a deterministic placeholder for testing. To use a production model, implement a class with a respond(history) method that returns dictionaries containing thought, action, or finish keys, then pass it to the AgentLoop. The architecture remains agnostic to the specific LLM provider.
Where is the complete source code for the ReAct agent loop?
The full implementation resides in phases/14-agent-engineering/01-the-agent-loop/code/main.py within the rohitg00/ai-engineering-from-scratch repository. Documentation explaining the five ingredients and architectural overview is available in phases/14-agent-engineering/01-the-agent-loop/docs/en.md, while a reusable skill artifact is stored in phases/14-agent-engineering/01-the-agent-loop/outputs/skill-agent-loop.md.
Have a question about this repo?
These articles cover the highlights, but your codebase questions are specific. Give your agent direct access to the source. Share this with your agent to get started:
curl -s "https://instagit.com/install.md" Maintain an open-source project? Get it listed too →