How to Build an Agent Loop from Scratch in Python: A Complete ReAct Implementation
You can build a working ReAct agent loop in pure Python by combining a message history buffer, a tool registry, and a turn-based control flow that iterates until a stop condition is met.
The repository rohitg00/ai-engineering-from-scratch provides a minimal implementation that shows how to build an agent loop from scratch in Python without external dependencies. This ReAct-style architecture separates dialogue management, tool execution, and LLM integration into modular components that you can extend for production use.
Core Components of the Agent Loop
The complete implementation lives in phases/14-agent-engineering/01-the-agent-loop/code/main.py and consists of four essential building blocks: a message buffer to store dialogue history, a tool registry to manage external capabilities, a stop condition to terminate execution, and a turn budget to prevent infinite loops.
Message Buffer and Turn Tracking
The agent maintains state using a history list that stores instances of the Turn dataclass. Each Turn records the kind of interaction (user, thought, action, or final), the text content, and optional metadata including a ToolCall object and observation results. The history list is held by the AgentLoop class (lines 97-103) and lets the loop maintain context across multiple reasoning steps.
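The article does not reproduce the Turn and ToolCall definitions, so the following is only a minimal sketch of what they might look like. The field names (tool_call, observation, name, args) are inferred from how the run method uses them; the types and default values are assumptions.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ToolCall:
    """A request to run one tool (shape inferred, not copied from the repo)."""
    name: str
    args: dict[str, Any] = field(default_factory=dict)


@dataclass
class Turn:
    """One entry in the agent's history buffer."""
    kind: str  # "user", "thought", "action", or "final"
    content: str
    tool_call: Optional[ToolCall] = None  # assumed to be set only on "action" turns
    observation: Optional[str] = None     # tool result recorded on "action" turns


# Build a tiny history by hand to show how the pieces fit together.
history: list[Turn] = []
history.append(Turn(kind="user", content="What is 2 + 2?"))
call = ToolCall(name="calculator", args={"expression": "2 + 2"})
history.append(Turn(kind="action", content=call.name, tool_call=call, observation="4"))
```

Keeping the observation on the same Turn as the tool call means each action and its result travel together when the history is handed back to the model.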
Tool Registry for External Capabilities
The ToolRegistry class (lines 34-44) maps string names to callable functions using a dictionary:
from typing import Callable

class ToolRegistry:
    def __init__(self) -> None:
        # Maps each tool name to the callable that implements it.
        self._tools: dict[str, Callable[..., str]] = {}

    def register(self, name: str, fn: Callable[..., str]) -> None:
        self._tools[name] = fn
Any callable registered via register() must accept keyword arguments and return a string. The dispatch method handles execution and normalizes errors to strings prefixed with "error:", ensuring the loop remains stable even when tools fail.
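The dispatch method itself is not shown above. A minimal sketch of the behavior described, assuming a ToolCall with name and args fields and an "error:" prefix for both unknown tools and raised exceptions, could look like this. The exact error wording is an assumption, not the repository's.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class ToolCall:
    name: str
    args: dict[str, Any] = field(default_factory=dict)


class ToolRegistry:
    def __init__(self) -> None:
        self._tools: dict[str, Callable[..., str]] = {}

    def register(self, name: str, fn: Callable[..., str]) -> None:
        self._tools[name] = fn

    def dispatch(self, call: ToolCall) -> str:
        # Hypothetical body: failures come back as strings instead of
        # propagating, so the calling loop never sees an exception.
        fn = self._tools.get(call.name)
        if fn is None:
            return f"error: unknown tool {call.name!r}"
        try:
            return str(fn(**call.args))
        except Exception as exc:
            return f"error: {type(exc).__name__}: {exc}"


def divide(a: float, b: float) -> str:
    return str(a / b)


registry = ToolRegistry()
registry.register("divide", divide)

ok = registry.dispatch(ToolCall("divide", {"a": 6, "b": 3}))      # normal result
failed = registry.dispatch(ToolCall("divide", {"a": 1, "b": 0}))  # tool raised
missing = registry.dispatch(ToolCall("nope"))                     # unregistered name
```

Returning the error as an observation lets the model read the failure on its next turn and decide whether to retry with different arguments.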
Turn Budget and Termination Logic
The AgentLoop class accepts a max_turns parameter (lines 99-103) that caps the number of iterations. The loop terminates when either the turn budget is exhausted or the underlying LLM returns a response with kind="finish". This dual protection prevents infinite execution while allowing the model to signal completion naturally.
Implementing the ReAct Loop Core
The AgentLoop.run method (lines 104-121 in main.py) implements the actual control flow:
def run(self, user_message: str) -> str:
    self.history.append(Turn(kind="user", content=user_message))
    for step in range(self.max_turns):
        reply = self.llm.respond(self.history)
        if reply["kind"] == "finish":
            self.history.append(Turn(kind="final", content=reply["content"]))
            return reply["content"]
        # record the model's thought
        self.history.append(Turn(kind="thought", content=reply.get("thought", "")))
        # invoke the requested tool
        call = ToolCall(name=reply["action"], args=reply.get("args", {}))
        observation = self.tools.dispatch(call)
        self.history.append(
            Turn(kind="action", content=call.name,
                 tool_call=call, observation=observation)
        )
    self.history.append(Turn(kind="final", content="budget exhausted"))
    return "budget exhausted"
Each iteration follows the ReAct pattern: the LLM generates a thought (reasoning) and an action (tool call), the loop executes the tool and appends the observation (result) to history, then repeats. The ToyLLM class provides a deterministic mock for testing, but you can substitute any real LLM API that returns the same dictionary structure.
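To make the expected dictionary structure concrete, here is a sketch of a scripted stand-in in the spirit of ToyLLM. The class name ScriptedLLM and the "act" value for non-final replies are hypothetical; the loop shown above only checks whether kind equals "finish", so any other value would take the tool-call branch.

```python
from typing import Any


class ScriptedLLM:
    """Hypothetical mock: replays canned replies in order."""

    def __init__(self, replies: list[dict[str, Any]]) -> None:
        self._replies = list(replies)
        self._next = 0

    def respond(self, history: list[Any]) -> dict[str, Any]:
        # Once the script runs out, keep signalling completion.
        if self._next >= len(self._replies):
            return {"kind": "finish", "content": "done"}
        reply = self._replies[self._next]
        self._next += 1
        return reply


llm = ScriptedLLM([
    # A tool-call reply carries a thought, an action name, and its arguments.
    {"kind": "act", "thought": "I need the total with tax.",
     "action": "calculator", "args": {"expression": "120 * 1.15"}},
    # A final reply carries only the answer text.
    {"kind": "finish", "content": "The total is 138."},
])

first = llm.respond([])
second = llm.respond([])
third = llm.respond([])  # script exhausted, falls back to the default finish
```

Because the mock ignores the history it receives, it is only useful for exercising the control flow, not for testing how context affects the model's decisions.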
Running the Complete Example
The build_demo_agent() function wires together a working agent with a calculator and key-value store. Here is how to execute a multi-step reasoning task:
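import sys

# Assumes this script runs from the repository root. The directory names contain
# hyphens, so they cannot be imported as a package; put the folder on sys.path instead.
sys.path.insert(0, "phases/14-agent-engineering/01-the-agent-loop/code")
from main import build_demo_agent, pretty_trace

# Build the agent
agent = build_demo_agent()
# Issue a user query
final_answer = agent.run("What is 120 plus 15% tax, stored in kv?")
# Print a readable trace of the interaction
pretty_trace(agent.history)
print("\nFinal answer:", final_answer)
# Each "action" turn corresponds to one tool call
print("Tool calls made:", len([t for t in agent.history if t.kind == "action"]))
print("Tools used:", agent.tools.names())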
The pretty_trace helper (lines 124-137) formats the execution history into a readable log showing each thought, action, and observation, which is invaluable for debugging agent behavior.
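The helper's source is not reproduced here. As a rough illustration of the idea only, a formatter along these lines would produce one line per turn and an indented line for each observation. The function name format_trace, the simplified Turn, and the layout are all assumptions; the repository's pretty_trace may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Turn:
    # Simplified stand-in for the article's Turn (tool_call omitted).
    kind: str
    content: str
    observation: Optional[str] = None


def format_trace(history: list[Turn]) -> str:
    """Hypothetical formatter: one line per turn, observations indented."""
    lines = []
    for index, turn in enumerate(history):
        lines.append(f"[{index}] {turn.kind:<7} {turn.content}")
        if turn.observation is not None:
            lines.append(f"    observation: {turn.observation}")
    return "\n".join(lines)


trace = format_trace([
    Turn("user", "What is 2 + 2?"),
    Turn("thought", "Use the calculator."),
    Turn("action", "calculator", observation="4"),
    Turn("final", "4"),
])
print(trace)
```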
Extending for Production Use
Once you understand the base implementation in phases/14-agent-engineering/01-the-agent-loop/code/main.py, you can adapt it for production scenarios:
- Swap in a real LLM: Replace ToyLLM.respond with a method that calls a provider API, such as client.chat.completions.create in the current OpenAI Python SDK (openai.ChatCompletion.create in releases before 1.0). The AgentLoop.run method remains unchanged as long as the response follows the {"kind": "...", "thought": "...", "action": "...", "args": {...}} schema.
- Add domain-specific tools: Register new capabilities using tools.register("my_tool", my_callable). The registry handles arbitrary callables that accept kwargs and return strings.
- Implement persistent memory: Replace the in-memory KVStore (lines 66-75) with a database-backed implementation or extend the history buffer to serialize to disk between sessions.
- Customize stop criteria: Modify the run method to check for additional termination signals such as token limits, user interrupts, or external flags.
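For the persistent-memory option, one simple approach is to write the history buffer to a JSON file and reload it at the start of the next session. The sketch below assumes a simplified Turn with only string fields; the helper names save_history and load_history are hypothetical, and a Turn that carries a nested ToolCall would need its own reconstruction step when loading.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Optional


@dataclass
class Turn:
    # Simplified stand-in for the article's Turn (tool_call omitted).
    kind: str
    content: str
    observation: Optional[str] = None


def save_history(history: list[Turn], path: Path) -> None:
    """Write every turn to disk as a JSON array of objects."""
    payload = [asdict(turn) for turn in history]
    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")


def load_history(path: Path) -> list[Turn]:
    """Rebuild the turns from disk; a missing file means a fresh session."""
    if not path.exists():
        return []
    items = json.loads(path.read_text(encoding="utf-8"))
    return [Turn(**item) for item in items]


# Round-trip demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "history.json"
    empty = load_history(path)
    original = [Turn("user", "hi"), Turn("action", "calculator", observation="4")]
    save_history(original, path)
    restored = load_history(path)
```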
Summary
- The agent loop in rohitg00/ai-engineering-from-scratch implements a ReAct architecture using only Python standard library components.
- Four core components manage the loop: the Turn history buffer, the ToolRegistry, the max_turns budget, and the finish stop condition.
- The AgentLoop.run method orchestrates the cycle of thought → action → observation until the task completes or the budget is exhausted.
- You can extend the toy implementation into a production system by swapping ToyLLM for a real API provider and registering additional domain-specific tools.
Frequently Asked Questions
What is the ReAct pattern in AI agents?
The ReAct (Reasoning + Acting) pattern is an agent architecture where the language model alternates between generating reasoning traces ("thoughts") and executing actions (tool calls). Observations from tool execution are fed back into the model's context, enabling multi-step problem solving. In phases/14-agent-engineering/01-the-agent-loop/code/main.py, this pattern is implemented by appending Turn objects of kind thought and action to the history buffer on each iteration, with the tool's observation stored on the action Turn.
How does the turn budget prevent infinite loops?
The AgentLoop class accepts a max_turns parameter that limits the for loop in the run method (lines 104-121). If the LLM fails to emit a finish signal, the loop automatically terminates after max_turns iterations and returns "budget exhausted". This safety mechanism ensures that agent execution remains bounded even when the model enters repetitive or confused reasoning patterns.
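The effect is easy to see in isolation. The following stripped-down loop is a sketch, not the repository's code: it drops tools and Turn objects and keeps only the budget logic, then runs it against a hypothetical model stub that never finishes.

```python
from typing import Any, Callable

Reply = dict[str, Any]


def run_with_budget(respond: Callable[[list[Any]], Reply], max_turns: int) -> tuple[str, int]:
    """Simplified loop: returns (answer, number of model calls made)."""
    history: list[Any] = []
    calls = 0
    for _ in range(max_turns):
        reply = respond(history)
        calls += 1
        if reply["kind"] == "finish":
            return reply["content"], calls
        history.append(reply)  # stand-in for the thought/action turns
    return "budget exhausted", calls


def stuck_model(history: list[Any]) -> Reply:
    # Never emits "finish": simulates a model caught in a repetitive pattern.
    return {"kind": "act", "thought": "try again", "action": "noop", "args": {}}


answer, calls = run_with_budget(stuck_model, max_turns=5)
print(answer, calls)
```

The model is called exactly max_turns times and no more, so the worst-case cost of a run is known before it starts.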
Can I replace the ToyLLM with OpenAI or other providers?
Yes. The ToyLLM is a deterministic mock that returns scripted responses for demonstration purposes. To use a real provider, create a class with a respond(self, history) method that calls your LLM API, parses the response into the required dictionary format with keys kind, thought, action, and args, and returns it. The AgentLoop.run method will handle the rest of the orchestration without modification.
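One way to structure such a class is to keep the provider call behind a plain function and concentrate on parsing. In the sketch below, JSONReplyLLM and its complete argument are hypothetical names: complete is any function that takes a prompt string and returns the model's raw text, which you would implement with your provider's SDK. It also assumes the model has been instructed to answer in JSON, and it treats anything unparseable as a final answer.

```python
import json
from typing import Any, Callable


class JSONReplyLLM:
    """Hypothetical adapter around any prompt -> text completion function."""

    def __init__(self, complete: Callable[[str], str]) -> None:
        self._complete = complete

    def respond(self, history: list[Any]) -> dict[str, Any]:
        # Flatten the history into a plain-text prompt (format is an assumption).
        prompt = "\n".join(f"{t.kind}: {t.content}" for t in history)
        raw = self._complete(prompt)
        fallback = {"kind": "finish", "content": raw}
        try:
            reply = json.loads(raw)
        except json.JSONDecodeError:
            return fallback  # not JSON: treat the text as the final answer
        if not isinstance(reply, dict) or "kind" not in reply:
            return fallback
        if reply["kind"] == "finish":
            return {"kind": "finish", "content": str(reply.get("content", ""))}
        if "action" not in reply:
            return fallback  # a tool-call reply must name a tool
        return {
            "kind": reply["kind"],
            "thought": reply.get("thought", ""),
            "action": reply["action"],
            "args": reply.get("args", {}),
        }


def fake_complete(prompt: str) -> str:
    # Stands in for a real provider call during testing.
    return '{"kind": "act", "action": "calculator", "args": {"expression": "2 + 2"}}'


parsed = JSONReplyLLM(fake_complete).respond([])
fallback_reply = JSONReplyLLM(lambda prompt: "plain text answer").respond([])
```

Validating the reply before returning it keeps malformed model output from raising a KeyError inside the loop.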
How do I add custom tools to the registry?
Register new tools using the ToolRegistry.register method. Your callable must accept keyword arguments and return a string. For example: agent.tools.register("search", lambda query: "results..."). The dispatch method automatically validates that the tool exists, executes it with the provided arguments, and formats any exceptions as error strings, ensuring the agent loop remains robust against tool failures.