
Understanding implementation_research_scratchpad in SWE-Agent State Management

March 5, 2026 langtalks/swe-agent ↗

The implementation_research_scratchpad is a message-history buffer that preserves the complete research dialogue between SWE-Agent's architect agent and its tools, enabling context-aware planning and deterministic state serialization.

The SWE-Agent repository implements a multi-agent system for automated software engineering. At the heart of its research and planning phase lies the implementation_research_scratchpad, a specialized state field defined in agent/architect/state.py that maintains conversational context between the architect agent and its tool invocations.

Core Purpose of the implementation_research_scratchpad

The implementation_research_scratchpad serves as the central message buffer for the architect agent's research phase. Unlike a simple string log, it stores a structured list of message objects (including AIMessage and HumanMessage instances) that represent the complete dialogue history between the agent and its tools.

This design allows the architect to maintain context across multiple research iterations, ensuring that each planning decision incorporates the full history of hypotheses, validations, and tool outputs.
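To make the contrast with a flat string log concrete, here is a minimal sketch. The `Message` dataclass and the `messages_with_role` helper are hypothetical stand-ins invented for illustration. They are not LangChain classes or code from the repository. The `role` field only mimics the AIMessage / HumanMessage / ToolMessage distinction.

```python
from dataclasses import dataclass


# Hypothetical stand-in for LangChain message classes (assumption, not repo code).
@dataclass(frozen=True)
class Message:
    role: str      # "ai", "human", or "tool"
    content: str


def messages_with_role(scratchpad, role):
    """Return the contents of all messages with the given role, in order."""
    return [m.content for m in scratchpad if m.role == role]


# Illustrative research history; the contents are made up.
scratchpad = [
    Message("ai", "Check how the config loader resolves paths."),
    Message("tool", "config/loader.py: resolve_path() joins against cwd"),
    Message("ai", "Check who calls resolve_path()."),
]

# Because each entry carries its role, hypotheses can be selected without parsing text.
hypotheses = messages_with_role(scratchpad, "ai")
```

With a single string log, the same selection would require parsing free text to work out which lines came from the model and which from a tool.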

Where It Lives in the Codebase

The scratchpad is formally defined in agent/architect/state.py as a field within the SoftwareArchitectState Pydantic model:

from agent.architect.state import SoftwareArchitectState

# Initialize state; the implementation_research_scratchpad field
# starts as an empty list
state = SoftwareArchitectState()

How implementation_research_scratchpad Powers the Research Loop

The scratchpad operates as a write-read cycle throughout the architect agent's execution graph defined in agent/architect/graph.py.

Collecting Research Dialogue

Every AI-generated hypothesis and reasoning step is appended to the scratchpad as an AIMessage. When the architect determines its next research step, it writes structured content into this buffer:

from langchain_core.messages import AIMessage

def come_up_with_research_next_step(state):
    response = plan_next_step_runnable.invoke({
        "implementation_research_scratchpad": state.implementation_research_scratchpad,
        "codebase_structure": get_files_structure.invoke({"directory": "./workspace_repo"}),
    })
    
    # Append the AI message to the scratchpad
    return {
        "research_next_step": response.hypothesis,
        "implementation_research_scratchpad": [
            AIMessage(
                content=f"My next thing I need to check is {response.hypothesis}. "
                        f"This is why I think it is useful: {response.reasoning}"
            )
        ],
    }
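Note that the node returns a one-element list, not the whole history. That only results in an append if the state field is merged with an append-style reducer, which LangGraph supports for message fields. The article does not show how state.py declares the field, so the following is a sketch under that assumption. `ArchitectState`, `APPEND_FIELDS`, and `apply_update` are hypothetical names that imitate the merge step in plain Python. They are not LangGraph's implementation.

```python
from dataclasses import dataclass, field


# Hypothetical, simplified stand-in for the architect state (assumption).
@dataclass
class ArchitectState:
    research_next_step: str = ""
    implementation_research_scratchpad: list = field(default_factory=list)


# Fields listed here are merged by concatenation; all others are overwritten.
APPEND_FIELDS = {"implementation_research_scratchpad"}


def apply_update(state, update):
    """Return a new state with a node's partial update merged in."""
    merged = dict(vars(state))
    for key, value in update.items():
        if key in APPEND_FIELDS:
            merged[key] = merged[key] + list(value)  # append, keep history
        else:
            merged[key] = value                      # plain overwrite
    return ArchitectState(**merged)


state = ArchitectState()
state = apply_update(state, {
    "research_next_step": "h1",
    "implementation_research_scratchpad": ["ai: check h1"],
})
state = apply_update(state, {
    "research_next_step": "h2",
    "implementation_research_scratchpad": ["ai: check h2"],
})
```

After two updates, `research_next_step` holds only the latest hypothesis while the scratchpad holds both messages, which is the behavior the research loop relies on.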

Feeding Context to Prompts

The scratchpad serves as an input variable to critical prompt templates. When invoking plan_next_step_runnable or check_research_prompt, the system passes the current scratchpad content so the LLM can reason with the complete research history:


# Inside agent/architect/graph.py
plan_next_step_runnable.invoke({
    "implementation_research_scratchpad": state.implementation_research_scratchpad,
    # ... other variables
})

ToolNode Integration

The ToolNode that executes search and codemap tools is explicitly configured with messages_key="implementation_research_scratchpad". This ensures that tool calls (carried on AIMessage objects) are read from this field and that their results (returned as ToolMessage objects) are appended to it, rather than to the default messages key:


# Configuration in agent/architect/graph.py
from langgraph.prebuilt import ToolNode

tool_node = ToolNode(
    tools=[search, codemap],
    messages_key="implementation_research_scratchpad",
)

After a tool executes, the resulting ToolMessage containing the tool output lands directly in the scratchpad buffer.
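The role of the messages_key setting can be illustrated with a simplified stand-in. This is not LangGraph's actual ToolNode code. `run_tools`, the dict-based message shape, and the fake `search` tool are all hypothetical and exist only to show how a configurable key decides which state field is read and written.

```python
def run_tools(state, tools, messages_key="messages"):
    """Execute the tool calls requested by the last message under messages_key
    and return a partial state update under that same key."""
    last = state[messages_key][-1]
    results = []
    for call in last.get("tool_calls", []):
        output = tools[call["name"]](**call["args"])
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": output,
        })
    return {messages_key: results}


# Hypothetical tool used only for this sketch.
def search(query):
    return f"results for {query}"


state = {
    "implementation_research_scratchpad": [
        {
            "role": "ai",
            "content": "Look for usages of resolve_path.",
            "tool_calls": [
                {"id": "call-1", "name": "search", "args": {"query": "resolve_path"}},
            ],
        },
    ],
}

update = run_tools(
    state,
    tools={"search": search},
    messages_key="implementation_research_scratchpad",
)
```

The returned update is keyed by the scratchpad field, so a state merge that appends message lists would place the tool result right after the request that triggered it.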

State Persistence and Resumability

Because implementation_research_scratchpad is part of the Pydantic SoftwareArchitectState model, it inherits full serialization capabilities. This enables:

Deterministic state flow: The entire research conversation can be saved to disk and resumed later
Traceability: Every hypothesis and tool result is preserved for debugging and auditing
Clean handoff: The developer agent receives only the distilled implementation_plan, not the raw scratchpad, ensuring separation of concerns between research and implementation phases
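The save-and-resume idea can be sketched as a JSON round trip. This example uses only the standard library. The `Message` dataclass and the `dump_scratchpad` / `load_scratchpad` helpers are hypothetical stand-ins, not the Pydantic model or any persistence code from the repository.

```python
import json
from dataclasses import dataclass, asdict


# Hypothetical stand-in for a scratchpad message (assumption, not repo code).
@dataclass
class Message:
    role: str
    content: str


def dump_scratchpad(scratchpad):
    """Serialize a list of messages to a JSON string."""
    return json.dumps([asdict(m) for m in scratchpad])


def load_scratchpad(payload):
    """Rebuild the list of messages from a JSON string."""
    return [Message(**item) for item in json.loads(payload)]


original = [
    Message("ai", "Check how the config loader resolves paths."),
    Message("tool", "config/loader.py: resolve_path() joins against cwd"),
]

payload = dump_scratchpad(original)   # could be written to disk
restored = load_scratchpad(payload)   # later session resumes from here
```

Because the order and role of every message survive the round trip, a resumed run would see the same research history as the run that was interrupted.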

Summary

The implementation_research_scratchpad is a structured message buffer defined in agent/architect/state.py that stores the complete research dialogue for SWE-Agent's architect agent.
It operates as a centralized history for AI hypotheses, tool invocations, and validation results, enabling context-aware planning through plan_next_step_runnable and related prompts.
The ToolNode configuration in agent/architect/graph.py automatically routes tool outputs to this scratchpad using messages_key="implementation_research_scratchpad".
As part of the Pydantic state model, it provides deterministic serialization for workflow resumability while maintaining clean separation from the developer agent's implementation phase.

Frequently Asked Questions

What is the difference between implementation_research_scratchpad and the developer agent's state?

The implementation_research_scratchpad exists only within the architect agent's state (SoftwareArchitectState) and contains the raw research conversation including hypotheses and tool results. The developer agent receives only the distilled implementation_plan field and does not access the scratchpad, ensuring the developer focuses on execution rather than research history.

How does the ToolNode know to write to implementation_research_scratchpad?

In agent/architect/graph.py, the ToolNode is explicitly instantiated with the parameter messages_key="implementation_research_scratchpad". This configuration tells LangGraph to automatically append any tool call results as messages to this specific state field rather than a default messages key.

Can the implementation_research_scratchpad be persisted across sessions?

Yes. Because the scratchpad is part of the Pydantic-based SoftwareArchitectState model, the entire state object (including the scratchpad message list) can be serialized to JSON or other formats, saved to disk, and restored later. This enables resumable research workflows and full audit trails of the agent's reasoning process.

Why is implementation_research_scratchpad implemented as a list of messages rather than a string?

The list-of-messages structure (containing AIMessage, HumanMessage, and ToolMessage objects) preserves the semantic role of each communication, distinguishing AI hypotheses, human inputs, and tool results. This structured format is what LangGraph's ToolNode operates on, and it allows prompts like plan_next_step_runnable to process the conversation history with full metadata intact, rather than parsing a flat text log.
