
How the Architect Agent Uses `research_next_step` for Hypothesis-Driven Exploration in SWE-Agent

March 5, 2026 | langtalks/swe-agent

The architect agent in langtalks/swe-agent uses a three-stage loop where research_next_step stores the current hypothesis, which is validated against prior research and then executed via search or codemap tools until sufficient implementation knowledge is gathered.

The research_next_step field holds the hypothesis that drives exploration in the Software Architect agent. Each iteration turns that hypothesis into a concrete codebase investigation, so the agent gathers implementation details before it produces a final plan. The following sections trace how the field moves through the loop, from hypothesis generation to tool execution.

The Three-Stage Hypothesis Loop

The architect agent follows a strict three-stage loop defined in agent/architect/graph.py. Each stage operates on the research_next_step field to move from abstract reasoning to concrete tool calls.
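The control flow of the three stages can be sketched in plain Python, without LangGraph or an LLM. Everything below is an illustrative stand-in rather than code from the repository: the `LoopState` class, the stub functions, the fixed candidate list, and the `needed`/`max_steps` stopping rule are all assumptions made so the example runs offline.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class LoopState:
    """Hypothetical, simplified stand-in for SoftwareArchitectState."""
    research_next_step: Optional[str] = None
    is_valid_research_step: Optional[bool] = None
    scratchpad: List[str] = field(default_factory=list)
    explored: Set[str] = field(default_factory=set)


def propose(state: LoopState, candidates: List[str], step: int) -> None:
    # Stage 1: a real node would ask the LLM; here we cycle through a fixed list.
    state.research_next_step = candidates[step % len(candidates)]


def check(state: LoopState) -> None:
    # Stage 2: reject hypotheses that were already explored.
    state.is_valid_research_step = state.research_next_step not in state.explored


def conduct(state: LoopState) -> None:
    # Stage 3: a real node would call search/codemap; here we only record a note.
    state.explored.add(state.research_next_step)
    state.scratchpad.append(f"findings for: {state.research_next_step}")


def run_loop(candidates: List[str], needed: int, max_steps: int = 20) -> LoopState:
    state = LoopState()
    for step in range(max_steps):
        if len(state.scratchpad) >= needed:  # assumed stopping rule
            break
        propose(state, candidates, step)
        check(state)
        if state.is_valid_research_step:
            conduct(state)  # invalid hypotheses skip straight to the next proposal
    return state


final = run_loop(["entry point", "entry point", "tool bindings"], needed=2)
print(final.scratchpad)
```

The repeated "entry point" candidate is rejected by the check stage on its second appearance, so only two findings are recorded.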

Stage 1: Generating the Hypothesis with `come_up_with_research_next_step`

The loop begins at lines 47-58 of agent/architect/graph.py with the come_up_with_research_next_step function. This node invokes Claude-Sonnet using the plan_next_step_prompt.md prompt to propose the next research question.

The LLM returns both a hypothesis (what to investigate) and the reasoning behind it. The function stores the hypothesis string in state.research_next_step and appends the reasoning to the implementation_research_scratchpad message history.


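# Conceptual flow from agent/architect/graph.py L47-L58 (a sketch, not the verbatim source)

from langchain_core.messages import AIMessage
from agent.architect.state import SoftwareArchitectState

def come_up_with_research_next_step(state: SoftwareArchitectState):
    # Invokes plan_next_step_prompt.md. `llm` and `prompt` are assumed to be
    # defined at module level, and the reply is assumed to expose `hypothesis`
    # and `reasoning` fields (hypothetical names).
    response = llm.invoke(prompt)

    # Return a partial state update: the hypothesis replaces research_next_step,
    # and the add_messages reducer appends the reasoning to the scratchpad.
    return {
        "research_next_step": response.hypothesis,
        "implementation_research_scratchpad": [
            AIMessage(content=response.reasoning)
        ],
    }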
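To see the shape of what Stage 1 produces, here is a minimal offline sketch. The `NextStep` dataclass, the `FakePlanner` class, and the `propose_next_step` helper are hypothetical stand-ins for the structured LLM reply and the node; only the two state keys come from the article.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NextStep:
    """Assumed shape of the model's structured reply."""
    hypothesis: str
    reasoning: str


class FakePlanner:
    """Stand-in for the LLM call so the example runs without network access."""

    def invoke(self, scratchpad: List[str]) -> NextStep:
        return NextStep(
            hypothesis="Where are the search and codemap tools registered?",
            reasoning=f"{len(scratchpad)} notes so far; tool wiring is still unknown.",
        )


def propose_next_step(scratchpad: List[str], planner: FakePlanner) -> dict:
    reply = planner.invoke(scratchpad)
    # Partial update: the hypothesis overwrites the field, while the
    # reasoning becomes one new scratchpad entry.
    return {
        "research_next_step": reply.hypothesis,
        "implementation_research_scratchpad": [reply.reasoning],
    }


update = propose_next_step([], FakePlanner())
print(update["research_next_step"])
```

Keeping the hypothesis and the reasoning in separate keys means later stages can read the question without parsing it out of the message history.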

Stage 2: Validating Against Prior Research with `check_research_step`

Before executing any tools, the agent validates the hypothesis at lines 64-78 using check_research_step. This node uses the check_research_already_explored.md prompt to scan the implementation_research_scratchpad and determine if the current research_next_step has already been investigated or is redundant.

The function writes a boolean result to state.is_valid_research_step. If the hypothesis is invalid (already explored), the loop skips execution and returns to hypothesis generation.


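# Conceptual flow from agent/architect/graph.py L64-L78 (a sketch, not the verbatim source)

def check_research_step(state: SoftwareArchitectState):
    # Uses check_research_already_explored.md to compare research_next_step
    # with the scratchpad. `llm` and `validation_prompt` are assumed to be
    # defined at module level.
    reply = llm.invoke(validation_prompt).content

    # .content is text, so convert it to a boolean explicitly; any non-empty
    # string would otherwise be truthy. The "valid"/"invalid" reply format
    # is an assumption.
    is_valid = reply.strip().lower().startswith("valid")
    return {"is_valid_research_step": is_valid}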
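A chat model's reply is text, while is_valid_research_step is a boolean, so some conversion has to happen between the two. The sketch below shows one strict way to do it. The `parse_verdict` helper and the NEW/EXPLORED reply convention are assumptions for illustration; the article does not describe the prompt's actual output format.

```python
def parse_verdict(raw: str) -> bool:
    """Map a free-text model reply to a strict boolean.

    Assumes the prompt asks the model to begin its reply with NEW or
    EXPLORED (a hypothetical convention).
    """
    words = raw.strip().split()
    if not words:
        raise ValueError("Empty verdict")
    first_word = words[0].upper().strip(".:,")
    if first_word == "NEW":
        return True
    if first_word == "EXPLORED":
        return False
    # Fail loudly instead of guessing, so a malformed reply cannot
    # silently count as a valid research step.
    raise ValueError(f"Unrecognised verdict: {raw!r}")


print(parse_verdict("new: nothing in the scratchpad covers this"))
print(parse_verdict("EXPLORED. The task runner was inspected earlier."))
```

Raising on an unrecognised reply is a design choice: it surfaces prompt drift early rather than letting the loop run on an ambiguous answer.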

Stage 3: Executing Tool Calls with `conduct_research`

When validation passes (is_valid_research_step is True), the agent proceeds to lines 80-86 where conduct_research executes the hypothesis. This node invokes the conduct_research_runnable, which binds the search and codemap tools.

The runnable uses the conduct_research_plan_prompt.md to translate the research_next_step hypothesis into specific tool calls. Results from the search or codemap tools are appended to the implementation_research_scratchpad, closing the research loop.


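# Conceptual flow from agent/architect/graph.py L80-L86 (a sketch, not the verbatim source)

def conduct_research(state: SoftwareArchitectState):
    # conduct_research_runnable has the search and codemap tools bound and
    # turns the hypothesis into tool calls via conduct_research_plan_prompt.md.
    result = conduct_research_runnable.invoke({
        "research_next_step": state.research_next_step,
        "scratchpad": state.implementation_research_scratchpad,
    })

    # Returning the result as a partial update lets the add_messages reducer
    # append it to the scratchpad.
    return {"implementation_research_scratchpad": [result]}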
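Once the model has translated a hypothesis into tool calls, something has to route each call to the matching tool and collect the output. A minimal sketch of that dispatch step follows. The `search` and `codemap` functions here are stubs with invented signatures, and the `{"name": ..., "arg": ...}` call format is an assumption, not the repository's tool-call schema.

```python
from typing import Callable, Dict, List


def search(query: str) -> str:
    """Hypothetical stub for a text search over the codebase."""
    return f"search results for {query!r}"


def codemap(path: str) -> str:
    """Hypothetical stub for a structural overview of a directory."""
    return f"symbols defined under {path!r}"


# Registry mapping tool names to callables.
TOOLS: Dict[str, Callable[[str], str]] = {"search": search, "codemap": codemap}


def run_tool_calls(tool_calls: List[dict]) -> List[str]:
    results = []
    for call in tool_calls:
        tool = TOOLS.get(call["name"])
        if tool is None:
            # Record the problem instead of raising, so one bad call
            # does not discard the other results.
            results.append(f"unknown tool: {call['name']}")
            continue
        results.append(tool(call["arg"]))
    return results


planned = [
    {"name": "search", "arg": "StateGraph("},
    {"name": "codemap", "arg": "agent/architect"},
]
scratchpad_entries = run_tool_calls(planned)
print(scratchpad_entries)
```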

The State Model: Tracking `research_next_step`

The research_next_step field is defined in agent/architect/state.py as part of the SoftwareArchitectState Pydantic model. This field serves as the single source of truth for the current hypothesis driving the exploration cycle.


# agent/architect/state.py

from typing import Optional, Annotated
from pydantic import BaseModel, Field
from langchain_core.messages import AnyMessage
from langgraph.graph.message import add_messages  # the reducer is provided by LangGraph

class SoftwareArchitectState(BaseModel):
    research_next_step: Optional[str] = Field(
        None,
        description="The next research step to be conducted"
    )
    implementation_research_scratchpad: Annotated[
        list[AnyMessage], add_messages
    ] = Field([], description="Scratchpad for research tool messages")
    is_valid_research_step: Optional[bool] = Field(
        None,
        description="Whether the research step is valid"
    )

The state also tracks is_valid_research_step to prevent duplicate research and implementation_research_scratchpad to accumulate tool results and reasoning across iterations.
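The practical difference between the fields is how updates are merged: research_next_step and is_valid_research_step are overwritten on each iteration, while the scratchpad is annotated with a reducer and therefore grows. The sketch below imitates that behaviour with plain dictionaries. The `merge_update` function is a simplified, hypothetical stand-in; LangGraph's real add_messages reducer additionally reconciles messages by ID.

```python
from typing import Set


def merge_update(state: dict, update: dict, append_keys: Set[str]) -> dict:
    """Apply a partial update: append for reducer-backed keys, overwrite otherwise."""
    merged = dict(state)
    for key, value in update.items():
        if key in append_keys:
            merged[key] = list(merged.get(key, [])) + list(value)
        else:
            merged[key] = value
    return merged


APPEND_KEYS = {"implementation_research_scratchpad"}

state = {"research_next_step": None, "implementation_research_scratchpad": []}

# First iteration: hypothesis A plus its reasoning.
state = merge_update(
    state,
    {"research_next_step": "A", "implementation_research_scratchpad": ["why A"]},
    APPEND_KEYS,
)
# Second iteration: hypothesis B replaces A, but the reasoning for A is kept.
state = merge_update(
    state,
    {"research_next_step": "B", "implementation_research_scratchpad": ["why B"]},
    APPEND_KEYS,
)
print(state)
```

This is why the scratchpad can serve as the history that the validation stage checks against, even though only one hypothesis is stored at a time.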

Practical Implementation: Running the Hypothesis Loop

You can interact with the research_next_step mechanism programmatically using the functions defined in agent/architect/graph.py.

Manually Invoke Hypothesis Generation

To test the hypothesis generation step independently:

from agent.architect.graph import come_up_with_research_next_step
from agent.architect.state import SoftwareArchitectState

# Initialize empty state
state = SoftwareArchitectState()

# Generate the next hypothesis
out = come_up_with_research_next_step(state)

print("Hypothesis:", out["research_next_step"])
print("Reasoning message added to scratchpad:")
print(out["implementation_research_scratchpad"][0].content)

This invokes the plan_next_step_prompt.md prompt against Claude-Sonnet, returning a hypothesis such as "How does the repository initialise the LangGraph workflow?" and storing it in research_next_step.

Validate and Execute the Hypothesis

To run the validation and execution stages:

from agent.architect.graph import check_research_step, conduct_research
from agent.architect.state import SoftwareArchitectState

# Start from a state whose research_next_step is already populated
state = SoftwareArchitectState(
    research_next_step="Investigate error handling in the task runner"
)

# 1. Validate against prior research
valid_out = check_research_step(state)
if not valid_out["is_valid_research_step"]:
    print("Invalid hypothesis - already explored")
    # The check may not add a scratchpad message, so read it defensively
    messages = valid_out.get("implementation_research_scratchpad", [])
    if messages:
        print(messages[-1].content)
else:
    # 2. Conduct research using search/codemap tools
    research_out = conduct_research(state)
    print("Tool results added to scratchpad:")
    print(research_out["implementation_research_scratchpad"][-1].content)

This mirrors the validate-then-execute branch of the compiled LangGraph workflow. The compiled graph additionally loops back to hypothesis generation after each pass.

Full Workflow Execution

To run the complete hypothesis-driven exploration loop:

from agent.architect.graph import swe_architect

# swe_architect is the compiled LangGraph workflow.
# It automatically cycles through:
# come_up_with_research_next_step → check_research_step → conduct_research → (repeat or exit)

result = swe_architect.invoke(
    {"workspace_repo": "./workspace_repo"}
)

print("Final implementation plan:")
print(result["implementation_plan"])

The workflow continues generating and validating research_next_step hypotheses until should_conduct_research determines that sufficient implementation knowledge has been gathered, at which point it extracts the final ImplementationPlan.
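In a graph-based workflow, the "repeat or exit" decisions are typically expressed as small routing functions that inspect the state and return the name of the next node. The sketch below illustrates the idea under stated assumptions: the `min_findings` threshold is invented (the article does not say what criterion should_conduct_research applies), and `extract_implementation_plan` is a hypothetical node name.

```python
def route_after_check(state: dict) -> str:
    """Choose the next node once the hypothesis has been validated."""
    if state.get("is_valid_research_step"):
        return "conduct_research"
    # Already explored: go back and propose a different hypothesis.
    return "come_up_with_research_next_step"


def route_after_research(state: dict, min_findings: int = 3) -> str:
    """Hypothetical stand-in for should_conduct_research.

    Assumes a simple count of scratchpad entries as the stopping rule.
    """
    findings = state.get("implementation_research_scratchpad", [])
    if len(findings) >= min_findings:
        return "extract_implementation_plan"  # hypothetical node name
    return "come_up_with_research_next_step"


print(route_after_check({"is_valid_research_step": False}))
print(route_after_research({"implementation_research_scratchpad": ["a", "b", "c"]}))
```

Keeping the routing logic in pure functions like these makes the exit condition easy to unit test without invoking the model.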

Key Files and Prompts

The hypothesis-driven exploration mechanism spans these critical files in the langtalks/swe-agent repository:

| File | Purpose |
| --- | --- |
| `agent/architect/state.py` | Defines `SoftwareArchitectState` with `research_next_step`, `is_valid_research_step`, and `implementation_research_scratchpad` fields |
| `agent/architect/graph.py` | Implements the three-stage loop: `come_up_with_research_next_step` (L47-58), `check_research_step` (L64-78), and `conduct_research` (L80-86) |
| `agent/architect/prompts/plan_next_step_prompt.md` | Prompt template that asks Claude-Sonnet to generate the next hypothesis and reasoning |
| `agent/architect/prompts/check_research_already_explored.md` | Prompt template that validates whether the current `research_next_step` has already been investigated |
| `agent/architect/prompts/conduct_research_plan_prompt.md` | Prompt template that translates the hypothesis into specific search or codemap tool calls |

Summary

The research_next_step field in agent/architect/state.py serves as the single source of truth for the architect agent's current hypothesis.
Three-stage loop: come_up_with_research_next_step generates hypotheses using plan_next_step_prompt.md, check_research_step validates against prior research using check_research_already_explored.md, and conduct_research executes tools via conduct_research_plan_prompt.md.
Deduplication: The is_valid_research_step boolean prevents redundant exploration by checking the implementation_research_scratchpad history.
Tool integration: Valid hypotheses trigger the conduct_research_runnable, which binds search and codemap tools to gather implementation details.
Loop termination: The cycle repeats until should_conduct_research determines sufficient knowledge exists to extract an ImplementationPlan.

Frequently Asked Questions

What is the purpose of `research_next_step` in the architect agent?

The research_next_step field stores the current hypothesis that drives the architect agent's exploration cycle. It acts as a bridge between high-level reasoning and concrete tool execution, ensuring each research iteration has a specific, testable question derived from the LLM's analysis of what implementation details are still needed.

How does the agent avoid duplicate research?

The agent uses the check_research_step function (defined at lines 64-78 of agent/architect/graph.py) to validate each hypothesis before execution. This function invokes the check_research_already_explored.md prompt, which compares the current research_next_step against the implementation_research_scratchpad history. If the hypothesis has already been explored, is_valid_research_step is set to False and the agent generates a new hypothesis instead.

What tools does the architect agent use during hypothesis execution?

When a hypothesis passes validation, the conduct_research function (lines 80-86 of agent/architect/graph.py) invokes the conduct_research_runnable. This runnable binds two primary tools: search (for text-based codebase queries) and codemap (for structural code analysis). The conduct_research_plan_prompt.md prompt translates the natural language hypothesis into specific tool calls using these bound tools.

How can I manually test the hypothesis generation?

You can invoke the hypothesis generation step independently by importing come_up_with_research_next_step from agent/architect/graph.py and passing an empty SoftwareArchitectState instance. This executes the plan_next_step_prompt.md against Claude-Sonnet and returns a dictionary containing the generated hypothesis in the research_next_step key and the reasoning appended to implementation_research_scratchpad. This allows you to inspect the quality of hypotheses before running the full validation and execution pipeline.
