How SWE-Agent Handles Atomic Tasks vs File-Level Implementation Tasks: A Deep Dive into the Developer Graph
SWE-Agent breaks down high-level development requests into granular atomic tasks tracked by index, and uses distinct file-level implementation prompts depending on whether a new file must be created or an existing one modified.
The SWE-Agent repository (langtalks/swe-agent) employs a hierarchical task decomposition strategy where high-level development goals are split into manageable units. Understanding how the agent handles atomic tasks versus file-level implementation tasks reveals the core mechanics of its code generation pipeline and state management system.
Understanding the Task Hierarchy in SWE-Agent
The agent operates on two distinct levels of task granularity. High-level tasks represent major development objectives (such as implementing a feature or fixing a bug), while atomic tasks represent the smallest units of work—typically a single edit operation within a specific file. This separation allows the agent to plan broadly while executing precisely.
The state management system tracks these tasks through the SoftwareDeveloperState class, which maintains both the current high-level task index and the current atomic task index. This dual-index approach enables the agent to navigate complex multi-file changes methodically.
How Atomic Tasks Work
Atomic tasks form the execution layer of the SWE-Agent developer graph. Each atomic task corresponds to a concrete code change, and the agent processes them sequentially until a high-level task is complete.
State Management for Atomic Tasks
The agent tracks atomic task progression through two critical state variables in agent/developer/state.py:
- atomic_tasks: A list of AtomicTask objects (defined in agent/common/entities.py) representing the individual edit operations required for the current high-level task
- current_atomic_task_idx: An integer pointer tracking which atomic task is currently being executed
This indexing system allows the agent to pause, resume, and iterate through complex implementation steps without losing context.
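To make the two-index bookkeeping concrete, here is a minimal sketch of what such a state could look like. The names atomic_task, additional_context, current_task_idx, current_atomic_task_idx and atomic_implementation_research mirror names used in this article. The Task class, the tasks field, the two properties and the use of plain dataclasses are assumptions made for illustration, and the real definitions in agent/developer/state.py and agent/common/entities.py may differ.

```python
from dataclasses import dataclass, field


@dataclass
class AtomicTask:
    # One edit operation; field names follow the excerpts in this article.
    atomic_task: str
    additional_context: str = ""


@dataclass
class Task:
    # Hypothetical container for one high-level task and its atomic steps.
    description: str
    atomic_tasks: list[AtomicTask] = field(default_factory=list)


@dataclass
class SoftwareDeveloperState:
    # Illustrative stand-in, not the repository's actual class.
    tasks: list[Task] = field(default_factory=list)
    current_task_idx: int = 0
    current_atomic_task_idx: int = 0
    atomic_implementation_research: str = ""

    @property
    def current_task(self) -> Task:
        # The high-level task the outer index points at.
        return self.tasks[self.current_task_idx]

    @property
    def current_atomic_task(self) -> AtomicTask:
        # The atomic task the inner index points at.
        return self.current_task.atomic_tasks[self.current_atomic_task_idx]


state = SoftwareDeveloperState(
    tasks=[
        Task("Add login endpoint", [AtomicTask("Create route"), AtomicTask("Add tests")]),
        Task("Update docs", [AtomicTask("Edit README")]),
    ]
)
print(state.current_task.description, "->", state.current_atomic_task.atomic_task)
```

Keeping both indices on the state object means any node can work out where it is in the plan from the state alone.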
Advancing Through Atomic Tasks
The proceed_to_next_atomic_task function in agent/developer/graph.py handles the transition logic between atomic tasks:
```python
def proceed_to_next_atomic_task(state: SoftwareDeveloperState):
    current_atomic_task_idx = state.current_atomic_task_idx
    atomic_tasks = state.current_task.atomic_tasks
    # If we've finished the current atomic task list, move to the next main task
    if current_atomic_task_idx >= len(atomic_tasks) - 1:
        state.current_task_idx += 1
        state.current_atomic_task_idx = 0
    else:
        # otherwise go to the next atomic task in the same main task
        state.current_atomic_task_idx = current_atomic_task_idx + 1
    return state
```
This function resets the atomic task index when transitioning between high-level tasks, ensuring clean state isolation. When the agent exhausts all atomic tasks for the current main task, it increments current_task_idx and begins processing the next development objective.
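The index transitions are easiest to see on a small example. The sketch below reuses the logic of the excerpt above on a toy state. ToyState and the placeholder task lists are hypothetical stand-ins, not the repository's classes.

```python
from types import SimpleNamespace


def proceed_to_next_atomic_task(state):
    # Same logic as the excerpt above, applied to a toy state object.
    atomic_tasks = state.current_task.atomic_tasks
    if state.current_atomic_task_idx >= len(atomic_tasks) - 1:
        state.current_task_idx += 1
        state.current_atomic_task_idx = 0
    else:
        state.current_atomic_task_idx += 1
    return state


class ToyState:
    """Hypothetical stand-in for SoftwareDeveloperState."""

    def __init__(self, tasks):
        self.tasks = tasks
        self.current_task_idx = 0
        self.current_atomic_task_idx = 0

    @property
    def current_task(self):
        return self.tasks[self.current_task_idx]


# Two high-level tasks: the first has three atomic tasks, the second has one.
state = ToyState([
    SimpleNamespace(atomic_tasks=["a0", "a1", "a2"]),
    SimpleNamespace(atomic_tasks=["b0"]),
])

trace = [(state.current_task_idx, state.current_atomic_task_idx)]
for _ in range(3):
    proceed_to_next_atomic_task(state)
    trace.append((state.current_task_idx, state.current_atomic_task_idx))

print(trace)  # [(0, 0), (0, 1), (0, 2), (1, 0)]
```

In this sketch, one more call would leave current_task_idx pointing past the end of the task list, so a caller would need a separate completion check before reading current_task again. How the repository handles that case is not covered here.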
File-Level Implementation Strategies
While atomic tasks handle the granularity of changes, file-level implementation tasks determine the strategy for applying those changes. The developer graph branches based on whether the current objective requires creating new files or modifying existing ones.
The New File Creation Path
When the agent determines that a new file is required, it follows the "Create new file" path. This branch utilizes the implement_new_file.md prompt template located in agent/developer/prompts/implement_new_file.md. The graph node create_new_file orchestrates this process, generating complete file contents rather than incremental diffs.
The Diff Modification Path
For changes to existing files, the agent uses the "Create diff" strategy. This path employs the implement_diff.md prompt template from agent/developer/prompts/implement_diff.md. The create_diff_for_task node handles the generation of patch-style modifications, allowing the agent to edit specific lines within existing codebases.
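For intuition about what a patch-style modification looks like compared with a complete file, Python's standard difflib module can render a unified diff between two versions of a file. This only illustrates the general idea. The article does not show the diff format the agent actually emits, and the file name and contents below are made up.

```python
import difflib

# Two versions of a small, made-up file.
original = "def greet(name):\n    return 'Hello ' + name\n"
modified = "def greet(name):\n    return f'Hello, {name}!'\n"

# A unified diff records only the changed lines plus a little context,
# whereas the new-file path would emit the whole of `modified`.
diff_lines = list(
    difflib.unified_diff(
        original.splitlines(keepends=True),
        modified.splitlines(keepends=True),
        fromfile="a/greet.py",
        tofile="b/greet.py",
    )
)
patch = "".join(diff_lines)
print(patch)
```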
Selecting the Implementation Strategy
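The workflow graph determines which path to take based on the current task requirements. In simplified form, the branching looks like this:
```python
# Simplified illustration of the branch; needs_new_file stands for whatever
# property of the current task signals that a new file is required.
if current_task.needs_new_file:
    workflow.add_edge("prepare_for_implementation", "create_new_file")
else:
    workflow.add_edge("prepare_for_implementation", "create_diff_for_task")
```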
Both paths ultimately produce atomic tasks, but the prompt engineering differs significantly. New file creation prompts focus on comprehensive structure and boilerplate, while diff prompts emphasize precise context and minimal invasive changes.
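In LangGraph, a runtime choice between two nodes is typically expressed as a routing function registered with StateGraph.add_conditional_edges. The sketch below shows only the routing function, in plain Python, so it runs without dependencies. The needs_new_file attribute comes from the snippet above. The function name route_implementation and the toy state objects are hypothetical, and the repository may wire this differently.

```python
from types import SimpleNamespace


def route_implementation(state) -> str:
    """Return the name of the node that should handle the current task."""
    # needs_new_file is assumed to be a boolean on the current task.
    if state.current_task.needs_new_file:
        return "create_new_file"
    return "create_diff_for_task"


new_file_state = SimpleNamespace(current_task=SimpleNamespace(needs_new_file=True))
edit_state = SimpleNamespace(current_task=SimpleNamespace(needs_new_file=False))

print(route_implementation(new_file_state))  # create_new_file
print(route_implementation(edit_state))      # create_diff_for_task

# With LangGraph installed, the function could be attached roughly like this:
# workflow.add_conditional_edges("prepare_for_implementation", route_implementation)
```

Because the router returns node names as strings, the decision stays a pure function of the state and can be tested without building a graph.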
The Implementation Workflow
The complete workflow for handling atomic versus file-level tasks follows a deterministic pipeline:
- Planning Phase: The architect component (in agent/architect/graph.py) generates high-level tasks and decomposes them into atomic units
- Decomposition: Each high-level task is converted into a list of AtomicTask objects with specific file targets and change descriptions
- Research Phase: The optional research_tool_node gathers context, storing results in state.atomic_implementation_research
- Implementation Phase: The get_clear_implementation_plan_for_atomic_task function constructs the execution context:
  ```python
  def get_clear_implementation_plan_for_atomic_task(state: SoftwareDeveloperState):
      current_atomic_task = state.current_task.atomic_tasks[state.current_atomic_task_idx]
      return {
          "development_task": current_atomic_task.atomic_task,
          "additional_context": current_atomic_task.additional_context,
          "atomic_implementation_research": state.atomic_implementation_research,
      }
  ```
- Execution Phase: The LLM generates either a new file or a diff based on the selected strategy
- Progression Phase: proceed_to_next_atomic_task advances the state machine, either moving to the next atomic task or the next high-level task
This architecture ensures that file-level decisions (new vs existing) happen at the planning level, while atomic tasks handle the mechanical execution of changes.
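As a rough illustration of how the plan dictionary could feed a prompt, the sketch below builds the context for one atomic task and fills a template with str.format. The template text, the toy state and the rendering step are assumptions. The real prompt files, and the way the repository renders them, are not reproduced here.

```python
from types import SimpleNamespace


def get_clear_implementation_plan_for_atomic_task(state):
    # Mirrors the excerpt above: pick the current atomic task and expose its fields.
    current = state.current_task.atomic_tasks[state.current_atomic_task_idx]
    return {
        "development_task": current.atomic_task,
        "additional_context": current.additional_context,
        "atomic_implementation_research": state.atomic_implementation_research,
    }


# Hypothetical template; the content of implement_diff.md is not shown in this article.
TEMPLATE = (
    "Task: {development_task}\n"
    "Context: {additional_context}\n"
    "Research: {atomic_implementation_research}\n"
)

# Toy state with a single made-up atomic task.
state = SimpleNamespace(
    current_task=SimpleNamespace(
        atomic_tasks=[
            SimpleNamespace(
                atomic_task="Add a /health route to app.py",
                additional_context="The app uses a single router module.",
            )
        ]
    ),
    current_atomic_task_idx=0,
    atomic_implementation_research="No existing health check was found.",
)

plan = get_clear_implementation_plan_for_atomic_task(state)
prompt = TEMPLATE.format(**plan)
print(prompt)
```

The dictionary keys double as template placeholders, so the same plan could be rendered into either the new-file or the diff prompt.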
Key Source Files and Architecture
| File | Purpose |
|---|---|
| agent/developer/state.py | Defines SoftwareDeveloperState with current_atomic_task_idx and atomic_tasks |
| agent/developer/graph.py | Contains proceed_to_next_atomic_task and get_clear_implementation_plan_for_atomic_task logic |
| agent/developer/prompts/implement_new_file.md | Prompt template for file creation tasks |
| agent/developer/prompts/implement_diff.md | Prompt template for modification tasks |
| agent/common/entities.py | Defines the AtomicTask dataclass structure |
| agent/architect/graph.py | Generates high-level tasks and initial decomposition |
Summary
- Atomic tasks represent single-file edit operations tracked via current_atomic_task_idx in agent/developer/state.py
- The proceed_to_next_atomic_task function manages transitions between atomic tasks and high-level tasks in agent/developer/graph.py
- File-level implementation bifurcates into new file creation (implement_new_file.md) and diff-based modification (implement_diff.md)
- The get_clear_implementation_plan_for_atomic_task function assembles context for each atomic execution
- State isolation ensures that completing one high-level task automatically resets the atomic task index for the next objective
Frequently Asked Questions
What defines an atomic task in SWE-Agent?
An atomic task is the smallest unit of work in the SWE-Agent system, typically representing a single edit operation within one file. Defined in agent/common/entities.py, each AtomicTask contains the specific change description, target file information, and additional context needed for implementation.
How does the agent decide between creating a new file and modifying an existing one?
The decision occurs at the workflow graph level in agent/developer/graph.py. The system checks the current task's properties (such as needs_new_file). If true, the graph routes to the create_new_file node using the implement_new_file.md prompt; otherwise, it routes to create_diff_for_task using implement_diff.md.
What happens when all atomic tasks for a main task are completed?
When current_atomic_task_idx reaches the last position in the atomic task list (the check is current_atomic_task_idx >= len(atomic_tasks) - 1), the proceed_to_next_atomic_task function increments current_task_idx and resets current_atomic_task_idx to zero. This transition moves the agent to the next high-level development task in the queue.
Where is the atomic task state tracked in the codebase?
The atomic task state is maintained in the SoftwareDeveloperState class within agent/developer/state.py. This includes the atomic_tasks list and the current_atomic_task_idx integer, which together form the state machine that drives the developer graph's execution loop.