How the Codemap Tool Analyzes Code Structure and Dependencies in SWE-Agent

The codemap tool performs static analysis using Tree-sitter to generate line-numbered maps of classes, functions, and signatures, enabling SWE-Agent's AI graphs to navigate codebases without executing the source.

The codemap module in the langtalks/swe-agent repository provides a suite of LangChain tools that transform raw source files into concise, navigable structural representations. Unlike dynamic analysis, the tool never executes the code it inspects: it extracts definitions purely by parsing each file into a syntax tree, so it works for any language with a supported Tree-sitter grammar. The resulting maps preserve original line numbers, allowing AI agents to reference exact locations when planning code modifications.

Core Static Analysis Pipeline

The codemap tool implements a six-step workflow defined in /agent/tools/codemap.py to convert source bytes into a structured, line-numbered outline.

Language Detection and Parser Initialization

The process begins by mapping file extensions to Tree-sitter language identifiers. The code extracts the suffix via suffix = file_path.split(".")[-1] and resolves it through an internal lang_map dictionary to determine the target language.
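The suffix lookup can be sketched as follows. This is a minimal illustration, not the repository's code: the contents of LANG_MAP and the helper name detect_language are assumptions, and the real lang_map may list different extensions. Note that split(".")[-1] returns the whole path when a file has no extension, so the sketch treats any unknown suffix as unsupported.

```python
from typing import Optional

# Hypothetical extension table; the real lang_map in codemap.py may differ.
LANG_MAP = {
    "py": "python",
    "js": "javascript",
    "ts": "typescript",
}


def detect_language(file_path: str) -> Optional[str]:
    """Return a Tree-sitter language name for file_path, or None if unknown."""
    # Same suffix extraction the article quotes from codemap.py.
    suffix = file_path.split(".")[-1]
    # A path without a dot yields the whole path here, which simply misses the table.
    return LANG_MAP.get(suffix.lower())


print(detect_language("agent/tools/codemap.py"))  # python
print(detect_language("Makefile"))                # None
```

Returning None for an unmapped suffix lets the caller report an unsupported file instead of failing inside parser setup.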

Once identified, the tool initializes the appropriate grammar using tree_sitter_languages:

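from tree_sitter_languages import get_language, get_parser

language = get_language(lang)  # grammar object, used later to compile the query
parser = get_parser(lang)      # parser bound to the same grammar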

This initialization occurs at lines 28-31 in /agent/tools/codemap.py, loading compiled grammars for Python, JavaScript, TypeScript, and other supported languages.

AST Generation and Tree-Sitter Queries

After establishing the parser, the tool reads the source file as bytes and produces a complete AST:

tree = parser.parse(code)

Structural extraction is driven by a Tree-sitter query written as an S-expression (defined at lines 37-50) that captures three kinds of definitions:

  • Class definitions (class_definition captures)
  • Method definitions (method_definition captures)
  • Function definitions (function_definition captures)

The query string targets nodes containing identifiers, parameter lists, and bodies, which the tool compiles via language.query(query_str) for efficient execution against the parsed tree.

Structured Map Generation and Line Tracking

The extraction loop iterates over query.captures(tree.root_node) starting at line 61, processing each node to build the output. For every captured definition, the tool:

  1. Computes the source line number using node.start_point[0] + 1
  2. Inserts ellipses (...) when gaps exist between definitions to maintain visual context
  3. Formats entries as line| def function_name(...) or line| class ClassName:

The final output joins all formatted strings with return "\n".join(output_lines) (lines 100-101), producing a human-readable map that mirrors the original file's line numbering.
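The three formatting steps above can be illustrated with a small stand-in. The sketch below deliberately swaps Tree-sitter for Python's built-in ast module so that it runs without extra packages; it therefore handles Python sources only, and the name build_code_map and the sample input are assumptions rather than code from the repository.

```python
import ast
from typing import List


def build_code_map(source: str) -> str:
    """Return a line-numbered outline of classes and functions in Python source."""
    tree = ast.parse(source)
    # Collect every class and function node, then order them by position.
    nodes = [
        n for n in ast.walk(tree)
        if isinstance(n, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef))
    ]
    nodes.sort(key=lambda n: n.lineno)

    source_lines = source.splitlines()
    output_lines: List[str] = []
    previous_line = 0
    for node in nodes:
        # ast line numbers are already 1-based; Tree-sitter rows are 0-based,
        # which is why the article's version adds 1 to start_point[0].
        line_no = node.lineno
        if line_no > previous_line + 1:
            output_lines.append("...")  # mark skipped lines between definitions
        # Reuse the original header line so indentation and signature survive.
        output_lines.append(f"{line_no}| {source_lines[line_no - 1]}")
        previous_line = line_no
    return "\n".join(output_lines)


SAMPLE = '''import os

class Greeter:
    def greet(self, name):
        return f"hi {name}"

def main():
    Greeter().greet("x")
'''

print(build_code_map(SAMPLE))
# ...
# 3| class Greeter:
# 4|     def greet(self, name):
# ...
# 7| def main():
```

Only the first line of each definition is kept, so a signature that spans several lines would appear truncated in this simplified version.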

Dependency Analysis Strategy

While the codemap tool excels at local structure extraction, it explicitly does not resolve import graphs or cross-file dependencies internally. Instead, the SWE-Agent architecture adopts a two-step dependency discovery approach:

The Developer and Architect graphs (defined in /agent/developer/graph.py and /agent/architect/graph.py) first invoke the codemap to understand local file structure, then utilize the companion search tool to locate imported modules. Once identified, the agents request additional codemap runs on those dependency files, effectively building a dependency view through orchestration rather than static import resolution.

This separation of concerns keeps the codemap tool lightweight and language-agnostic, while allowing higher-level agents to control the scope of dependency exploration based on task requirements.
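The orchestration pattern might look like the loop below. It is a hedged sketch only: explore_dependencies, run_codemap, find_imported_files, and max_files are hypothetical names, and the two callables stand in for the codemap and search tools so the example runs on its own. How the real graphs sequence these calls is decided by the agent at run time.

```python
from collections import deque
from typing import Callable, Dict, List


def explore_dependencies(
    entry_file: str,
    run_codemap: Callable[[str], str],
    find_imported_files: Callable[[str], List[str]],
    max_files: int = 5,
) -> Dict[str, str]:
    """Map entry_file, then the files it depends on, breadth first."""
    maps: Dict[str, str] = {}
    queue = deque([entry_file])
    while queue and len(maps) < max_files:
        path = queue.popleft()
        if path in maps:
            continue  # already mapped; also protects against import cycles
        maps[path] = run_codemap(path)            # step 1: local structure
        queue.extend(find_imported_files(path))   # step 2: locate dependencies
    return maps


# Stand-in data so the sketch runs without the real tools.
FAKE_IMPORTS = {
    "a.py": ["b.py", "c.py"],
    "b.py": ["a.py"],  # cycle back to the entry file
    "c.py": [],
}

result = explore_dependencies(
    "a.py",
    run_codemap=lambda p: f"map of {p}",
    find_imported_files=lambda p: FAKE_IMPORTS.get(p, []),
)
print(list(result))  # ['a.py', 'b.py', 'c.py']
```

The max_files cap reflects the point made above: the caller, not the codemap tool, decides how far dependency exploration should go.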

Integration with SWE-Agent Graphs

The codemap tools serve as primary introspection mechanisms for two critical agent workflows:

  • Developer graph (/agent/developer/graph.py): Imports codemap_tools and calls get_code_definitions or get_code_definitions_multi to analyze existing implementations before generating code changes
  • Architect graph (/agent/architect/graph.py): Leverages the same tools for high-level planning and system design decisions

The tools are defined with LangChain's @tool(parse_docstring=True) decorator, so both graphs can register them directly as callable tools in their agent reasoning chains.

Practical Code Examples

Single File Structure Mapping

Retrieve a concise map of definitions from one file:

from agent.tools.codemap import get_code_definitions

file_path = "agent/tools/codemap.py"
definition_map = get_code_definitions.invoke({"file_path": file_path})
print(definition_map)

Multi-File Batch Analysis

Map several files in a single call to compare structure across modules:

from agent.tools.codemap import get_code_definitions_multi

files = [
    "agent/tools/codemap.py",
    "agent/tools/search.py",
    "agent/tools/write.py"
]
multi_map = get_code_definitions_multi.invoke({"file_paths": files})
print(multi_map)

Function Implementation Retrieval

Extract the complete source body of a specific function:

from agent.tools.codemap import get_function_implementation

impl = get_function_implementation.invoke({
    "file_path": "agent/tools/codemap.py",
    "function_name": "get_code_definitions"
})
print(impl)
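To make the contrast with the definition map concrete, here is a rough idea of what retrieving a full function body involves. As a stand-in for Tree-sitter the sketch uses Python's ast module, so it works on Python sources only; find_function_source and the sample text are illustrative assumptions and not the repository's implementation.

```python
import ast
from typing import Optional


def find_function_source(source: str, function_name: str) -> Optional[str]:
    """Return the full text of a function or method named function_name, if any."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        is_function = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        if is_function and node.name == function_name:
            # get_source_segment slices the original text by the node's span.
            return ast.get_source_segment(source, node)
    return None


SAMPLE = '''class Greeter:
    def greet(self, name):
        """Say hello."""
        return f"hi {name}"
'''

print(find_function_source(SAMPLE, "greet"))
print(find_function_source(SAMPLE, "missing"))  # None
```

Where the definition map would show only the header line of greet, this returns the header, docstring, and body together.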

Summary

  • Tree-sitter foundation: The codemap tool uses tree_sitter_languages for language-agnostic AST parsing across Python, JavaScript, TypeScript, and other languages.
  • S-expression queries: Structural extraction relies on compiled Tree-sitter queries targeting class, method, and function definitions with their parameter lists.
  • Line-accurate output: Every entry preserves the original source line number (node.start_point[0] + 1), enabling precise code referencing.
  • External dependency resolution: Import graph analysis occurs at the agent level through composition with search tools, not within the codemap module itself.
  • Agent integration: The tools integrate directly into the Developer and Architect graphs via LangChain decorators, supporting both single-file and batch analysis operations.

Frequently Asked Questions

How does the codemap tool determine which programming language to parse?

The tool extracts the file extension using file_path.split(".")[-1] and maps it to a Tree-sitter language identifier through an internal dictionary. This suffix-to-language mapping enables the tool to initialize the correct parser from tree_sitter_languages without requiring manual language specification.

Why doesn't the codemap tool resolve import dependencies automatically?

The tool focuses exclusively on local structural extraction to maintain language agnosticism and simplicity. Dependency resolution requires import-graph analysis that varies significantly between languages (Python's import vs. JavaScript's require vs. TypeScript's import type). Instead, SWE-Agent's higher-level graphs combine codemap output with the search tool to discover dependencies and subsequently request additional codemap analysis on those files.

What is the difference between get_code_definitions and get_function_implementation?

get_code_definitions returns a line-numbered map of all classes, methods, and functions in a file, showing signatures and line locations. get_function_implementation returns the complete source code body of a specific named function or method, including its internal logic and docstrings, making it suitable for understanding implementation details rather than just structure.

Which files in the SWE-Agent repository use the codemap tool?

The primary consumers are /agent/developer/graph.py and /agent/architect/graph.py, which import codemap_tools to introspect codebases before generating modifications. The core implementation resides in /agent/tools/codemap.py, which exports get_code_definitions, get_code_definitions_multi, get_function_implementation, and get_raw_file_content as LangChain-compatible tools.
