# How the Codemap Tool Analyzes Code Structure and Dependencies in SWE-Agent

> Discover how the Codemap tool uses Tree-sitter for static analysis to map code structure and dependencies. SWE-Agent navigates codebases efficiently without execution.

- Repository: [LangTalks/swe-agent](https://github.com/langtalks/swe-agent)
- Tags: how-to-guide
- Published: 2026-03-05

---

**The codemap tool performs static analysis using Tree-sitter to generate line-numbered maps of classes, functions, and signatures, enabling SWE-Agent's AI graphs to navigate codebases without executing the source.**

The **codemap** module in the `langtalks/swe-agent` repository provides a suite of LangChain tools that transform raw source files into concise, navigable structural representations. Unlike dynamic analysis approaches, this tool relies entirely on **abstract syntax tree (AST)** parsing to extract definitions, making it language-agnostic and execution-safe. The resulting maps preserve original line numbers, allowing AI agents to reference exact locations when planning code modifications.

## Core Static Analysis Pipeline

The codemap tool implements a six-step workflow defined in [`/agent/tools/codemap.py`](https://github.com/langtalks/swe-agent/blob/main//agent/tools/codemap.py) to convert source bytes into structured intelligence.

### Language Detection and Parser Initialization

The process begins by mapping file extensions to Tree-sitter language identifiers. The code extracts the suffix via `suffix = file_path.split(".")[-1]` and resolves it through an internal `lang_map` dictionary to determine the target language.
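As a minimal sketch of that lookup, the snippet below resolves a suffix through a small dictionary. The entries in `LANG_MAP` and the helper name `detect_language` are illustrative assumptions, not the repository's actual table. The sketch also returns `None` for file names without a dot, a case where a bare `split(".")[-1]` would hand back the whole path.

```python
import os
from typing import Optional

# Hypothetical suffix-to-language table; the real `lang_map` in codemap.py may differ.
LANG_MAP = {
    "py": "python",
    "js": "javascript",
    "ts": "typescript",
}


def detect_language(file_path: str) -> Optional[str]:
    """Return a Tree-sitter language name for file_path, or None if unknown."""
    if "." not in os.path.basename(file_path):
        # No extension: split(".")[-1] would not yield a real suffix here.
        return None
    suffix = file_path.split(".")[-1]
    return LANG_MAP.get(suffix)


print(detect_language("agent/tools/codemap.py"))  # python
print(detect_language("Makefile"))                # None
```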

Once identified, the tool initializes the appropriate grammar using `tree_sitter_languages`:

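```python
from tree_sitter_languages import get_language, get_parser

# `lang` is the identifier resolved from `lang_map`, e.g. "python"
language = get_language(lang)  # grammar object, later used to compile queries
parser = get_parser(lang)      # parser preconfigured for that grammar
```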

This initialization occurs at lines 28-31 in [`/agent/tools/codemap.py`](https://github.com/langtalks/swe-agent/blob/main//agent/tools/codemap.py), loading compiled grammars for Python, JavaScript, TypeScript, and other supported languages.

### AST Generation and Tree-Sitter Queries

After establishing the parser, the tool reads the source file as bytes and produces a complete AST:

```python
# `code` holds the raw bytes of the source file
tree = parser.parse(code)
```

The core intelligence emerges from a compiled **S-expression query** (defined at lines 37-50) that captures three structural elements:

- **Class definitions** (`class_definition` captures)
- **Method definitions** (`method_definition` captures)
- **Function definitions** (`function_definition` captures)

The query string targets nodes containing identifiers, parameter lists, and bodies, which the tool compiles via `language.query(query_str)` for efficient execution against the parsed tree.
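For illustration, a query of this kind for Tree-sitter's Python grammar might look like the string below. The capture labels (`@class.name`, `@function.name`, and so on) and the helper `capture_names` are assumptions chosen for this sketch; the actual query in `codemap.py` may be shaped differently and, per the list above, also captures method definitions. The snippet only inspects the query text, so it runs without Tree-sitter installed.

```python
import re

# Sketch of an S-expression query for the Tree-sitter Python grammar.
# The labels after "@" are hypothetical, not the repository's capture names.
QUERY_STR = """
(class_definition
  name: (identifier) @class.name) @class.def

(function_definition
  name: (identifier) @function.name
  parameters: (parameters) @function.params) @function.def
"""


def capture_names(query_str: str) -> list[str]:
    """List the distinct capture labels in a query, in order of first appearance."""
    seen: list[str] = []
    for name in re.findall(r"@([\w.]+)", query_str):
        if name not in seen:
            seen.append(name)
    return seen


print(capture_names(QUERY_STR))
# ['class.name', 'class.def', 'function.name', 'function.params', 'function.def']
```

With Tree-sitter available, a string like this is what would be handed to `language.query(query_str)`.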

### Structured Map Generation and Line Tracking

The extraction loop iterates over `query.captures(tree.root_node)` starting at line 61, processing each node to build the output. For every captured definition, the tool:

1. Computes the source line number using `node.start_point[0] + 1`
2. Inserts ellipses (`...`) when gaps exist between definitions to maintain visual context
3. Formats entries as `line| def function_name(...)` or `line| class ClassName:`

The final output joins all formatted strings with `return "\n".join(output_lines)` (lines 100-101), producing a human-readable map that mirrors the original file's line numbering.
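The formatting steps above can be sketched independently of Tree-sitter. The function below takes hypothetical `(zero_based_row, signature)` pairs, standing in for `node.start_point[0]` and the first line of each captured definition, and renders them with gap markers. The name `format_definition_map` and the exact gap rule are assumptions; the repository's spacing and ellipsis placement may differ.

```python
def format_definition_map(entries: list[tuple[int, str]]) -> str:
    """Render (zero_based_row, signature) pairs as a line-numbered map.

    An ellipsis line is emitted wherever source lines are skipped between
    two consecutive entries (or before the first one).
    """
    output_lines: list[str] = []
    previous_line = 0  # last 1-based line emitted; 0 means "nothing yet"
    for row, signature in sorted(entries):
        line_number = row + 1  # Tree-sitter rows are zero-based
        if line_number > previous_line + 1:
            output_lines.append("...")
        output_lines.append(f"{line_number}| {signature}")
        previous_line = line_number
    return "\n".join(output_lines)


# Hypothetical captures: (node.start_point[0], first line of the definition)
entries = [(0, "class Parser:"), (1, "    def parse(self, code):"), (9, "def main():")]
print(format_definition_map(entries))
# 1| class Parser:
# 2|     def parse(self, code):
# ...
# 10| def main():
```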

## Dependency Analysis Strategy

While the codemap tool excels at **local structure extraction**, it explicitly **does not resolve import graphs** or cross-file dependencies internally. Instead, the SWE-Agent architecture adopts a **two-step dependency discovery** approach.

The **Developer** and **Architect** graphs (defined in [`/agent/developer/graph.py`](https://github.com/langtalks/swe-agent/blob/main//agent/developer/graph.py) and [`/agent/architect/graph.py`](https://github.com/langtalks/swe-agent/blob/main//agent/architect/graph.py)) first invoke the codemap to understand local file structure, then use the companion **search** tool to locate imported modules. Once those modules are identified, the agents request additional codemap runs on the dependency files, building a dependency view through orchestration rather than static import resolution.

This separation of concerns keeps the codemap tool lightweight and language-agnostic, while allowing higher-level agents to control the scope of dependency exploration based on task requirements.
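The orchestration idea can be sketched as a small breadth-first loop. This is only an illustration under stated assumptions: in SWE-Agent the graphs decide which files to follow, whereas here a fixed loop does it, and `explore_dependencies`, `get_map`, and `find_dependency_files` are hypothetical names standing in for a codemap call and a search-tool lookup.

```python
from collections import deque
from typing import Callable


def explore_dependencies(
    entry_file: str,
    get_map: Callable[[str], str],
    find_dependency_files: Callable[[str], list[str]],
    max_files: int = 10,
) -> dict[str, str]:
    """Breadth-first walk: map a file, look up its dependencies, map those too.

    `get_map` stands in for a codemap call and `find_dependency_files` for a
    search-tool lookup; both are injected so the loop itself stays tool-agnostic.
    """
    maps: dict[str, str] = {}
    queue = deque([entry_file])
    while queue and len(maps) < max_files:
        path = queue.popleft()
        if path in maps:
            continue  # already mapped; avoids cycles between modules
        maps[path] = get_map(path)
        queue.extend(find_dependency_files(path))
    return maps


# Stub data standing in for real tool output (purely illustrative).
fake_imports = {"app.py": ["utils.py", "db.py"], "utils.py": ["db.py"], "db.py": []}
result = explore_dependencies(
    "app.py",
    get_map=lambda path: f"<map of {path}>",
    find_dependency_files=lambda path: fake_imports.get(path, []),
)
print(list(result))  # ['app.py', 'utils.py', 'db.py']
```

The `max_files` cap reflects the point made above: the caller, not the mapping tool, decides how far dependency exploration should go.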

## Integration with SWE-Agent Graphs

The codemap tools serve as primary introspection mechanisms for two critical agent workflows:

- **Developer graph** ([`/agent/developer/graph.py`](https://github.com/langtalks/swe-agent/blob/main//agent/developer/graph.py)): Imports `codemap_tools` and calls `get_code_definitions` or `get_code_definitions_multi` to analyze existing implementations before generating code changes
- **Architect graph** ([`/agent/architect/graph.py`](https://github.com/langtalks/swe-agent/blob/main//agent/architect/graph.py)): Leverages the same tools for high-level planning and system design decisions

The tools are exposed through LangChain's `@tool(parse_docstring=True)` decorator pattern, so both graphs can bind them directly into their agent reasoning chains.

## Practical Code Examples

### Single File Structure Mapping

Retrieve a concise map of definitions from one file:

```python
from agent.tools.codemap import get_code_definitions

file_path = "agent/tools/codemap.py"
definition_map = get_code_definitions.invoke({"file_path": file_path})
print(definition_map)
```

### Multi-File Batch Analysis

Process several files in one call to compare the structure of related modules:

```python
from agent.tools.codemap import get_code_definitions_multi

files = [
    "agent/tools/codemap.py",
    "agent/tools/search.py",
    "agent/tools/write.py",
]
multi_map = get_code_definitions_multi.invoke({"file_paths": files})
print(multi_map)
```

### Function Implementation Retrieval

Extract the complete source body of a specific function:

```python
from agent.tools.codemap import get_function_implementation

impl = get_function_implementation.invoke({
    "file_path": "agent/tools/codemap.py",
    "function_name": "get_code_definitions",
})
print(impl)
```

## Summary

- **Tree-sitter foundation**: The codemap tool uses `tree_sitter_languages` for language-agnostic AST parsing across Python, JavaScript, TypeScript, and other languages.
- **S-expression queries**: Structural extraction relies on compiled Tree-sitter queries targeting class, method, and function definitions with their parameter lists.
- **Line-accurate output**: Every entry preserves the original source line number (`node.start_point[0] + 1`), enabling precise code referencing.
- **External dependency resolution**: Import graph analysis occurs at the agent level through composition with search tools, not within the codemap module itself.
- **Agent integration**: The tools integrate directly into the Developer and Architect graphs via LangChain decorators, supporting both single-file and batch analysis operations.

## Frequently Asked Questions

### How does the codemap tool determine which programming language to parse?

The tool extracts the file extension using `file_path.split(".")[-1]` and maps it to a Tree-sitter language identifier through an internal dictionary. This suffix-to-language mapping enables the tool to initialize the correct parser from `tree_sitter_languages` without requiring manual language specification.

### Why doesn't the codemap tool resolve import dependencies automatically?

The tool focuses exclusively on local structural extraction to maintain language agnosticism and simplicity. Dependency resolution requires import-graph analysis that varies significantly between languages (Python's `import` vs. JavaScript's `require` vs. TypeScript's `import type`). Instead, SWE-Agent's higher-level graphs combine codemap output with the search tool to discover dependencies and subsequently request additional codemap analysis on those files.

### What is the difference between `get_code_definitions` and `get_function_implementation`?

`get_code_definitions` returns a line-numbered map of all classes, methods, and functions in a file, showing signatures and line locations. `get_function_implementation` returns the complete source code body of a specific named function or method, including its internal logic and docstrings, making it suitable for understanding implementation details rather than just structure.
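To make the contrast concrete, the sketch below reproduces both views for a tiny Python snippet. It deliberately swaps in Python's built-in `ast` module for Tree-sitter, so it handles Python only and is not the repository's implementation; `signatures` and `implementation` are hypothetical helper names.

```python
import ast
from typing import Optional

SOURCE = '''\
class Greeter:
    def greet(self, name):
        """Return a greeting."""
        return f"Hello, {name}!"


def main():
    print(Greeter().greet("world"))
'''


def signatures(source: str) -> list[str]:
    """Structure only: one 'line| header' entry per class or function."""
    lines = source.splitlines()
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
            found.append((node.lineno, lines[node.lineno - 1].strip()))
    return [f"{lineno}| {header}" for lineno, header in sorted(found)]


def implementation(source: str, function_name: str) -> Optional[str]:
    """Detail: the full source text of the first function with this name."""
    for node in ast.walk(ast.parse(source)):
        is_function = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        if is_function and node.name == function_name:
            return ast.get_source_segment(source, node)
    return None


print(signatures(SOURCE))
# ['1| class Greeter:', '2| def greet(self, name):', '7| def main():']
print(implementation(SOURCE, "greet"))
```

The first call shows where definitions live; the second returns the body, docstring included, for one named function.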

### Which files in the SWE-Agent repository use the codemap tool?

The primary consumers are [`/agent/developer/graph.py`](https://github.com/langtalks/swe-agent/blob/main//agent/developer/graph.py) and [`/agent/architect/graph.py`](https://github.com/langtalks/swe-agent/blob/main//agent/architect/graph.py), which import `codemap_tools` to introspect codebases before generating modifications. The core implementation resides in [`/agent/tools/codemap.py`](https://github.com/langtalks/swe-agent/blob/main//agent/tools/codemap.py), which exports `get_code_definitions`, `get_code_definitions_multi`, `get_function_implementation`, and `get_raw_file_content` as LangChain-compatible tools.