# Understanding the AgentState Data Structure in ai-hedge-fund

> Explore the AgentState data structure in ai-hedge-fund. Learn how this central container manages LLM messages, analysis data, and workflow metadata for simulations. Understand its implementation and impact.

- Repository: [Virat Singh/ai-hedge-fund](https://github.com/virattt/ai-hedge-fund)
- Tags: internals
- Published: 2026-03-09

---

**AgentState is a TypedDict that serves as the central runtime container for LLM messages, mutable analysis data, and workflow metadata in the ai-hedge-fund simulation, implemented across [`src/graph/state.py`](https://github.com/virattt/ai-hedge-fund/blob/main/src/graph/state.py) and [`src/data/models.py`](https://github.com/virattt/ai-hedge-fund/blob/main/src/data/models.py).**

The **AgentState** data structure forms the backbone of the graph-based workflow in the [virattt/ai-hedge-fund](https://github.com/virattt/ai-hedge-fund) repository. It gives the shared state a statically typed shape as data flows between analyst agents, the risk manager, and the portfolio manager in this LLM-driven hedge fund simulation.

## Core Components of AgentState

The architecture separates runtime state management from domain-specific data models. This separation allows LangGraph to handle state transitions while Pydantic models enforce data integrity for financial calculations.

### The AgentState TypedDict (Runtime Container)

Located in [`src/graph/state.py`](https://github.com/virattt/ai-hedge-fund/blob/main/src/graph/state.py), the `AgentState` class is a `TypedDict` that defines the shape of data passed between graph nodes:

```python
class AgentState(TypedDict):
    messages: Annotated[Sequence[BaseMessage], operator.add]
    data:     Annotated[dict[str, any], merge_dicts]
    metadata: Annotated[dict[str, any], merge_dicts]
```

- **`messages`** – Accumulates LangChain `BaseMessage` objects; the `operator.add` reducer concatenates the messages each agent returns onto the existing sequence.
- **`data`** – Stores the analysis payload as a plain dictionary, combined with each update via the `merge_dicts` helper function. In practice, this holds a serialized `AgentStateData` object.
- **`metadata`** – Contains workflow-wide flags and configuration, merged with the same `merge_dicts` helper.
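To make the reducer annotations concrete, here is a minimal, standard-library-only sketch of how a framework could read the second argument of each `Annotated` hint and use it to fold a partial update into the state. `SketchState`, `apply_update`, and the shallow body of `merge_dicts` are assumptions for illustration (the repository's helper may behave differently). Strings stand in for `BaseMessage`, and the sketch uses `typing.Any` where the excerpt writes lowercase `any`.

```python
import operator
from typing import Annotated, Any, Sequence, TypedDict, get_type_hints


def merge_dicts(a: dict[str, Any], b: dict[str, Any]) -> dict[str, Any]:
    # Assumed shallow merge: keys in b win over keys in a.
    return {**a, **b}


class SketchState(TypedDict):
    messages: Annotated[Sequence[str], operator.add]
    data: Annotated[dict[str, Any], merge_dicts]
    metadata: Annotated[dict[str, Any], merge_dicts]


def apply_update(state: dict[str, Any], update: dict[str, Any]) -> dict[str, Any]:
    """Fold a partial update into state using each field's annotated reducer."""
    hints = get_type_hints(SketchState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        reducer = hints[key].__metadata__[0]  # the function given to Annotated
        merged[key] = reducer(state[key], value)
    return merged


state = {
    "messages": ["start"],
    "data": {"tickers": ["AAPL"]},
    "metadata": {"show_reasoning": False},
}
state = apply_update(
    state,
    {"messages": ["valuation done"], "data": {"start_date": "2023-01-01"}},
)
```

Fields missing from the update (here `metadata`) are left untouched, which is why a node only needs to return the keys it changed.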

### The AgentStateData Pydantic Model (Domain Payload)

Defined in [`src/data/models.py`](https://github.com/virattt/ai-hedge-fund/blob/main/src/data/models.py), `AgentStateData` is a `BaseModel` subclass that structures the actual financial data:

```python
class AgentStateData(BaseModel):
    tickers:          list[str]
    portfolio:        Portfolio
    start_date:       str
    end_date:         str
    ticker_analyses:  dict[str, TickerAnalysis]
```

- **`tickers`** – List of stock symbols under analysis (e.g., `["AAPL", "MSFT"]`).
- **`portfolio`** – Cash balance, margin requirement, and current holdings.
- **`start_date`**, **`end_date`** – Bounds of the analysis window, stored as strings (e.g., `"2023-01-01"`).
- **`ticker_analyses`** – Mapping of symbols to `TickerAnalysis` objects, each containing `analyst_signals` from individual agents like `valuation_analyst_agent` or `technical_analyst_agent`.
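The nested shape of `ticker_analyses` is easier to see with data in hand. The sketch below uses plain dictionaries in the serialized form and a hypothetical helper, `signals_for`, that collects each agent's signal for one ticker. The helper and the sample values are illustrative and not taken from the repository.

```python
from typing import Any


def signals_for(data: dict[str, Any], ticker: str) -> dict[str, str]:
    """Return {agent_id: signal} for one ticker, or {} if it has no analysis yet."""
    analysis = data.get("ticker_analyses", {}).get(ticker, {})
    return {
        agent_id: payload["signal"]
        for agent_id, payload in analysis.get("analyst_signals", {}).items()
    }


data = {
    "tickers": ["AAPL", "MSFT"],
    "ticker_analyses": {
        "AAPL": {
            "ticker": "AAPL",
            "analyst_signals": {
                "valuation_analyst_agent": {"signal": "buy", "confidence": 0.87},
                "technical_analyst_agent": {"signal": "hold", "confidence": 0.55},
            },
        }
    },
}

aapl = signals_for(data, "AAPL")  # two agents have reported
msft = signals_for(data, "MSFT")  # no analysis recorded yet
```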

### The AgentStateMetadata Pydantic Model (Workflow Configuration)

Also in [`src/data/models.py`](https://github.com/virattt/ai-hedge-fund/blob/main/src/data/models.py), this model controls execution behavior:

```python
class AgentStateMetadata(BaseModel):
    show_reasoning: bool = False
    model_config = {"extra": "allow"}
```

- **`show_reasoning`** – Boolean flag that agents check to determine whether to print intermediate LLM reasoning steps.
- **`extra: "allow"`** – Permits additional keys like `model_name` and `model_provider` for dynamic LLM configuration overrides.
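Because the extra keys are optional, code that consumes the serialized metadata would typically read them with a fallback. The following is a sketch under that assumption; `resolve_llm_config` and both default values are hypothetical placeholders, not names from the repository.

```python
from typing import Any

# Placeholder defaults for illustration only; the repository's defaults may differ.
DEFAULT_MODEL_NAME = "default-model"
DEFAULT_MODEL_PROVIDER = "default-provider"


def resolve_llm_config(metadata: dict[str, Any]) -> tuple[str, str]:
    """Read optional override keys from metadata, falling back to defaults."""
    name = metadata.get("model_name", DEFAULT_MODEL_NAME)
    provider = metadata.get("model_provider", DEFAULT_MODEL_PROVIDER)
    return name, provider


plain = resolve_llm_config({"show_reasoning": False})
overridden = resolve_llm_config(
    {"show_reasoning": True, "model_name": "my-model", "model_provider": "my-provider"}
)
```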

## How AgentState Flows Through the Workflow

Following the lifecycle of the **AgentState data structure** shows how the hedge fund simulation keeps a single consistent state while individual agents run and return partial updates.

### Initialization in src/main.py

The workflow begins with state construction:

```python
from src.graph.state import AgentState
from src.data.models import AgentStateData, AgentStateMetadata, Portfolio

portfolio = Portfolio(
    cash=100_000,
    margin_requirement=0.5,
    positions={},
    realized_gains={},
)

initial_state: AgentState = {
    "messages": [],
    "data": AgentStateData(
        tickers=["AAPL", "MSFT"],
        portfolio=portfolio,
        start_date="2023-01-01",
        end_date="2023-12-31",
        ticker_analyses={},
    ).model_dump(),
    "metadata": AgentStateMetadata(show_reasoning=True).model_dump(),
}
```

The `model_dump()` calls convert the Pydantic objects into plain dictionaries, matching the `dict` types that the `data` and `metadata` fields of `AgentState` declare.
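For readers without the repository at hand, the dumped state can be pictured as ordinary nested dictionaries. The sketch below writes out the shape the initialization would be expected to produce, assuming the field names shown earlier, and checks that it survives a JSON round trip. That check is a quick way to confirm that no model instances remain inside the state.

```python
import json

initial_state = {
    "messages": [],
    "data": {
        "tickers": ["AAPL", "MSFT"],
        "portfolio": {
            "cash": 100_000,
            "margin_requirement": 0.5,
            "positions": {},
            "realized_gains": {},
        },
        "start_date": "2023-01-01",
        "end_date": "2023-12-31",
        "ticker_analyses": {},
    },
    "metadata": {"show_reasoning": True},
}

# json.dumps raises TypeError on objects it cannot serialize, such as model instances.
round_tripped = json.loads(json.dumps(initial_state))
```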

### Agent Node Execution

Each analyst agent receives the current state and returns partial updates. In [`src/agents/valuation.py`](https://github.com/virattt/ai-hedge-fund/blob/main/src/agents/valuation.py) and similar files, functions follow this signature:

```python
from src.graph.state import AgentState

def valuation_analyst_agent(state: AgentState, agent_id: str = "valuation_analyst_agent"):
    tickers = state["data"]["tickers"]

    # Perform valuation analysis for each ticker...

    # Partial update: only the keys this agent changed (AAPL shown as an example)
    new_data = {
        "ticker_analyses": {
            "AAPL": {
                "ticker": "AAPL",
                "analyst_signals": {
                    "valuation_analyst_agent": {
                        "signal": "buy",
                        "confidence": 0.87,
                        "reasoning": "PE ratio below sector average"
                    }
                }
            }
        }
    }

    return {"data": new_data}
```

When the node returns, LangGraph applies the `merge_dicts` reducer defined in [`src/graph/state.py`](https://github.com/virattt/ai-hedge-fund/blob/main/src/graph/state.py) to combine the fragment with the existing `data`, so top-level keys the agent did not return are left unchanged.
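How much of a nested structure survives a merge depends on whether the reducer is shallow or recursive. The sketch below contrasts the two using hypothetical helpers (`shallow_merge`, `deep_merge`, `signal_update`), none of which are taken from the repository, so it is worth checking which behavior the actual `merge_dicts` has before relying on nested merging.

```python
from typing import Any


def shallow_merge(a: dict[str, Any], b: dict[str, Any]) -> dict[str, Any]:
    # Top-level keys from b replace those in a wholesale.
    return {**a, **b}


def deep_merge(a: dict[str, Any], b: dict[str, Any]) -> dict[str, Any]:
    # Recurse where both sides hold dicts; otherwise b's value wins.
    merged = dict(a)
    for key, value in b.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def signal_update(agent_id: str, signal: str) -> dict[str, Any]:
    # Build the fragment one agent would return for AAPL.
    return {"ticker_analyses": {"AAPL": {"analyst_signals": {agent_id: {"signal": signal}}}}}


existing = {"tickers": ["AAPL"], **signal_update("valuation_analyst_agent", "buy")}
update = signal_update("technical_analyst_agent", "hold")

shallow_result = shallow_merge(existing, update)
deep_result = deep_merge(existing, update)

# Which agents still have a signal recorded for AAPL after each merge?
shallow_agents = sorted(shallow_result["ticker_analyses"]["AAPL"]["analyst_signals"])
deep_agents = sorted(deep_result["ticker_analyses"]["AAPL"]["analyst_signals"])
```

With the shallow variant only the second agent's signal remains under `AAPL`, while the recursive variant keeps both. In either case the unrelated `tickers` key is preserved.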

### Metadata Access for Debugging

Agents conditionally output reasoning based on metadata flags:

```python
from src.graph.state import AgentState, show_agent_reasoning

def risk_management_agent(state: AgentState, agent_id: str = "risk_management_agent"):
    # Risk calculations that produce `risk_report` and `updated_risk_data` go here...

    # Print intermediate reasoning only when the flag is set
    if state["metadata"]["show_reasoning"]:
        show_agent_reasoning(risk_report, agent_id)
    return {"data": updated_risk_data}
```
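The same pattern can be tried without the repository installed. In this sketch, `print_reasoning` is a hypothetical stand-in for `show_agent_reasoning`, and `risk_node` with its 20% position cap is an invented calculation used only to give the node something to report.

```python
import json
from typing import Any


def print_reasoning(output: dict[str, Any], agent_id: str) -> None:
    # Hypothetical stand-in for the repository's show_agent_reasoning helper.
    print(f"{agent_id:=^40}")
    print(json.dumps(output, indent=2))


def risk_node(state: dict[str, Any], agent_id: str = "risk_management_agent") -> dict[str, Any]:
    # Illustrative calculation: cap each position at 20% of available cash.
    cash = state["data"]["portfolio"]["cash"]
    risk_report = {
        ticker: {"position_limit": cash * 0.2} for ticker in state["data"]["tickers"]
    }

    # .get() keeps the node working if the flag was never set.
    if state["metadata"].get("show_reasoning", False):
        print_reasoning(risk_report, agent_id)

    return {"data": {"risk_report": risk_report}}


quiet_state = {
    "messages": [],
    "data": {"tickers": ["AAPL"], "portfolio": {"cash": 100_000}},
    "metadata": {"show_reasoning": False},
}
result = risk_node(quiet_state)
```

Reading the flag with `.get()` is a defensive choice for the sketch; indexing directly, as in the excerpt above, is fine when the metadata is always built from `AgentStateMetadata`, which supplies a default.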

## Summary

- **AgentState** is a `TypedDict` defined in [`src/graph/state.py`](https://github.com/virattt/ai-hedge-fund/blob/main/src/graph/state.py) that serves as the runtime state container for the LangGraph workflow, separating concerns between messages, data, and metadata.
- **AgentStateData** (Pydantic model in [`src/data/models.py`](https://github.com/virattt/ai-hedge-fund/blob/main/src/data/models.py)) enforces type safety for financial data including tickers, portfolios, and analyst signals.
- **AgentStateMetadata** controls workflow behavior such as reasoning visibility and LLM provider configuration.
- The `merge_dicts` helper lets nodes return partial updates as plain dictionaries, which the graph combines with the existing state instead of each node rewriting it in full.
- State flows from initialization in [`src/main.py`](https://github.com/virattt/ai-hedge-fund/blob/main/src/main.py) through analyst agents to the final portfolio manager, with each node returning partial state updates that are automatically merged.

## Frequently Asked Questions

### How does AgentState handle concurrent updates from multiple analysts?

The `Annotated` type hints in `AgentState` specify merge strategies: `operator.add` concatenates message lists, while the custom `merge_dicts` function in [`src/graph/state.py`](https://github.com/virattt/ai-hedge-fund/blob/main/src/graph/state.py) merges dictionaries. When multiple agents return updates in the same step, the reducer is applied to each update in turn, so updates to different top-level keys do not collide. Whether two agents' entries under the same `ticker_analyses` key are combined or replaced depends on how deeply `merge_dicts` recurses, so check its implementation before relying on nested signals being merged.

### What is the difference between AgentState and AgentStateData?

**AgentState** is the runtime container (TypedDict) required by LangGraph's `StateGraph`, while **AgentStateData** is a Pydantic BaseModel that defines the actual financial schema. The `data` field inside `AgentState` holds a serialized dictionary representation of `AgentStateData`, allowing the graph engine to process state transitions while maintaining type-safe data validation.

### Can I extend AgentState to include custom fields for new analyst agents?

Yes, the `AgentStateMetadata` model uses Pydantic's `extra: "allow"` configuration, permitting additional keys for custom workflow flags. For new data fields, modify `AgentStateData` in [`src/data/models.py`](https://github.com/virattt/ai-hedge-fund/blob/main/src/data/models.py) or nest custom dictionaries within the existing `ticker_analyses` structure without breaking the core graph logic in [`src/graph/state.py`](https://github.com/virattt/ai-hedge-fund/blob/main/src/graph/state.py).

### Where is the initial AgentState created in the codebase?

The initialization occurs in [`src/main.py`](https://github.com/virattt/ai-hedge-fund/blob/main/src/main.py) within the `run_hedge_fund` function and related helpers. This module constructs the `Portfolio` object, instantiates `AgentStateData` and `AgentStateMetadata`, serializes them via `model_dump()`, and assembles the initial `AgentState` dictionary passed to the compiled workflow.