How the AI-Hedge-Fund Backtester Works: Architecture and Implementation Guide

The ai-hedge-fund backtester simulates trading strategies by iterating over historical market data, executing AI-generated decisions through a virtual portfolio engine, and calculating performance metrics including Sharpe ratio and max drawdown.

The virattt/ai-hedge-fund repository provides a modular simulation engine that tests algorithmic trading agents against historical market conditions. At its core, the ai-hedge-fund backtester orchestrates data ingestion, portfolio management, and trade execution while enforcing realistic constraints like margin requirements and cash limits. This article examines the source code architecture behind the simulation loop, from CLI argument parsing to final performance reporting.

Entry Point and CLI Configuration

The simulation begins in src/backtester.py, which serves as the command-line interface. It parses user arguments (tickers, date range, initial capital, model configuration, and analyst selections) and then instantiates the BacktestEngine.


# src/backtester.py (lines 42-66)

inputs = parse_cli_inputs(...)
backtester = BacktestEngine(
    agent=run_hedge_fund,
    tickers=inputs.tickers,
    start_date=inputs.start_date,
    end_date=inputs.end_date,
    initial_capital=inputs.initial_cash,
    model_name=inputs.model_name,
    model_provider=inputs.model_provider,
    selected_analysts=inputs.selected_analysts,
    initial_margin_requirement=inputs.margin_requirement,
)
performance_metrics = run_backtest(backtester)

The parse_cli_inputs function from src/cli/input.py handles validation and default values, ensuring the engine receives properly formatted parameters before execution begins.
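
As a rough illustration of this kind of parsing and validation, the sketch below uses the standard argparse module with the flags that appear in the command-line example in this article. The function name parse_args_sketch, the default cash value, and the validation rules are assumptions made for illustration; they are not taken from src/cli/input.py.

```python
import argparse
from datetime import date


def parse_args_sketch(argv):
    """Parse and validate backtest flags (illustrative, not the repo's parser)."""
    parser = argparse.ArgumentParser(description="Backtest CLI sketch")
    parser.add_argument("--tickers", nargs="+", required=True)
    parser.add_argument("--start-date", type=date.fromisoformat, required=True)
    parser.add_argument("--end-date", type=date.fromisoformat, required=True)
    # Assumed default for this sketch only.
    parser.add_argument("--initial-cash", type=float, default=100_000.0)
    parser.add_argument("--model-name", default=None)
    parser.add_argument("--model-provider", default=None)
    args = parser.parse_args(argv)

    # Normalize tickers and reject an inverted date range before the engine starts.
    args.tickers = [t.upper() for t in args.tickers]
    if args.start_date >= args.end_date:
        parser.error("--start-date must be earlier than --end-date")
    return args


args = parse_args_sketch(
    ["--tickers", "aapl", "msft", "--start-date", "2022-01-01", "--end-date", "2022-12-31"]
)
print(args.tickers, args.start_date, args.initial_cash)
```

Validating up front means a malformed date or an inverted range fails before any market data is requested.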

BacktestEngine Architecture

The src/backtesting/engine.py file defines the BacktestEngine class, which wires together six specialized components to manage the simulation state:

  • Portfolio: Tracks cash balances, long/short positions, margin requirements, and realized gains.
  • TradeExecutor: Translates agent decisions into portfolio updates via specific action handlers.
  • AgentController: Invokes the trading agent (run_hedge_fund) and normalizes its output to a standardized format.
  • PerformanceMetricsCalculator: Computes Sharpe ratio, Sortino ratio, maximum drawdown, and other risk-adjusted metrics.
  • BenchmarkCalculator: Retrieves SPY price series to provide relative performance comparisons.
  • OutputBuilder: Formats daily simulation results into readable tables showing actions, exposures, and portfolio values.

This separation of concerns allows individual components to be tested and extended independently without modifying the core simulation loop.

Data Pre-fetching Strategy

Before the simulation loop executes, the _prefetch_data() method loads historical data to minimize API latency during iteration. The engine retrieves one year of price history, fundamentals, insider trades, and news for each ticker, plus SPY data for benchmarking.


# src/backtesting/engine.py (lines 81-95)

for ticker in self._tickers:
    get_prices(ticker, start_date_str, self._end_date)
    get_financial_metrics(ticker, self._end_date, limit=10)
    get_insider_trades(ticker, self._end_date, start_date=self._start_date, limit=1000)
    get_company_news(ticker, self._end_date, start_date=self._start_date, limit=1000)

# Benchmark data
get_prices("SPY", self._start_date, self._end_date)

These calls leverage the API wrappers defined in src/tools/api.py, ensuring consistent data formatting across price feeds and fundamental sources.
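
Pre-fetching only reduces latency if later requests can be answered from memory. The sketch below shows that idea with functools.lru_cache and a placeholder fetch function. The names get_prices_cached and prefetch, the returned data, and the cache keyed on exact arguments are all assumptions for illustration; the caching used by src/tools/api.py may work differently.

```python
from functools import lru_cache

api_calls = 0  # counts how often the (simulated) remote API is hit


@lru_cache(maxsize=None)
def get_prices_cached(ticker, start, end):
    """Stand-in for a remote price request; returns placeholder data."""
    global api_calls
    api_calls += 1
    return ((start, 100.0), (end, 101.0))


def prefetch(tickers, start, end):
    """Warm the cache once per ticker before the simulation loop starts."""
    for ticker in tickers:
        get_prices_cached(ticker, start, end)


prefetch(["AAPL", "MSFT"], "2021-12-31", "2022-12-31")
# A repeat request with the same arguments is served from memory.
get_prices_cached("AAPL", "2021-12-31", "2022-12-31")
print(api_calls, get_prices_cached.cache_info().hits)
```

In this simplified version only requests with identical arguments hit the cache, so a real engine would need a cache that can serve sub-ranges of the pre-fetched window.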

The Main Simulation Loop

The run_backtest() method in src/backtesting/engine.py iterates over business days between the specified start and end dates, executing a seven-step workflow for each trading day.

Look-back Windows and Price Capture

For each date, the engine constructs a one-month look-back window (lookback_start) to provide context for the trading agent. The get_price_data function retrieves the most recent closing price for each ticker; if any ticker lacks data for the current day, the engine skips that date to prevent errors from incomplete market data.
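
The date iteration, look-back window, and skip-on-missing-data rule can be sketched with the standard library alone. This is a minimal illustration under stated assumptions: business days are taken to be Monday through Friday with no holiday calendar, 30 days stands in for the one-month window, and the closes lookup table replaces the real price API.

```python
from datetime import date, timedelta


def business_days(start, end):
    """Yield Monday-to-Friday dates from start to end inclusive (no holiday handling)."""
    current = start
    while current <= end:
        if current.weekday() < 5:
            yield current
        current += timedelta(days=1)


def simulate_dates(start, end, tickers, closes):
    """Return (day, lookback_start, prices) for each day on which every ticker has a close."""
    processed = []
    for day in business_days(start, end):
        lookback_start = day - timedelta(days=30)  # assumption: 30 days approximates one month
        prices = {t: closes.get((t, day)) for t in tickers}
        if any(p is None for p in prices.values()):
            continue  # incomplete snapshot: skip the whole day
        processed.append((day, lookback_start, prices))
    return processed


closes = {
    ("AAPL", date(2022, 1, 3)): 182.0,
    ("MSFT", date(2022, 1, 3)): 334.0,
    ("AAPL", date(2022, 1, 4)): 179.7,  # MSFT has no close on Jan 4 in this toy data
}
days = simulate_dates(date(2022, 1, 1), date(2022, 1, 4), ["AAPL", "MSFT"], closes)
print([d.isoformat() for d, _, _ in days])
```

Here the weekend dates are never visited, and January 4 is dropped because one of the two tickers has no price.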

Agent Decision Normalization

The AgentController.run_agent method (defined in src/backtesting/controller.py, lines 12-40) invokes the user-specified agent with the look-back dates, current portfolio snapshot, model configuration, and selected analysts. The controller enforces a strict output schema, normalizing decisions into a dictionary mapping tickers to actions and quantities: {ticker: {"action": "BUY", "quantity": 100}}.
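
To make the normalization step concrete, here is a minimal sketch of coercing loosely structured agent output into that mapping. The function name normalize_decisions, the HOLD fallback for unknown or missing actions, and the clamping of negative quantities are assumptions for illustration and may not match what src/backtesting/controller.py does.

```python
VALID_ACTIONS = {"BUY", "SELL", "SHORT", "COVER"}
FALLBACK_ACTION = "HOLD"  # assumed no-op action for this sketch


def normalize_decisions(raw, tickers):
    """Coerce raw agent output into {ticker: {"action": str, "quantity": float}}."""
    normalized = {}
    for ticker in tickers:
        entry = raw.get(ticker)
        if not isinstance(entry, dict):
            entry = {}
        action = str(entry.get("action", FALLBACK_ACTION)).upper()
        if action not in VALID_ACTIONS:
            action = FALLBACK_ACTION
        try:
            quantity = float(entry.get("quantity", 0))
        except (TypeError, ValueError):
            quantity = 0.0
        # A no-op never carries a quantity, and quantities are never negative.
        if action == FALLBACK_ACTION:
            quantity = 0.0
        normalized[ticker] = {"action": action, "quantity": max(quantity, 0.0)}
    return normalized


raw_output = {
    "AAPL": {"action": "buy", "quantity": "100"},
    "MSFT": {"action": "dance", "quantity": None},
}
decisions = normalize_decisions(raw_output, ["AAPL", "MSFT", "GOOG"])
print(decisions)
```

Every requested ticker gets an entry, so downstream code never has to handle a missing key or a malformed quantity.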

Trade Execution Flow

The TradeExecutor.execute_trade method in src/backtesting/trader.py (lines 27-35) routes normalized decisions to the appropriate portfolio methods:


# src/backtesting/trader.py (lines 27-35)

if action_enum == Action.BUY:
    return portfolio.apply_long_buy(...)
elif action_enum == Action.SELL:
    return portfolio.apply_long_sell(...)
elif action_enum == Action.SHORT:
    return portfolio.apply_short_open(...)
elif action_enum == Action.COVER:
    return portfolio.apply_short_cover(...)

This mapping ensures that buy, sell, short, and cover actions trigger the correct cash and position adjustments.

Portfolio Valuation and Exposures

After execution, src/backtesting/valuation.py recalculates the portfolio state. The calculate_portfolio_value function sums cash, long market value, and short market value (as a liability), while compute_exposures returns long exposure, short exposure, gross exposure, net exposure, and the long/short ratio.


# src/backtesting/valuation.py (lines 8-21, 24-49)

total_value = calculate_portfolio_value(portfolio, current_prices)
exposures = compute_exposures(portfolio, current_prices)
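
The arithmetic behind these two calls can be sketched as follows. The dictionary layout of the portfolio and the function names portfolio_value_sketch and exposures_sketch are assumptions for illustration; the real functions in src/backtesting/valuation.py may use different structures.

```python
def portfolio_value_sketch(portfolio, prices):
    """Cash plus long market value minus short market value (shorts are a liability)."""
    total = portfolio["cash"]
    for ticker, pos in portfolio["positions"].items():
        total += pos["long"] * prices[ticker]
        total -= pos["short"] * prices[ticker]
    return total


def exposures_sketch(portfolio, prices):
    """Long, short, gross, and net exposure plus the long/short ratio."""
    positions = portfolio["positions"]
    long_exp = sum(p["long"] * prices[t] for t, p in positions.items())
    short_exp = sum(p["short"] * prices[t] for t, p in positions.items())
    return {
        "long": long_exp,
        "short": short_exp,
        "gross": long_exp + short_exp,
        "net": long_exp - short_exp,
        # With no shorts the ratio is undefined; this sketch reports infinity.
        "long_short_ratio": long_exp / short_exp if short_exp > 0 else float("inf"),
    }


portfolio = {
    "cash": 50_000.0,
    "positions": {
        "AAPL": {"long": 100, "short": 0},
        "MSFT": {"long": 0, "short": 20},
    },
}
prices = {"AAPL": 150.0, "MSFT": 300.0}
print(portfolio_value_sketch(portfolio, prices))
print(exposures_sketch(portfolio, prices))
```

With these numbers the long leg is worth 15,000 and the short leg 6,000, so the total value is 50,000 + 15,000 - 6,000 = 59,000.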

Daily Reporting and Metrics Update

The OutputBuilder.build_day_rows method in src/backtesting/output.py formats a daily record containing the date, agent signals, executed trades, prices, exposures, and total portfolio value. Rows are prepended to the output list so the most recent day appears at the top.

Once more than three portfolio value data points exist, the PerformanceMetricsCalculator.compute_metrics method updates the running Sharpe ratio, Sortino ratio, and maximum drawdown statistics:


# src/backtesting/engine.py (lines 84-88)

if len(self._portfolio_values) > 3:
    computed = self._perf.compute_metrics(self._portfolio_values)
    self._performance_metrics.update(computed)

Portfolio Mechanics and Margin Handling

The src/backtesting/portfolio.py module implements margin accounting for short positions. When a short is opened, the portfolio credits the sale proceeds to cash and reserves a fraction of the position's value, set by margin_requirement (typically 50%), as collateral, which reduces available buying power. The module tracks average cost basis for both long and short positions and records realized gains or losses when positions are closed.

Key methods include:

  • apply_long_buy: Deducts cash and increases long position quantity.
  • apply_long_sell: Reduces position size, updates realized gains, and returns cash.
  • apply_short_open: Adds cash proceeds and locks margin collateral.
  • apply_short_cover: Closes short position, calculates P&L, and releases margin.
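
The short-side bookkeeping described above can be sketched with a toy single-ticker class. The class name ShortBook, its fields, and the exact cash and margin formulas are assumptions for illustration, not the implementation in src/backtesting/portfolio.py.

```python
from dataclasses import dataclass


@dataclass
class ShortBook:
    """Toy single-ticker short book; field names are illustrative."""
    cash: float
    margin_requirement: float = 0.5
    shares: int = 0
    cost_basis: float = 0.0  # average short entry price
    margin_used: float = 0.0
    realized_gains: float = 0.0

    def open_short(self, quantity, price):
        """Sell borrowed shares: credit proceeds, lock collateral. Returns shares shorted."""
        proceeds = quantity * price
        margin = proceeds * self.margin_requirement
        if quantity <= 0 or margin > self.cash:
            return 0  # nothing to do, or not enough cash to post collateral
        total = self.shares + quantity
        self.cost_basis = (self.cost_basis * self.shares + proceeds) / total
        self.shares = total
        self.cash += proceeds - margin
        self.margin_used += margin
        return quantity

    def cover_short(self, quantity, price):
        """Buy back shares: realize P&L and release collateral pro rata."""
        quantity = min(quantity, self.shares)
        if quantity <= 0:
            return 0
        released = self.margin_used * quantity / self.shares
        self.realized_gains += (self.cost_basis - price) * quantity
        self.cash += released - quantity * price
        self.margin_used -= released
        self.shares -= quantity
        if self.shares == 0:
            self.cost_basis = 0.0
        return quantity


book = ShortBook(cash=10_000.0)
book.open_short(100, 50.0)   # proceeds 5,000, collateral 2,500
book.cover_short(100, 40.0)  # buy back for 4,000, release 2,500
print(book.cash, book.realized_gains, book.margin_used)
```

Shorting 100 shares at 50 and covering at 40 leaves cash 1,000 higher than it started, which matches the realized gain, and all collateral is released.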

Performance Metrics and Benchmarking

The src/backtesting/metrics.py module calculates risk-adjusted returns using the portfolio value time series. It computes annualized Sharpe ratios (using risk-free rate assumptions), Sortino ratios (downside deviation only), and maximum drawdown percentages.
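
For readers who want the formulas spelled out, the sketch below computes the three metrics from a list of portfolio values using only the standard library. The function name compute_metrics_sketch, the 252-day annualization factor, the zero default risk-free rate, and the downside-deviation convention are assumptions for illustration; the calculator in src/backtesting/metrics.py may use different conventions.

```python
import math
from statistics import mean, stdev


def compute_metrics_sketch(values, annual_risk_free=0.0, periods_per_year=252):
    """Return Sharpe, Sortino, and max drawdown for a series of portfolio values."""
    returns = [curr / prev - 1 for prev, curr in zip(values, values[1:])]
    if len(returns) < 2:
        return {"sharpe": None, "sortino": None, "max_drawdown": None}

    excess = [r - annual_risk_free / periods_per_year for r in returns]
    annualize = math.sqrt(periods_per_year)

    # Sharpe: mean excess return divided by total volatility.
    vol = stdev(excess)
    sharpe = annualize * mean(excess) / vol if vol > 0 else None

    # Sortino: same numerator, but only negative excess returns count as risk.
    downside = math.sqrt(mean([min(e, 0.0) ** 2 for e in excess]))
    sortino = annualize * mean(excess) / downside if downside > 0 else None

    # Max drawdown: worst fractional drop from a running peak (zero or negative).
    peak, max_dd = values[0], 0.0
    for value in values:
        peak = max(peak, value)
        max_dd = min(max_dd, (value - peak) / peak)

    return {"sharpe": sharpe, "sortino": sortino, "max_drawdown": max_dd}


series = [100_000.0, 101_000.0, 99_000.0, 102_000.0, 100_500.0]
metrics = compute_metrics_sketch(series)
print(metrics)
```

In this series the deepest drop is from the 101,000 peak to 99,000, so the maximum drawdown is about -1.98%.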

Comparative performance comes from src/backtesting/benchmarks.py, which fetches SPY historical prices and derives daily benchmark return percentages from them. This allows the output tables to show both absolute strategy performance and performance relative to the S&P 500.
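
A small worked example of the comparison, using made-up numbers: daily percentage returns are computed for both series, and the difference is the strategy's excess return over the benchmark for that day. The helper name daily_return_pct and the data are hypothetical, and simple subtraction is only one way to express relative performance.

```python
def daily_return_pct(values):
    """Percentage change between consecutive observations."""
    return [(curr / prev - 1) * 100 for prev, curr in zip(values, values[1:])]


spy_closes = [400.0, 404.0, 399.96]                   # made-up benchmark closes
portfolio_values = [100_000.0, 102_000.0, 101_490.0]  # made-up strategy values

benchmark = daily_return_pct(spy_closes)
strategy = daily_return_pct(portfolio_values)
excess = [s - b for s, b in zip(strategy, benchmark)]
print([round(x, 2) for x in excess])
```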

Running the Backtester: CLI and Programmatic Examples

You can invoke the backtester via command line or instantiate the engine programmatically:

Command Line:

python src/backtester.py \
    --tickers AAPL MSFT \
    --start-date 2022-01-01 \
    --end-date 2022-12-31 \
    --initial-cash 100000 \
    --model-name llama2 \
    --model-provider ollama

Programmatic:

from src.backtesting.engine import BacktestEngine
from src.main import run_hedge_fund

engine = BacktestEngine(
    agent=run_hedge_fund,
    tickers=["AAPL", "MSFT"],
    start_date="2022-01-01",
    end_date="2022-12-31",
    initial_capital=100_000,
    model_name="llama2",
    model_provider="ollama",
    selected_analysts=None,
    initial_margin_requirement=0.5,
)

metrics = engine.run_backtest()
print("Final performance:", metrics)

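Both methods run the same simulation loop and report the same set of performance metrics and daily output tables. Because the agent's decisions come from a language model, the numbers can differ between runs unless the model's output is deterministic.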

Summary

  • The ai-hedge-fund backtester entry point in src/backtester.py parses CLI arguments and initializes the BacktestEngine with component wiring.
  • The engine pre-fetches historical prices, fundamentals, and news data via src/tools/api.py to minimize latency during simulation.
  • Each trading day invokes the AI agent through AgentController, normalizes decisions, and executes trades via TradeExecutor with realistic margin constraints.
  • Portfolio valuation and exposure calculations occur daily through src/backtesting/valuation.py, supporting both long and short positions.
  • Performance metrics (Sharpe, Sortino, max drawdown) and SPY benchmarking update continuously throughout the simulation.
  • The modular architecture allows easy extension for custom agents, alternative data sources, or additional reporting formats.

Frequently Asked Questions

How does the backtester handle missing price data?

During each iteration of the main simulation loop, the engine attempts to retrieve current prices for all tickers. If any ticker lacks price data for the current business day, the engine skips that day entirely and proceeds to the next date. This prevents the simulation from executing trades based on incomplete market snapshots.

What performance metrics does the ai-hedge-fund backtester calculate?

The PerformanceMetricsCalculator in src/backtesting/metrics.py computes annualized Sharpe ratios, Sortino ratios (measuring downside risk only), and maximum drawdown percentages. Long/short exposure ratios are reported alongside them and come from compute_exposures in src/backtesting/valuation.py. The risk metrics update once more than three portfolio value observations exist in the simulation history.

Can I use a custom trading agent with the BacktestEngine?

Yes. The BacktestEngine accepts any callable agent via the agent parameter in its constructor. Your custom agent must accept the same signature as run_hedge_fund (look-back dates, portfolio snapshot, model configuration) and return decisions that the AgentController can normalize into the {ticker: {action, quantity}} format required by the TradeExecutor.
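
As a sketch of what such an agent might look like, the function below buys a fixed lot of every ticker that is not already held. The keyword names (tickers, start_date, end_date, portfolio), the portfolio snapshot layout, and the plain mapping it returns are all assumptions for illustration; check run_hedge_fund in src/main.py for the signature and return shape the controller actually expects.

```python
def fixed_lot_agent(*, tickers, start_date, end_date, portfolio, **model_config):
    """Hypothetical agent: buy 10 shares of each ticker not already held long."""
    decisions = {}
    for ticker in tickers:
        held = portfolio.get("positions", {}).get(ticker, {}).get("long", 0)
        if held == 0:
            decisions[ticker] = {"action": "BUY", "quantity": 10}
    return decisions


snapshot = {"cash": 100_000.0, "positions": {"AAPL": {"long": 10}}}
result = fixed_lot_agent(
    tickers=["AAPL", "MSFT"],
    start_date="2021-12-03",
    end_date="2022-01-03",
    portfolio=snapshot,
    model_name="llama2",  # accepted and ignored by this toy agent
)
print(result)
```

A rule-based agent like this is also a convenient way to exercise the engine without calling a language model.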

How does the backtester simulate short selling and margin requirements?

The Portfolio class in src/backtesting/portfolio.py implements short selling by adding cash proceeds when opening a short position while simultaneously reserving collateral based on the initial_margin_requirement parameter (default 0.5 or 50%). The system tracks average cost basis for short positions and calculates realized gains or losses when covering, ensuring accurate buying power calculations throughout the simulation.
