How to Handle Rate Limiting and API Quotas Across Different Providers in AI-Hedge-Fund

March 9, 2026 · virattt/ai-hedge-fund

The AI-Hedge-Fund project centralizes rate-limit handling in src/tools/api.py in a single _make_api_request helper. The helper retries HTTP 429 responses with linear back-off, accepts a configurable retry ceiling, and returns the final response instead of raising once retries are exhausted, so the same logic works for any data provider.

Managing API quotas and rate limits is critical when building financial data pipelines that aggregate information from multiple sources. The virattt/ai-hedge-fund repository demonstrates a robust pattern for handling rate limiting and API quotas across different providers through a centralized request management system. This approach ensures consistent retry semantics while remaining flexible enough to accommodate provider-specific requirements.

Centralized Rate Limit Management in `src/tools/api.py`

The `_make_api_request` Helper Function

All external API interactions in the AI-Hedge-Fund project flow through the private helper _make_api_request located in src/tools/api.py. Public functions such as get_prices, get_financial_metrics, and get_insider_trades delegate their HTTP operations to this centralized handler, ensuring that rate limiting logic is implemented in exactly one place.

The helper supports both GET and POST methods through its method parameter, making it provider-agnostic regarding HTTP semantics.

Linear Back-Off Strategy for HTTP 429

When the helper encounters an HTTP 429 (Too Many Requests) response, it applies a linear back-off strategy defined in lines 43-55 of src/tools/api.py. The retry delay is calculated as:

sleep_seconds = 60 + (30 × attempt_number)

Here attempt_number is zero-based, so the first retry waits 60 seconds, the second waits 90 seconds, and so on. The loop attempts the request up to max_retries + 1 times (one initial request plus max_retries retries) before giving up.
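To make the schedule concrete, the following minimal sketch computes the wait before each retry. It assumes a zero-based attempt counter, which is what the 60-second first delay implies. The function name backoff_delays and its base and step parameters are hypothetical and exist only for this illustration; they are not part of src/tools/api.py.

```python
def backoff_delays(max_retries: int = 3, base: int = 60, step: int = 30) -> list[int]:
    """Return the wait in seconds before each retry, in order.

    Hypothetical helper: it mirrors the formula described above,
    sleep_seconds = base + step * attempt, with attempt starting at 0.
    """
    return [base + step * attempt for attempt in range(max_retries)]


print(backoff_delays())       # [60, 90, 120]
print(sum(backoff_delays()))  # 270
```

With the default of three retries, a request that is rate limited every time would spend 270 seconds sleeping before the final response is returned, which is worth keeping in mind when choosing max_retries for a latency-sensitive pipeline.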

Configurable Retry Logic

By default, the system attempts three retries (max_retries=3), but callers can override this value when invoking _make_api_request. If the final attempt still returns HTTP 429, the helper returns the final response to the caller rather than raising an exception. This design allows higher-level business logic to decide whether to abort the pipeline, log the error, or continue execution using cached or default data. Non-429 errors return immediately without retry attempts.
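The behavior described above can be sketched as a small retry loop. This is an illustration under stated assumptions, not the repository's code: request_with_backoff, FakeResponse, and the injectable send and sleep parameters are hypothetical names chosen so the example runs without a network connection or real waiting.

```python
import time
from typing import Callable


class FakeResponse:
    """Stand-in for an HTTP response; only status_code matters here."""

    def __init__(self, status_code: int):
        self.status_code = status_code


def request_with_backoff(
    send: Callable[[], FakeResponse],
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> FakeResponse:
    """Call send() until it stops returning 429 or retries run out."""
    if max_retries < 0:
        raise ValueError("max_retries must be >= 0")
    for attempt in range(max_retries + 1):  # initial request + retries
        response = send()
        if response.status_code == 429 and attempt < max_retries:
            sleep(60 + 30 * attempt)  # linear back-off, zero-based attempt
            continue
        # Success, a non-429 error, or the final 429: hand it to the caller
        return response


# Usage: two rate-limited replies followed by a success, with sleeps recorded
statuses = iter([429, 429, 200])
waits: list[float] = []
result = request_with_backoff(lambda: FakeResponse(next(statuses)), sleep=waits.append)
print(result.status_code, waits)  # 200 [60, 90]
```

Returning the last response rather than raising keeps the decision with the caller, which can inspect status_code and choose between aborting, logging, or falling back to cached data.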

Implementing Provider-Specific Rate Limit Handling

Adding new data providers to the AI-Hedge-Fund system requires following a consistent pattern that leverages the centralized rate limiting infrastructure while accommodating provider-specific requirements.

Step 1: Create Provider-Specific Wrappers

Define a thin wrapper function that constructs the provider's URL, headers, and payload, then delegates to _make_api_request. This isolation keeps provider logic separate while reusing the shared throttling mechanism.

Step 2: Configure Authentication Headers

Pass provider-specific authentication tokens, API keys, or custom rate-limit headers through the headers dictionary. The _make_api_request helper treats headers as opaque, forwarding them unchanged to the HTTP client.
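As a small illustration of this step, the sketch below builds a headers dictionary from an environment variable. The function name build_auth_headers, the default header name X-API-KEY, and the variable DEMO_PROVIDER_API_KEY are assumptions made for the example; check each provider's documentation for the header it actually expects.

```python
import os


def build_auth_headers(env_var: str, header_name: str = "X-API-KEY") -> dict[str, str]:
    """Return auth headers for a provider, or an empty dict if no key is set.

    Hypothetical helper: header_name is an assumption and varies by provider.
    """
    api_key = os.environ.get(env_var)
    return {header_name: api_key} if api_key else {}


# Usage with a made-up environment variable
os.environ["DEMO_PROVIDER_API_KEY"] = "my-key"
print(build_auth_headers("DEMO_PROVIDER_API_KEY"))  # {'X-API-KEY': 'my-key'}
```

Because the request helper forwards headers unchanged, the resulting dictionary can be passed straight through as the headers argument.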

Step 3: Handle Provider-Specific Retry Headers

Some services expose Retry-After headers indicating exact wait times. Create a wrapper that reads this header and calls time.sleep with the specified value before invoking _make_api_request again, effectively layering provider-specific logic on top of the generic back-off. Pass max_retries=0 in that wrapper so the provider's wait time is not stacked on top of the helper's own 60-second back-off.

Step 4: Implement Response Caching

Use the existing cache module at src/data/cache.py with a unique cache key that includes all request parameters. This prevents unnecessary API calls that waste quota, and the caching layer works uniformly for every provider.
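One way to build a key that covers every request parameter is sketched below. The interface of src/data/cache.py is not shown in this article, so the example uses a plain in-memory dictionary instead; make_cache_key, cached_fetch, and _cache are hypothetical names for illustration only.

```python
import hashlib
import json
from typing import Any, Callable

_cache: dict[str, Any] = {}  # in-memory stand-in for a real cache layer


def make_cache_key(provider: str, endpoint: str, params: dict) -> str:
    """Build a deterministic key from the provider, endpoint, and all params."""
    # sort_keys makes the key independent of the order parameters were given in
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    return f"{provider}:{endpoint}:{digest}"


def cached_fetch(key: str, fetch: Callable[[], Any]) -> Any:
    """Return the cached value for key, calling fetch() only on a miss."""
    if key not in _cache:
        _cache[key] = fetch()
    return _cache[key]


# Usage: the second lookup is served from the cache, so no quota is spent
key = make_cache_key("demo", "prices", {"ticker": "AAPL", "start": "2024-01-01"})
first = cached_fetch(key, lambda: ["row-1", "row-2"])
second = cached_fetch(key, lambda: ["never called"])
print(first == second)  # True
```

Leave secrets such as API keys out of params so they never end up in cache keys, and consider skipping the cache write when a request fails so that an error result is not served on later calls.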

Step 5: Test Rate Limit Behavior

Follow the pattern in tests/test_api_rate_limiting.py to emulate 429 responses and verify back-off timing. This confirms that new wrappers honor the same retry semantics as the core system.

Code Examples

Fetching Prices with Automatic Rate Limit Handling

import os
from src.tools.api import get_prices

# The API key can be supplied via the env-var or directly
os.environ["FINANCIAL_DATASETS_API_KEY"] = "my-key"

prices = get_prices(
    ticker="AAPL",
    start_date="2024-01-01",
    end_date="2024-01-31",
)
# Guard against an empty result before indexing into it
if prices:
    print(prices[0].open, prices[0].close)

This call internally routes through _make_api_request, which automatically sleeps on HTTP 429 and retries up to three times using the linear back-off strategy.

Adding a New Provider (AlphaVantage Example)

from src.tools.api import _make_api_request

def get_alpha_vantage_prices(symbol: str, start: str, end: str, api_key: str) -> list[dict]:
    url = (
        "https://www.alphavantage.co/query?"
        f"function=TIME_SERIES_DAILY_ADJUSTED&symbol={symbol}"
        f"&apikey={api_key}&outputsize=full&datatype=json"
    )
    headers = {}  # AlphaVantage uses query-string auth, no extra headers

    response = _make_api_request(url, headers, max_retries=5)
    if response.status_code != 200:
        return []  # Signal failure to the caller with an empty result

    # The key may be absent if the payload is an error or notice rather than data
    series = response.json().get("Time Series (Daily)", {})
    # Keep only rows inside [start, end]; ISO dates compare correctly as strings.
    # Convert rows to Price objects (or any pydantic model) as needed.
    return [
        {"date": day, **values}
        for day, values in sorted(series.items())
        if start <= day <= end
    ]

Only the URL and payload construction are provider-specific; the retry and back-off logic is reused automatically from the centralized helper.

Respecting Retry-After Headers

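import time
from src.tools.api import _make_api_request

def request_with_retry_after(url: str, headers: dict, max_attempts: int = 4):
    for attempt in range(max_attempts):
        # Disable internal retries so this wrapper controls the timing
        response = _make_api_request(url, headers, max_retries=0)
        # Return on success, on non-429 errors, or when attempts are used up
        if response.status_code != 429 or attempt == max_attempts - 1:
            return response
        # Use the provider's wait time; fall back to 60s if the header is
        # missing or is an HTTP-date rather than a number of seconds
        raw = response.headers.get("Retry-After", "")
        time.sleep(int(raw) if raw.isdigit() else 60)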

This wrapper demonstrates how to layer provider-specific Retry-After handling on top of the generic back-off infrastructure by disabling internal retries and managing the timing logic externally.
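Retry-After may carry either a number of seconds or an HTTP-date, so a wrapper that only converts the value with int() can fail on the date form. The sketch below shows one way to handle both; parse_retry_after and its default and now parameters are hypothetical names introduced for this illustration.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def parse_retry_after(
    value: Optional[str],
    default: float = 60.0,
    now: Optional[datetime] = None,
) -> float:
    """Convert a Retry-After header value into seconds to wait."""
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():  # delta-seconds form, e.g. "120"
        return float(value)
    try:
        target = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return default  # unparseable value: fall back to the default wait
    if target.tzinfo is None:
        target = target.replace(tzinfo=timezone.utc)
    current = now or datetime.now(timezone.utc)
    # A date in the past means no further waiting is needed
    return max(0.0, (target - current).total_seconds())


print(parse_retry_after("120"))  # 120.0
print(parse_retry_after(None))   # 60.0
```

The now parameter exists only to make the date branch testable with a fixed clock; production code can leave it unset.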

Testing Rate Limit Behavior

The project includes comprehensive unit tests in tests/test_api_rate_limiting.py that validate the throttling behavior. These tests mock HTTP 429 responses to verify that:

The linear back-off timing calculates correctly (60s, 90s, 120s, etc.)
The retry counter respects the max_retries parameter
POST requests handle rate limits identically to GET requests
The system returns the final 429 response rather than raising exceptions when retries are exhausted

When adding new providers, follow this testing pattern to ensure your integration honors the same retry semantics and error-handling contracts as the core system.
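A test in this style might look like the sketch below. It does not reproduce the contents of tests/test_api_rate_limiting.py; fetch_with_backoff is a compact, hypothetical stand-in for the helper under test so the example runs without the repository, and time.sleep is patched so no real waiting occurs.

```python
import time
import unittest
from unittest.mock import Mock, patch


def fetch_with_backoff(send, max_retries=3):
    """Hypothetical stand-in for the helper under test."""
    for attempt in range(max_retries + 1):
        response = send()
        if response.status_code == 429 and attempt < max_retries:
            time.sleep(60 + 30 * attempt)
            continue
        return response


class RateLimitTests(unittest.TestCase):
    @patch("time.sleep")
    def test_backs_off_then_succeeds(self, mock_sleep):
        # Two 429 replies, then a success
        send = Mock(side_effect=[
            Mock(status_code=429),
            Mock(status_code=429),
            Mock(status_code=200),
        ])
        response = fetch_with_backoff(send)
        self.assertEqual(response.status_code, 200)
        delays = [call.args[0] for call in mock_sleep.call_args_list]
        self.assertEqual(delays, [60, 90])

    @patch("time.sleep")
    def test_returns_final_429_without_raising(self, mock_sleep):
        # Every reply is a 429, so the last one is returned to the caller
        send = Mock(return_value=Mock(status_code=429))
        response = fetch_with_backoff(send, max_retries=2)
        self.assertEqual(response.status_code, 429)
        self.assertEqual(send.call_count, 3)
        self.assertEqual(mock_sleep.call_count, 2)
```

Saved as a test module, this can be run with python -m unittest. For a real wrapper, the same assertions would apply with the HTTP call patched instead of the injected send callable.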

Summary

The AI-Hedge-Fund project consolidates rate-limit handling in src/tools/api.py through the _make_api_request helper, providing a single point of control for all external API interactions.
A linear back-off strategy (60s + 30s × attempt, with attempt starting at zero) automatically handles HTTP 429 responses with a default ceiling of three retries (four attempts in total), configurable per call via the max_retries parameter.
New providers integrate by wrapping _make_api_request, passing provider-specific headers, and optionally implementing custom logic for headers like Retry-After while reusing the core throttling infrastructure.
The caching layer at src/data/cache.py minimizes quota consumption uniformly across all providers by preventing redundant requests.
Comprehensive tests in tests/test_api_rate_limiting.py ensure consistent retry behavior as the system scales to support additional data sources.

Frequently Asked Questions

What happens when the maximum number of retries is exceeded?

When _make_api_request exhausts all retry attempts, it returns the final HTTP response to the caller rather than raising an exception. This design allows higher-level functions like get_prices or get_financial_metrics to implement their own failure handling strategies, whether that means aborting the pipeline, logging the error for manual review, or continuing execution using cached or default data.

How does the back-off calculation work for rate-limited requests?

The back-off follows a linear formula implemented in lines 43-55 of src/tools/api.py: sleep_seconds = 60 + (30 × attempt_number), where attempt_number starts at zero. The first retry waits 60 seconds, the second waits 90 seconds, and each subsequent attempt adds an additional 30 seconds. This linear progression gives the provider time to recover while keeping the total wait bounded, so the pipeline is never blocked indefinitely.

Can I disable automatic retries for providers that don't use standard rate limiting?

Yes, you can disable the internal retry mechanism by passing max_retries=0 to _make_api_request. This is particularly useful when integrating providers that return non-standard status codes for throttling, or when you want to implement custom retry logic (such as parsing a Retry-After header) at the wrapper level before deciding whether to invoke the request helper again.

Where should I add tests for new provider integrations?

New provider wrappers should follow the testing pattern established in tests/test_api_rate_limiting.py. This file contains unit tests that mock HTTP 429 responses and verify back-off timing, retry counts, and POST request handling. By emulating these test cases for your new provider, you ensure that your integration honors the same retry semantics and error-handling contracts as the core system.
