Ollama vs Cloud LLM APIs for ai-hedge-fund: Architecture and Implementation Guide

The ai-hedge-fund repository abstracts LLM interactions through a unified chat() interface that dynamically routes requests to either a local Ollama instance or cloud APIs like OpenAI based on the LLM_BACKEND environment variable.

The ai-hedge-fund project supports both local and cloud-based language model inference, allowing developers to run financial analysis agents on-premises using Ollama or scale to production using cloud LLM APIs. This architectural flexibility ensures that trading agents remain agnostic to the underlying model provider while giving operators full control over latency, cost, and data privacy.

Architecture Overview

The codebase implements a clean separation between LLM backend logic and agent business logic. All LLM interactions flow through a single entry point in src/utils/llm.py, which delegates to provider-specific implementations based on runtime configuration.

Backend Selection Logic

The system determines which provider to use by reading the LLM_BACKEND environment variable at module load time. In src/utils/llm.py, the implementation defaults to "ollama" if the variable is unset:


# src/utils/llm.py

import os
from typing import Any, Dict, List

import requests

LLM_BACKEND = os.getenv("LLM_BACKEND", "ollama")
OLLAMA_BASE_URL = os.getenv("OLLAMA_BASE_URL", "http://localhost:11434")
OPENAI_API_URL = os.getenv("OPENAI_API_URL", "https://api.openai.com/v1")

The public chat() function acts as a router, dispatching to one of two internal functions, _ollama_chat() or _openai_chat():

def chat(messages: List[Dict[str, str]]) -> Dict[str, Any]:
    """Unified chat interface that selects the backend dynamically."""
    if LLM_BACKEND == "ollama":
        return _ollama_chat(messages)
    elif LLM_BACKEND == "openai":
        return _openai_chat(messages)
    else:
        raise ValueError(f"Unsupported LLM_BACKEND: {LLM_BACKEND}")

This design allows agents in src/agents/ to call chat() without modification regardless of the deployment environment.
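To make the decoupling concrete, here is a minimal self-contained sketch of the same routing pattern. It is not the repository's code: the two `_fake_*` functions are hypothetical stand-ins for the real HTTP calls, the `analyze()` function is an invented example of agent code, and the backend is passed as an argument (rather than read from LLM_BACKEND) so the sketch runs without any environment setup.

```python
from typing import Any, Dict, List

Messages = List[Dict[str, str]]


def _fake_ollama_chat(messages: Messages) -> Dict[str, Any]:
    # Hypothetical stand-in for _ollama_chat(); mimics the Ollama response shape.
    reply = "ollama:" + messages[-1]["content"]
    return {"message": {"role": "assistant", "content": reply}}


def _fake_openai_chat(messages: Messages) -> Dict[str, Any]:
    # Hypothetical stand-in for _openai_chat(); mimics the OpenAI response shape.
    reply = "openai:" + messages[-1]["content"]
    return {"choices": [{"message": {"role": "assistant", "content": reply}}]}


def chat(messages: Messages, backend: str) -> Dict[str, Any]:
    """Route to a backend by name, mirroring the if/elif/else router above."""
    if backend == "ollama":
        return _fake_ollama_chat(messages)
    elif backend == "openai":
        return _fake_openai_chat(messages)
    raise ValueError(f"Unsupported LLM_BACKEND: {backend}")


def analyze(ticker: str, backend: str) -> Dict[str, Any]:
    # Hypothetical agent code: the call is identical for every backend.
    return chat([{"role": "user", "content": f"Analyze {ticker}"}], backend)
```

The agent-side call does not change between backends, but the returned dictionaries do differ in shape, so callers still need to know which keys to read.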

Local Ollama Implementation

When LLM_BACKEND=ollama, the system communicates with a locally running Ollama server via HTTP requests to the Ollama API.

Ollama Lifecycle Management

The src/utils/ollama.py module provides comprehensive lifecycle management for local models. It handles installation detection, server startup, and model downloading through functions like is_ollama_installed(), start_ollama_server(), and ensure_ollama_and_model().

The ensure_ollama_and_model() function coordinates the entire setup process, checking for the Ollama binary, starting the server if necessary, and downloading missing models:


# Conceptual representation based on src/utils/ollama.py implementation

def ensure_ollama_and_model(model_name: str = "llama2"):
    if not is_ollama_installed():
        raise RuntimeError("Ollama not found")
    if not is_server_running():
        start_ollama_server()
    if model_name not in get_locally_available_models():
        download_model(model_name)
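The helper functions named in the conceptual code are not shown in this guide. The following is a hedged sketch of how such checks could be written with only the standard library; the bodies are assumptions, not the contents of src/utils/ollama.py. It assumes that "installed" means an `ollama` executable is on PATH, that a running server answers a GET on its base URL with HTTP 200, and that the model list has the shape returned by Ollama's GET /api/tags endpoint. `parse_model_names()` and `has_model()` are hypothetical names introduced here.

```python
import shutil
import urllib.error
import urllib.request
from typing import Any, Dict, List


def is_ollama_installed() -> bool:
    # Assumption: installed means an `ollama` executable is discoverable on PATH.
    return shutil.which("ollama") is not None


def is_server_running(base_url: str = "http://localhost:11434",
                      timeout: float = 2.0) -> bool:
    # Assumption: a live server answers GET <base_url> with HTTP 200.
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def parse_model_names(tags_payload: Dict[str, Any]) -> List[str]:
    # Pull model names out of a payload shaped like {"models": [{"name": ...}]}.
    return [entry["name"] for entry in tags_payload.get("models", [])]


def has_model(model_name: str, available: List[str]) -> bool:
    # Ollama lists models with a tag suffix such as "llama3:latest", so an
    # untagged request is matched against the base name only.
    if ":" in model_name:
        return model_name in available
    return any(name.split(":", 1)[0] == model_name for name in available)
```

One design point worth noting: a plain membership test such as `model_name not in available` can miss a model that is present under a tagged name, which is why the sketch compares base names when no tag is given.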

The _ollama_chat() function in src/utils/llm.py constructs the request payload and posts to the local Ollama endpoint:

def _ollama_chat(messages: List[Dict[str, str]]) -> Dict[str, Any]:
    url = f"{OLLAMA_BASE_URL}/api/chat"
    payload = {
        "model": os.getenv("OLLAMA_MODEL", "llama2"),
        "messages": messages,
        # /api/chat streams by default; request one JSON object so .json() can parse it.
        "stream": False,
    }
    response = requests.post(url, json=payload, timeout=60)
    response.raise_for_status()
    return response.json()

Cloud API Implementation

For production deployments, the system supports cloud LLM providers through the same unified interface.

OpenAI Integration

When LLM_BACKEND=openai, requests route to the _openai_chat() function in src/utils/llm.py. This implementation retrieves API keys via src/utils/api_key.py and constructs authenticated requests to the OpenAI API:

def _openai_chat(messages: List[Dict[str, str]]) -> Dict[str, Any]:
    url = f"{OPENAI_API_URL}/chat/completions"
    # Resolve the key through src/utils/api_key.py (assumes `from src.utils import api_key`).
    openai_api_key = api_key.get_api_key("openai")
    headers = {
        "Authorization": f"Bearer {openai_api_key}",
        "Content-Type": "application/json"
    }
    payload = {
        "model": os.getenv("OPENAI_MODEL", "gpt-4o-mini"),
        "messages": messages
    }
    response = requests.post(url, headers=headers, json=payload, timeout=60)
    response.raise_for_status()
    return response.json()

The API key is retrieved securely through the centralized key management system in src/utils/api_key.py, separating sensitive credentials from business logic.

FastAPI Service Layer

The app/backend/services/ollama_service.py file exposes Ollama operations via HTTP endpoints, wrapping the utilities from src/utils/ollama.py for remote management. This allows containerized deployments to check model availability or trigger downloads through REST APIs rather than direct CLI access.

Configuration and Usage Examples

Switching Between Backends

To use Ollama locally (default behavior):

# Shell: set these before starting Python, because llm.py reads them at import time
export LLM_BACKEND=ollama
export OLLAMA_MODEL=mistral
export OLLAMA_BASE_URL=http://localhost:11434

# Python
from src.utils.llm import chat

response = chat([{"role": "user", "content": "Analyze AAPL valuation"}])
print(response["message"]["content"])

To use OpenAI cloud API:

# Shell: set these before starting Python, because llm.py reads them at import time
export LLM_BACKEND=openai
export OPENAI_MODEL=gpt-4o-mini
export OPENAI_API_KEY=your_key_here  # Or use api_key.py storage

# Python
from src.utils.llm import chat

# Same call, different backend; the response follows the OpenAI shape rather than Ollama's
result = chat([{"role": "user", "content": "Generate risk assessment"}])
print(result["choices"][0]["message"]["content"])
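The two usage examples read the reply from different keys: `["message"]["content"]` for Ollama and `["choices"][0]["message"]["content"]` for OpenAI. If agent code should not care which backend answered, one option is a small normalizing helper. The sketch below is an assumption for illustration only; `extract_content()` is a hypothetical name and is not described as part of the repository.

```python
from typing import Any, Dict


def extract_content(response: Dict[str, Any]) -> str:
    """Return the assistant text from either response shape shown above."""
    if "choices" in response:
        # OpenAI-style chat completion: first choice carries the message.
        return response["choices"][0]["message"]["content"]
    if "message" in response:
        # Ollama-style /api/chat response: a single top-level message.
        return response["message"]["content"]
    raise KeyError("Unrecognized response shape: expected 'choices' or 'message'")
```

With a helper like this, `print(extract_content(chat(messages)))` would work unchanged under either LLM_BACKEND value.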

Model Management with Ollama

For local deployments, ensure required models are available before running agents:

from src.utils.ollama import ensure_ollama_and_model

# Downloads model if missing, starts server if needed

ensure_ollama_and_model("llama3")

Summary

  • Unified Interface: The chat() function in src/utils/llm.py provides a single entry point for all LLM interactions, routing to either Ollama or OpenAI based on the LLM_BACKEND environment variable.
  • Local Flexibility: Ollama integration in src/utils/ollama.py supports full lifecycle management including installation detection, server startup, and model downloading.
  • Cloud Scalability: OpenAI integration uses standard HTTP requests with authentication via src/utils/api_key.py, supporting production deployments without code changes.
  • Agent Agnosticism: Trading agents in src/agents/ call the unified interface, remaining decoupled from specific LLM providers.
  • Service Exposure: FastAPI endpoints in app/backend/services/ollama_service.py wrap local Ollama operations for containerized environments.

Frequently Asked Questions

How does ai-hedge-fund decide which LLM provider to use?

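The system reads the LLM_BACKEND environment variable once, when src/utils/llm.py is imported. If it is set to "ollama", chat() routes requests to a local server; if "openai", it sends them to the cloud API; any other value raises a ValueError on the first chat() call. The default is "ollama", so the project runs local-first when no configuration is provided. Because the value is captured at import time, changing the variable afterwards has no effect until the process is restarted.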

Can I run ai-hedge-fund completely offline?

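For LLM inference, yes. With LLM_BACKEND=ollama and the required models already downloaded (for example, by calling ensure_ollama_and_model() while still online), all inference runs locally against the Ollama server at OLLAMA_BASE_URL (default http://localhost:11434) and needs no internet connection. Downloading a missing model does require connectivity, and any external data sources the agents query are outside the scope of the LLM backend setting.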

What changes are required to switch from Ollama to OpenAI?

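No code changes are required within agents. Set the environment variable LLM_BACKEND=openai, provide your API key through OPENAI_API_KEY or src/utils/api_key.py, and restart the process so the new value is read at import. The chat() function then dispatches to _openai_chat(), and requests go to https://api.openai.com/v1/chat/completions instead of your local Ollama instance. Code that reads the reply must use the OpenAI response shape (response["choices"][0]["message"]["content"]) rather than Ollama's (response["message"]["content"]).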

Where is the API key stored for cloud LLM access?

API keys are managed through src/utils/api_key.py, which provides a centralized get_api_key() function. For OpenAI, the key is retrieved via api_key.get_api_key("openai") and injected into the Authorization header as a Bearer token in src/utils/llm.py.
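The internals of get_api_key() are not shown in this guide. As a rough illustration of what a centralized lookup could look like, the sketch below assumes keys live in environment variables named `<PROVIDER>_API_KEY` (consistent with the OPENAI_API_KEY export used earlier). The `env` parameter and the `auth_headers()` helper are hypothetical additions for testability, not the repository's API.

```python
import os
from typing import Dict, Mapping, Optional


def get_api_key(provider: str, env: Optional[Mapping[str, str]] = None) -> str:
    # Assumption: keys are stored as <PROVIDER>_API_KEY, e.g. OPENAI_API_KEY.
    source = os.environ if env is None else env
    var_name = f"{provider.upper()}_API_KEY"
    key = source.get(var_name, "").strip()
    if not key:
        # Fail early with the variable name, never with the key value itself.
        raise RuntimeError(f"Missing API key: set {var_name}")
    return key


def auth_headers(provider: str, env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    # Build the Bearer-token headers used for an authenticated JSON request.
    return {
        "Authorization": f"Bearer {get_api_key(provider, env)}",
        "Content-Type": "application/json",
    }
```

Keeping the lookup in one function means request code never touches os.environ directly, and a missing key surfaces as a clear error before any HTTP call is made.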
