# Ollama vs Cloud LLM APIs for ai-hedge-fund: Architecture and Implementation Guide

> Compare Ollama vs cloud LLM APIs for your ai-hedge-fund project. This guide details their architecture and implementation, helping you choose the best LLM backend for your needs.

- Repository: [Virat Singh/ai-hedge-fund](https://github.com/virattt/ai-hedge-fund)
- Tags: architecture
- Published: 2026-03-09

---

**The ai-hedge-fund repository abstracts LLM interactions through a unified `chat()` interface that dynamically routes requests to either a local Ollama instance or cloud APIs like OpenAI based on the `LLM_BACKEND` environment variable.**

The **ai-hedge-fund** project supports both local and cloud-based language model inference: developers can run financial analysis agents on their own hardware with Ollama or send requests to a cloud LLM API. Trading agents stay agnostic to the underlying model provider, while operators choose their own trade-off between latency, cost, and data privacy.

## Architecture Overview

The codebase implements a clean separation between LLM backend logic and agent business logic. All LLM interactions flow through a single entry point in [`src/utils/llm.py`](https://github.com/virattt/ai-hedge-fund/blob/main/src/utils/llm.py), which delegates to provider-specific implementations based on runtime configuration.

### Backend Selection Logic

The system determines which provider to use by reading the `LLM_BACKEND` environment variable at module load time. In [`src/utils/llm.py`](https://github.com/virattt/ai-hedge-fund/blob/main/src/utils/llm.py), the implementation defaults to `"ollama"` if the variable is unset:

```python
# src/utils/llm.py
import os

# Read once, when the module is imported
LLM_BACKEND = os.getenv("LLM_BACKEND", "ollama")
OLLAMA_BASE_URL = os.getenv("OLLAMA_BASE_URL", "http://localhost:11434")
OPENAI_API_URL = os.getenv("OPENAI_API_URL", "https://api.openai.com/v1")
```
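Because module-level `os.getenv()` calls run once at import, later changes to the environment are not picked up by the constant. The following is a minimal sketch of that behavior, not code from the repository; `SNAPSHOT` and `current_backend()` are hypothetical names used only for illustration.

```python
import os

# Hypothetical stand-in for a module-level constant such as LLM_BACKEND.
os.environ["LLM_BACKEND"] = "ollama"
SNAPSHOT = os.getenv("LLM_BACKEND", "ollama")  # evaluated once, as at import


def current_backend() -> str:
    """Read the variable on every call instead of caching it at import."""
    return os.getenv("LLM_BACKEND", "ollama")


# Changing the environment afterwards does not update the snapshot.
os.environ["LLM_BACKEND"] = "openai"
print(SNAPSHOT)           # still the value captured earlier
print(current_backend())  # reflects the new value
```

In practice this means the variable should be exported before the process starts, or the lookup should happen inside the function if switching at runtime is needed.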

The public `chat()` function acts as a router, selecting between `_ollama_chat()` and `_openai_chat()` internal methods:

```python
from typing import Any, Dict, List


def chat(messages: List[Dict[str, str]]) -> Dict[str, Any]:
    """Unified chat interface that selects the backend dynamically."""
    if LLM_BACKEND == "ollama":
        return _ollama_chat(messages)
    elif LLM_BACKEND == "openai":
        return _openai_chat(messages)
    else:
        # Fail loudly on typos or unsupported providers
        raise ValueError(f"Unsupported LLM_BACKEND: {LLM_BACKEND}")
```

This design allows agents in `src/agents/` to call `chat()` without modification regardless of the deployment environment.
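The routing idea can be exercised without a running model by swapping in stub backends. The sketch below is an illustration under assumptions, not the repository's code: `route_chat()`, `BACKENDS`, the two `_fake_*` stubs, and `valuation_agent()` are hypothetical names, and a lookup table replaces the `if`/`elif` chain purely as a design variation.

```python
from typing import Any, Callable, Dict, List

Message = Dict[str, str]
Backend = Callable[[List[Message]], Dict[str, Any]]


def _fake_ollama(messages: List[Message]) -> Dict[str, Any]:
    # Stub that mimics the shape of an Ollama /api/chat reply.
    return {"message": {"role": "assistant", "content": "local reply"}}


def _fake_openai(messages: List[Message]) -> Dict[str, Any]:
    # Stub that mimics the shape of an OpenAI chat completion reply.
    return {"choices": [{"message": {"role": "assistant", "content": "cloud reply"}}]}


# Hypothetical registry mapping backend names to handlers.
BACKENDS: Dict[str, Backend] = {"ollama": _fake_ollama, "openai": _fake_openai}


def route_chat(messages: List[Message], backend: str) -> Dict[str, Any]:
    """Look up the backend by name and delegate, failing loudly on typos."""
    try:
        handler = BACKENDS[backend]
    except KeyError:
        raise ValueError(f"Unsupported LLM_BACKEND: {backend}") from None
    return handler(messages)


def valuation_agent(ticker: str, backend: str) -> Dict[str, Any]:
    # The agent builds messages and never references a provider directly.
    prompt = [{"role": "user", "content": f"Analyze {ticker} valuation"}]
    return route_chat(prompt, backend)
```

A registry like this makes adding a third provider a one-line change, at the cost of a little indirection compared with the explicit branch.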

## Local Ollama Implementation

When `LLM_BACKEND=ollama`, the system communicates with a locally running Ollama server via HTTP requests to the Ollama API.

### Ollama Lifecycle Management

The [`src/utils/ollama.py`](https://github.com/virattt/ai-hedge-fund/blob/main/src/utils/ollama.py) module provides comprehensive lifecycle management for local models. It handles installation detection, server startup, and model downloading through functions like `is_ollama_installed()`, `start_ollama_server()`, and `ensure_ollama_and_model()`.

The `ensure_ollama_and_model()` function coordinates the entire setup process, checking for the Ollama binary, starting the server if necessary, and downloading missing models:

```python
# Conceptual representation based on src/utils/ollama.py implementation
def ensure_ollama_and_model(model_name: str = "llama2"):
    if not is_ollama_installed():
        raise RuntimeError("Ollama not found")
    if not is_server_running():
        start_ollama_server()
    if model_name not in get_locally_available_models():
        download_model(model_name)
```
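The "is the model already present" step can be sketched against the payload shape returned by Ollama's `GET /api/tags` endpoint, which lists local models under a `models` key with names that include a tag such as `llama3:latest`. This is a hedged illustration only: `local_model_names()` and `is_model_available()` are hypothetical helpers, and the sample payload is hand-written rather than captured from a server.

```python
from typing import Any, Dict, List


def local_model_names(tags_payload: Dict[str, Any]) -> List[str]:
    """Extract model names from a GET /api/tags style payload."""
    return [m["name"] for m in tags_payload.get("models", []) if "name" in m]


def is_model_available(model_name: str, tags_payload: Dict[str, Any]) -> bool:
    """Match the exact name, or the name with the default ':latest' tag."""
    wanted = model_name if ":" in model_name else f"{model_name}:latest"
    return wanted in local_model_names(tags_payload)


# Illustrative payload, not captured from a real server.
sample = {"models": [{"name": "llama3:latest"}, {"name": "mistral:7b"}]}

print(is_model_available("llama3", sample))
print(is_model_available("mistral", sample))
```

Normalizing the tag matters because a plain membership test on `"llama3"` would miss a model stored as `"llama3:latest"` and trigger an unnecessary download.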

The `_ollama_chat()` function in [`src/utils/llm.py`](https://github.com/virattt/ai-hedge-fund/blob/main/src/utils/llm.py) constructs the request payload and posts to the local Ollama endpoint:

```python
def _ollama_chat(messages: List[Dict[str, str]]) -> Dict[str, Any]:
    url = f"{OLLAMA_BASE_URL}/api/chat"
    payload = {
        "model": os.getenv("OLLAMA_MODEL", "llama2"),
        "messages": messages,
        # /api/chat streams newline-delimited JSON by default;
        # disable streaming so response.json() parses a single object
        "stream": False,
    }
    response = requests.post(url, json=payload, timeout=60)
    response.raise_for_status()
    return response.json()
```
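Ollama's `/api/chat` endpoint streams newline-delimited JSON by default unless the request sets `"stream": false`, so a caller that wants streaming has to join the chunks itself. The sketch below shows one way to do that; `join_streamed_content()` is a hypothetical helper and the chunks are hand-written for illustration.

```python
import json
from typing import Iterable


def join_streamed_content(lines: Iterable[str]) -> str:
    """Concatenate message content from newline-delimited JSON chunks."""
    parts = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines between chunks
        chunk = json.loads(line)
        parts.append(chunk.get("message", {}).get("content", ""))
        if chunk.get("done"):
            break  # the final chunk is flagged with "done": true
    return "".join(parts)


# Illustrative chunks, hand-written for this sketch.
chunks = [
    '{"message": {"role": "assistant", "content": "Hel"}, "done": false}',
    '{"message": {"role": "assistant", "content": "lo"}, "done": false}',
    '{"message": {"role": "assistant", "content": ""}, "done": true}',
]

print(join_streamed_content(chunks))
```

With `requests`, the lines could come from `response.iter_lines(decode_unicode=True)` on a request made with `stream=True`.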

## Cloud API Implementation

For production deployments, the system supports cloud LLM providers through the same unified interface.

### OpenAI Integration

When `LLM_BACKEND=openai`, requests route to the `_openai_chat()` function in [`src/utils/llm.py`](https://github.com/virattt/ai-hedge-fund/blob/main/src/utils/llm.py). This implementation retrieves API keys via [`src/utils/api_key.py`](https://github.com/virattt/ai-hedge-fund/blob/main/src/utils/api_key.py) and constructs authenticated requests to the OpenAI API:

```python
def _openai_chat(messages: List[Dict[str, str]]) -> Dict[str, Any]:
    url = f"{OPENAI_API_URL}/chat/completions"
    # OPENAI_API_KEY is assumed to be resolved beforehand via src/utils/api_key.py
    headers = {
        "Authorization": f"Bearer {OPENAI_API_KEY}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": os.getenv("OPENAI_MODEL", "gpt-4o-mini"),
        "messages": messages,
    }
    response = requests.post(url, headers=headers, json=payload, timeout=60)
    response.raise_for_status()
    return response.json()
```

Key lookup lives in [`src/utils/api_key.py`](https://github.com/virattt/ai-hedge-fund/blob/main/src/utils/api_key.py) rather than in the request code, which keeps credentials out of agent and business logic.
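A missing key otherwise surfaces only as a 401 from the API, so it can help to validate it before building the request. The following is a minimal sketch under assumptions: `build_openai_headers()` is a hypothetical helper, and it falls back to the `OPENAI_API_KEY` environment variable rather than reproducing whatever lookup `api_key.py` actually performs.

```python
import os
from typing import Dict, Optional


def build_openai_headers(api_key: Optional[str] = None) -> Dict[str, str]:
    """Return request headers, raising early if no key is configured."""
    # Hypothetical fallback: an explicit argument wins, then the environment.
    key = api_key or os.getenv("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Failing before the network call gives a clearer error message and avoids a wasted request.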

### FastAPI Service Layer

The [`app/backend/services/ollama_service.py`](https://github.com/virattt/ai-hedge-fund/blob/main/app/backend/services/ollama_service.py) file exposes Ollama operations via HTTP endpoints, wrapping the utilities from [`src/utils/ollama.py`](https://github.com/virattt/ai-hedge-fund/blob/main/src/utils/ollama.py) for remote management. This allows containerized deployments to check model availability or trigger downloads through REST APIs rather than direct CLI access.

## Configuration and Usage Examples

### Switching Between Backends

To use **Ollama locally** (default behavior):

```bash
export LLM_BACKEND=ollama
export OLLAMA_MODEL=mistral
export OLLAMA_BASE_URL=http://localhost:11434
```

```python
from src.utils.llm import chat

response = chat([{"role": "user", "content": "Analyze AAPL valuation"}])
print(response["message"]["content"])
```

To use **OpenAI cloud API**:

```bash
export LLM_BACKEND=openai
export OPENAI_MODEL=gpt-4o-mini
export OPENAI_API_KEY=your_key_here  # Or use api_key.py storage
```

```python
from src.utils.llm import chat

# Same call, different backend; the reply follows the OpenAI response shape
result = chat([{"role": "user", "content": "Generate risk assessment"}])
print(result["choices"][0]["message"]["content"])
```
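The two usage examples read the reply from different places: `response["message"]["content"]` for Ollama and `result["choices"][0]["message"]["content"]` for OpenAI. A small adapter can hide that difference from calling code. This is a sketch, not part of the repository; `extract_text()` is a hypothetical helper.

```python
from typing import Any, Dict


def extract_text(response: Dict[str, Any]) -> str:
    """Return the assistant text from either response shape."""
    if "choices" in response:  # OpenAI-style chat completion
        return response["choices"][0]["message"]["content"]
    if "message" in response:  # Ollama-style /api/chat reply
        return response["message"]["content"]
    raise KeyError("Unrecognized LLM response shape")


# Hand-written replies that mirror the two shapes used above.
ollama_reply = {"message": {"role": "assistant", "content": "local"}}
openai_reply = {"choices": [{"message": {"role": "assistant", "content": "cloud"}}]}

print(extract_text(ollama_reply))
print(extract_text(openai_reply))
```

Calling `extract_text(chat(messages))` would then work unchanged with either value of `LLM_BACKEND`.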

### Model Management with Ollama

For local deployments, ensure required models are available before running agents:

```python
from src.utils.ollama import ensure_ollama_and_model

# Downloads model if missing, starts server if needed
ensure_ollama_and_model("llama3")
```

## Summary

- **Unified Interface**: The `chat()` function in [`src/utils/llm.py`](https://github.com/virattt/ai-hedge-fund/blob/main/src/utils/llm.py) provides a single entry point for all LLM interactions, routing to either Ollama or OpenAI based on the `LLM_BACKEND` environment variable.
- **Local Flexibility**: Ollama integration in [`src/utils/ollama.py`](https://github.com/virattt/ai-hedge-fund/blob/main/src/utils/ollama.py) supports full lifecycle management including installation detection, server startup, and model downloading.
- **Cloud Scalability**: OpenAI integration uses standard HTTP requests with authentication via [`src/utils/api_key.py`](https://github.com/virattt/ai-hedge-fund/blob/main/src/utils/api_key.py), supporting production deployments without code changes.
- **Agent Agnosticism**: Trading agents in `src/agents/` call the unified interface, remaining decoupled from specific LLM providers.
- **Service Exposure**: FastAPI endpoints in [`app/backend/services/ollama_service.py`](https://github.com/virattt/ai-hedge-fund/blob/main/app/backend/services/ollama_service.py) wrap local Ollama operations for containerized environments.

## Frequently Asked Questions

### How does ai-hedge-fund decide which LLM provider to use?

The system reads the `LLM_BACKEND` environment variable in [`src/utils/llm.py`](https://github.com/virattt/ai-hedge-fund/blob/main/src/utils/llm.py) when the module is imported, so the variable must be set before the module is first loaded. If set to `"ollama"`, it routes requests to a local server; if `"openai"`, it sends requests to the cloud API; any other value raises a `ValueError`. The default value is `"ollama"`, ensuring local-first operation when no configuration is provided.

### Can I run ai-hedge-fund completely offline?

Yes, for LLM inference. With `LLM_BACKEND=ollama` and your models pre-downloaded using `ensure_ollama_and_model()`, all inference happens locally via the Ollama server running at `OLLAMA_BASE_URL` (default `http://localhost:11434`) and needs no internet connectivity. Any part of a run that fetches external data, such as market data, would still need network access.

### What changes are required to switch from Ollama to OpenAI?

No changes are required to how agents call `chat()`. Set the environment variable `LLM_BACKEND=openai` and provide your API key through `OPENAI_API_KEY` or [`src/utils/api_key.py`](https://github.com/virattt/ai-hedge-fund/blob/main/src/utils/api_key.py). The `chat()` function then routes to `_openai_chat()`, changing the request URL from your local Ollama instance to `https://api.openai.com/v1/chat/completions`. Because each backend returns the provider's raw JSON, code that reads the reply must use the OpenAI shape (`result["choices"][0]["message"]["content"]`) instead of the Ollama shape (`response["message"]["content"]`).

### Where is the API key stored for cloud LLM access?

API keys are managed through [`src/utils/api_key.py`](https://github.com/virattt/ai-hedge-fund/blob/main/src/utils/api_key.py), which provides a centralized `get_api_key()` function. For OpenAI, the key is retrieved via `api_key.get_api_key("openai")` and injected into the Authorization header as a Bearer token in [`src/utils/llm.py`](https://github.com/virattt/ai-hedge-fund/blob/main/src/utils/llm.py).