# How to Configure Custom Models for Ollama or LiteLLM in pi-ai

> Configure custom Ollama or LiteLLM models in pi-ai effortlessly. Create a models.json file to define providers and model IDs, loading them instantly without restarts.

- Repository: [Mario Zechner/pi-mono](https://github.com/badlogic/pi-mono)
- Tags: how-to-guide
- Published: 2026-02-16

---

**You configure custom Ollama or LiteLLM models in pi-ai by creating a `~/.pi/agent/models.json` file that defines provider types, base URLs, and model IDs, which the `CustomProvidersStore` automatically loads without requiring a restart.**

The `badlogic/pi-mono` repository provides pi-ai with a flexible model discovery system that treats any OpenAI-compatible endpoint (including local Ollama servers, vLLM instances, or LiteLLM proxies) as a first-class provider. This configuration-driven approach lets you expose custom models in the UI by editing a single JSON file.

## Configuration File Location and Structure

All custom provider definitions reside in `~/.pi/agent/models.json`. This file is watched by the `CustomProvidersStore` located at [`packages/web-ui/src/storage/stores/custom-providers-store.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/web-ui/src/storage/stores/custom-providers-store.ts), which means changes are applied immediately without restarting the application.

The top-level structure requires a `providers` object containing one or more provider configurations:

```json
{
  "providers": {
    "ollama": {
      "baseUrl": "http://localhost:11434/v1",
      "api": "openai-completions",
      "apiKey": "placeholder",
      "models": []
    }
  }
}
```

Each provider entry requires four fields to satisfy the schema validated by the discovery layer:

- **`baseUrl`** – The root endpoint of your LLM server (must include `/v1` for OpenAI-compatible APIs)
- **`api`** – The API format, typically `"openai-completions"` for Ollama, vLLM, and LiteLLM
- **`apiKey`** – Required by the schema but ignored by Ollama; any string works
- **`models`** – An array of model objects or simple ID strings
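
The four required fields can be checked before the file is saved. The following is a minimal sketch, assuming a hand-written validator: `checkProviderConfig` and `REQUIRED_FIELDS` are hypothetical names introduced here for illustration, and the actual schema validation in pi-mono may apply different or stricter rules.

```typescript
// Hypothetical validator for illustration; the real schema check in pi-mono may differ.
const REQUIRED_FIELDS = ["baseUrl", "api", "apiKey", "models"] as const;

// Returns human-readable problems; an empty array means the entry has the expected shape.
function checkProviderConfig(providerKey: string, value: unknown): string[] {
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return [`${providerKey}: provider entry must be an object`];
  }
  const entry = value as Record<string, unknown>;
  const problems: string[] = [];
  for (const field of REQUIRED_FIELDS) {
    if (!(field in entry)) {
      problems.push(`${providerKey}: missing required field "${field}"`);
    } else if (field === "models") {
      // `models` is the only required field that is not a string.
      if (!Array.isArray(entry[field])) {
        problems.push(`${providerKey}: "models" must be an array`);
      }
    } else if (typeof entry[field] !== "string") {
      problems.push(`${providerKey}: "${field}" must be a string`);
    }
  }
  return problems;
}

// Usage: this entry omits apiKey, so exactly one problem is reported for it.
const sampleProviders: Record<string, unknown> = {
  ollama: { baseUrl: "http://localhost:11434/v1", api: "openai-completions", models: [] },
};
for (const key of Object.keys(sampleProviders)) {
  console.log(checkProviderConfig(key, sampleProviders[key]));
}
```

Returning a list of problems rather than throwing on the first one makes it easier to fix a hand-edited file in a single pass.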

## Minimal Ollama Configuration

For a standard Ollama installation running locally, you only need to specify the model IDs you want to expose. The `discoverOllamaModels` function in [`packages/web-ui/src/utils/model-discovery.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/web-ui/src/utils/model-discovery.ts) automatically queries `http://localhost:11434/api/tags` to validate that the models support tool calling.

Create `~/.pi/agent/models.json` with the following content:

```json
{
  "providers": {
    "ollama": {
      "baseUrl": "http://localhost:11434/v1",
      "api": "openai-completions",
      "apiKey": "ollama",
      "models": [
        { "id": "llama3.1:8b" },
        { "id": "qwen2.5-coder:7b" }
      ]
    }
  }
}
```

Once saved, open the **/model** view in the pi-ai interface. The `ModelSelector` component in [`packages/web-ui/src/dialogs/ModelSelector.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/web-ui/src/dialogs/ModelSelector.ts) merges these entries with built-in providers, making `llama3.1:8b` and `qwen2.5-coder:7b` selectable options.

## Full Model Configuration Options

To override default discovery behavior or customize how a model appears in the UI, specify the complete model object structure. The discovery code in [`packages/web-ui/src/utils/model-discovery.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/web-ui/src/utils/model-discovery.ts) uses these values to construct the `Model` objects consumed by the agent.

```json
{
  "providers": {
    "ollama": {
      "baseUrl": "http://localhost:11434/v1",
      "api": "openai-completions",
      "apiKey": "ollama",
      "models": [
        {
          "id": "llama3.1:8b",
          "name": "Llama 3.1 8B (local)",
          "reasoning": false,
          "input": ["text"],
          "contextWindow": 128000,
          "maxTokens": 32000,
          "cost": {
            "input": 0,
            "output": 0,
            "cacheRead": 0,
            "cacheWrite": 0
          }
        }
      ]
    }
  }
}
```

**Key fields explained:**

- **`id`** – The exact model identifier passed to the server endpoint (e.g., `llama3.1:8b`)
- **`name`** – The display label shown in the model selector; defaults to the ID if omitted
- **`reasoning`** – Boolean flag enabling the thinking stream for models that support structured reasoning
- **`input`** – Array of supported input types: `["text"]` or `["text", "image"]` for multimodal models
- **`contextWindow`** – Maximum token context the model can process; discovery defaults to 128,000 for Ollama
- **`maxTokens`** – Upper limit for generated tokens in a single response
- **`cost`** – Pricing per million tokens; use `0` for all fields when running local models
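
To make the defaults and the per-million pricing concrete, here is a rough sketch of how an entry could be expanded and how a cost could be computed from it. The types and the functions `resolveModel` and `estimateCost` are hypothetical and not taken from the repository. Only the `name` fallback and the 128,000-token context default come from the field descriptions above; the other defaults (including the `maxTokens` value) are assumptions chosen for illustration.

```typescript
// Hypothetical shapes for illustration only; not the repository's actual types.
interface CostPerMillion {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
}

interface ModelOptions {
  id: string;
  name?: string;
  reasoning?: boolean;
  input?: string[];
  contextWindow?: number;
  maxTokens?: number;
  cost?: CostPerMillion;
}

type ResolvedModel = Required<ModelOptions>;

// Accepts either a bare ID string or a model object and fills in omitted fields.
function resolveModel(entry: string | ModelOptions): ResolvedModel {
  const options: ModelOptions = typeof entry === "string" ? { id: entry } : entry;
  return {
    id: options.id,
    name: options.name ?? options.id, // display label falls back to the ID
    reasoning: options.reasoning ?? false, // assumed default
    input: options.input ?? ["text"], // assumed default
    contextWindow: options.contextWindow ?? 128_000,
    maxTokens: options.maxTokens ?? 4_096, // assumed placeholder value
    cost: options.cost ?? { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
  };
}

// Cost fields are prices per million tokens, so token counts are divided by 1,000,000.
function estimateCost(cost: CostPerMillion, inputTokens: number, outputTokens: number): number {
  return (inputTokens * cost.input + outputTokens * cost.output) / 1_000_000;
}

// Usage: a local model with all-zero pricing costs nothing regardless of token counts.
const localModel = resolveModel("llama3.1:8b");
console.log(localModel.name, estimateCost(localModel.cost, 12_000, 800));
```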

## Configuring LiteLLM and vLLM Endpoints

For LiteLLM proxies, vLLM servers, or other OpenAI-compatible local inference engines, use the same configuration pattern with adjusted provider keys and base URLs. The `discoverVLLMModels` function specifically maps `max_model_len` to the `contextWindow` and applies token caps based on the server's reported limits.

To connect a vLLM instance serving an open-weights model such as Qwen2.5 7B Instruct, set the model `id` to the name the server reports at `/v1/models`:

```json
{
  "providers": {
    "vllm": {
      "baseUrl": "http://localhost:8000/v1",
      "api": "openai-completions",
      "apiKey": "vllm",
      "models": [
        { "id": "Qwen/Qwen2.5-7B-Instruct" }
      ]
    }
  }
}
```

A LiteLLM proxy exposes an OpenAI-compatible API (on port 4000 by default), so the configuration looks like this:

```json
{
  "providers": {
    "litellm": {
      "baseUrl": "http://localhost:4000/v1",
      "api": "openai-completions",
      "apiKey": "sk-your-lite-llm-key",
      "models": [
        { "id": "gpt-4o-mini" },
        { "id": "claude-3-haiku" }
      ]
    }
  }
}
```

The discovery layer handles all of these providers the same way: it queries the provider's `/v1/models` endpoint (or, for Ollama, the Ollama-specific endpoints) to validate availability and capabilities.
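
The `max_model_len` mapping mentioned at the start of this section could look roughly like the sketch below. It is not the body of `discoverVLLMModels`: the function `mapVllmModel`, both interfaces, and the two constants are hypothetical, and the fallback context size and output cap are assumed values picked only to show the shape of the logic.

```typescript
// Subset of one entry in a vLLM `/v1/models` response; only the fields used here.
interface VllmModelCard {
  id: string;
  max_model_len?: number;
}

// Hypothetical result shape for this sketch.
interface MappedModel {
  id: string;
  contextWindow: number;
  maxTokens: number;
}

const FALLBACK_CONTEXT_WINDOW = 8_192; // assumed value used when the server reports no limit
const OUTPUT_TOKEN_CAP = 4_096; // assumed upper bound for a single response

// Map the server-reported limit to a context window and derive a capped output limit.
function mapVllmModel(card: VllmModelCard): MappedModel {
  const contextWindow = card.max_model_len ?? FALLBACK_CONTEXT_WINDOW;
  return {
    id: card.id,
    contextWindow,
    // Never allow more output tokens than the context window can hold.
    maxTokens: Math.min(contextWindow, OUTPUT_TOKEN_CAP),
  };
}

// Usage with a hand-written card instead of a live server response.
console.log(mapVllmModel({ id: "Qwen/Qwen2.5-7B-Instruct", max_model_len: 32_768 }));
```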

## How Model Discovery Works

The discovery pipeline centers on [`packages/web-ui/src/utils/model-discovery.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/web-ui/src/utils/model-discovery.ts), which exports provider-specific discovery functions:

- **`discoverOllamaModels(baseUrl)`** – Uses the `ollama/browser` SDK to fetch available tags, filters for models supporting the `tools` capability, and applies a 10× multiplier to calculate the `contextWindow` based on reported parameters
- **`discoverVLLMModels(baseUrl)`** – Queries the vLLM OpenAI-compatible model endpoint and maps `max_model_len` to context window constraints
- **`discoverLlamaCppModels`** and **`discoverLMStudioModels`** – Handle GGUF-based local servers
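
The `tools` filter described for `discoverOllamaModels` can be pictured as a simple predicate over each model's capability list. This is an illustrative sketch only: `OllamaModelInfo` is a simplified, hypothetical shape, and `filterToolCapable` is not a function from the repository.

```typescript
// Hypothetical, simplified shape: a model name plus the capabilities reported for it.
interface OllamaModelInfo {
  name: string;
  capabilities: string[];
}

// Keep only the models that advertise the "tools" capability and return their names.
function filterToolCapable(models: OllamaModelInfo[]): string[] {
  return models
    .filter((model) => model.capabilities.includes("tools"))
    .map((model) => model.name);
}

// Usage with hand-written data; the capability lists here are made up for the example.
const exampleModels: OllamaModelInfo[] = [
  { name: "example-with-tools:8b", capabilities: ["completion", "tools"] },
  { name: "example-without-tools:7b", capabilities: ["completion"] },
];
console.log(filterToolCapable(exampleModels));
```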

When `CustomProvidersStore.reload()` detects a file change in `~/.pi/agent/models.json`, it triggers `discoverModels()` for each configured provider. The results merge with built-in models in the `ModelSelector` UI component at [`packages/web-ui/src/dialogs/ModelSelector.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/web-ui/src/dialogs/ModelSelector.ts).
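
One way to picture the merge step is as a keyed union of two lists. The sketch below is an assumption about the general idea, not the `ModelSelector` code: `SelectableModel` and `mergeModels` are hypothetical names, and the rule that a custom entry replaces a built-in entry with the same provider and ID is a design choice made for this example.

```typescript
// Hypothetical shape for an entry shown in the selector.
interface SelectableModel {
  provider: string;
  id: string;
  name: string;
}

// Combine built-in and custom models, keyed by "provider/id".
// Assumption: a custom entry overrides a built-in entry with the same key.
function mergeModels(builtIn: SelectableModel[], custom: SelectableModel[]): SelectableModel[] {
  const byKey = new Map<string, SelectableModel>();
  for (const model of builtIn.concat(custom)) {
    byKey.set(`${model.provider}/${model.id}`, model);
  }
  // Map preserves insertion order, so built-in models keep their position.
  return Array.from(byKey.values());
}

// Usage: one built-in model plus one custom Ollama model.
const mergedModels = mergeModels(
  [{ provider: "openai", id: "gpt-4o", name: "GPT-4o" }],
  [{ provider: "ollama", id: "llama3.1:8b", name: "Llama 3.1 8B (local)" }],
);
console.log(mergedModels.map((model) => `${model.provider}/${model.id}`));
```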

## Step-by-Step Setup and Verification

**1. Create the configuration directory and file:**

```bash
mkdir -p ~/.pi/agent
cat > ~/.pi/agent/models.json <<'EOF'
{
  "providers": {
    "ollama": {
      "baseUrl": "http://localhost:11434/v1",
      "api": "openai-completions",
      "apiKey": "ollama",
      "models": [
        { "id": "llama3.1:8b" },
        { "id": "qwen2.5-coder:7b" }
      ]
    }
  }
}
EOF
```

**2. Verify discovery programmatically:**

If you need to test the discovery logic outside the UI, import the discovery function directly from the source:

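```typescript
// Path is relative to the repository root; adjust it to where you run the script.
import { discoverOllamaModels } from "./packages/web-ui/src/utils/model-discovery.js";

(async () => {
  // Pass the Ollama server root (no /v1 suffix), matching the /api/tags query described earlier.
  const models = await discoverOllamaModels("http://localhost:11434");
  // Print only the IDs so they can be compared with the entries in models.json.
  console.log("Discovered models:", models.map((m) => m.id));
})();
```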

**3. Force reload via the internal API:**

While the file watcher handles most updates, you can manually trigger a reload in a Node environment:

```typescript
import { CustomProvidersStore } from "./packages/web-ui/src/storage/stores/custom-providers-store.js";

// Triggers the same reload path as a file change. Top-level await requires an ES module context.
await new CustomProvidersStore().reload();
```

**4. Confirm in the UI:**

Navigate to the **/model** view. The selector should display your custom models alongside built-in options like GPT-4o and Claude. If a model is missing, check that the Ollama server is running and that the model ID exactly matches the name shown by `ollama list`.
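
The ID comparison can also be done in code. The sketch below assumes you have already fetched the JSON from Ollama's `GET /api/tags` endpoint (the same endpoint mentioned earlier in this guide); `findMissingModelIds` is a hypothetical helper, and `OllamaTagsResponse` describes only the part of the response that the check needs.

```typescript
// The relevant part of Ollama's `GET /api/tags` response: a list of installed model names.
interface OllamaTagsResponse {
  models: { name: string }[];
}

// Return the configured IDs that the server does not report as installed.
function findMissingModelIds(configuredIds: string[], tags: OllamaTagsResponse): string[] {
  const installed = new Set(tags.models.map((model) => model.name));
  return configuredIds.filter((id) => !installed.has(id));
}

// Usage with a hand-written response; replace it with the parsed JSON from your server.
const exampleTags: OllamaTagsResponse = { models: [{ name: "llama3.1:8b" }] };
console.log(findMissingModelIds(["llama3.1:8b", "qwen2.5-coder:7b"], exampleTags));
```

Any ID this returns is either not pulled yet (`ollama pull <id>`) or spelled differently from the installed tag.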

## Summary

- **Configuration location**: Store all custom provider settings in `~/.pi/agent/models.json`
- **Provider flexibility**: Use `"ollama"`, `"vllm"`, `"llama.cpp"`, `"lmstudio"`, or custom keys for LiteLLM proxies
- **Required fields**: Every provider needs `baseUrl`, `api` (typically `"openai-completions"`), `apiKey` (placeholder for local servers), and a `models` array
- **Automatic reload**: The `CustomProvidersStore` watches the file and updates the `ModelSelector` UI without requiring an application restart
- **Deep configuration**: Override `contextWindow`, `maxTokens`, `reasoning` flags, and `cost` structures to customize how local models behave compared to cloud APIs

## Frequently Asked Questions

### Does pi-ai support LiteLLM as a provider?

Yes, LiteLLM exposes an OpenAI-compatible API that works with the `"openai-completions"` API type. Configure it like any other provider by setting the `baseUrl` to your LiteLLM proxy address (e.g., `http://localhost:4000/v1`) and providing the actual API key issued by your LiteLLM instance. The discovery layer will query the LiteLLM `/v1/models` endpoint to populate the selector.
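
Before adding the provider, it can help to confirm that the proxy accepts your key. The following sketch builds a standard OpenAI-style chat completion request. It assumes the proxy serves the usual `/chat/completions` route under the configured `baseUrl` and accepts the key as a bearer token; `buildChatRequest` and `ChatRequestSpec` are hypothetical names and are not part of pi-ai.

```typescript
// Plain description of an HTTP request, so the builder itself performs no network I/O.
interface ChatRequestSpec {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build an OpenAI-style chat completion request for the given provider settings.
function buildChatRequest(baseUrl: string, apiKey: string, modelId: string, prompt: string): ChatRequestSpec {
  const root = baseUrl.replace(/\/+$/, ""); // tolerate a trailing slash in baseUrl
  return {
    url: `${root}/chat/completions`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({
      model: modelId,
      messages: [{ role: "user", content: prompt }],
    }),
  };
}

// Usage: build the request for the LiteLLM example configuration.
const chatSpec = buildChatRequest("http://localhost:4000/v1", "sk-your-lite-llm-key", "gpt-4o-mini", "ping");
console.log(chatSpec.url);
```

To send it, pass the parts to `fetch`, for example `fetch(chatSpec.url, { method: chatSpec.method, headers: chatSpec.headers, body: chatSpec.body })`. A 401 response usually points to a wrong key rather than a wrong model ID.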

### Why does Ollama require an API key field if it doesn't use authentication?

The `models.json` schema requires an `apiKey` field for every provider so that cloud and local services share one consistent type. For Ollama, you can enter any placeholder string such as `"ollama"` or `"none"`. The `discoverOllamaModels` function does not send this value to the local server, as Ollama does not implement API key validation.

### How do I configure multimodal inputs for local vision models?

Set the `input` array to `["text", "image"]` in your model configuration. When the `ModelSelector` in [`packages/web-ui/src/dialogs/ModelSelector.ts`](https://github.com/badlogic/pi-mono/blob/main/packages/web-ui/src/dialogs/ModelSelector.ts) builds the model list, it checks this property to determine whether to enable image upload buttons in the chat interface. Ensure your local server (Ollama with LLaVA, or vLLM with vision support) actually serves the model with image processing capabilities.

### What happens if the local server is offline when pi-ai starts?

The `discoverOllamaModels` and related functions catch connection errors gracefully. If the server is unreachable, the discovery layer returns an empty array for that provider, and its models do not appear in the selector. Because `~/.pi/agent/models.json` itself has not changed when the server later comes online, the file watcher does not fire; either `touch` the file or call `CustomProvidersStore.reload()` to trigger a fresh discovery cycle.
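
The "empty array on failure" behavior can be sketched as a small wrapper around any discovery function. This illustrates the pattern and is not the repository's error handling: `discoverOrEmpty` and `DiscoverFn` are hypothetical names, and the usage below simulates an offline server with a function that always throws.

```typescript
// Any zero-argument async function that resolves to a list of discovered items.
type DiscoverFn<T> = () => Promise<T[]>;

// Run a discovery function and fall back to an empty list if it throws,
// for example because the server refused the connection.
async function discoverOrEmpty<T>(label: string, discover: DiscoverFn<T>): Promise<T[]> {
  try {
    return await discover();
  } catch (error) {
    const reason = error instanceof Error ? error.message : String(error);
    console.warn(`Discovery failed for ${label}: ${reason}`);
    return [];
  }
}

// Usage with a stand-in discovery function that simulates an unreachable server.
discoverOrEmpty<string>("ollama", async () => {
  throw new Error("connect ECONNREFUSED 127.0.0.1:11434");
}).then((models) => console.log(`ollama models available: ${models.length}`));
```

Logging the reason while still returning an empty list keeps the selector usable and leaves a trace explaining why a provider's models are missing.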