How to Configure Custom Models for Ollama or LiteLLM in pi-ai
You configure custom Ollama or LiteLLM models in pi-ai by creating a ~/.pi/agent/models.json file that defines provider types, base URLs, and model IDs, which the CustomProvidersStore automatically loads without requiring a restart.
The badlogic/pi-mono repository provides pi-ai with a flexible model discovery system that treats any OpenAI-compatible endpoint—including local Ollama servers, vLLM instances, or LiteLLM proxies—as a first-class provider. This configuration-driven approach lets you expose custom models in the UI by editing a single JSON file.
Configuration File Location and Structure
All custom provider definitions reside in ~/.pi/agent/models.json. This file is watched by the CustomProvidersStore located at packages/web-ui/src/storage/stores/custom-providers-store.ts, which means changes are applied immediately without restarting the application.
The top-level structure requires a providers object containing one or more provider configurations:
{
  "providers": {
    "ollama": {
      "baseUrl": "http://localhost:11434/v1",
      "api": "openai-completions",
      "apiKey": "placeholder",
      "models": []
    }
  }
}
Each provider entry requires four fields to satisfy the schema validated by the discovery layer:
- baseUrl – The root endpoint of your LLM server (must include /v1 for OpenAI-compatible APIs)
- api – The API format, typically "openai-completions" for Ollama, vLLM, and LiteLLM
- apiKey – Required by the schema but ignored by Ollama; any string works
- models – An array of model objects or simple ID strings
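The four required fields can be checked before the file is loaded. The following is a minimal sketch, not pi-ai's actual validator: the ProviderConfig type and the checkProvider function are hypothetical names introduced for illustration, and the real schema may enforce more or different rules.

```typescript
// Hypothetical shape mirroring the four fields described above.
// This is an illustration, not the schema shipped with pi-ai.
interface ProviderConfig {
  baseUrl: string;
  api: string;
  apiKey: string;
  models: Array<string | { id: string }>;
}

// Returns human-readable problems; an empty array means the entry looks valid.
function checkProvider(name: string, raw: unknown): string[] {
  if (typeof raw !== "object" || raw === null) {
    return [`${name}: provider entry must be an object`];
  }
  const entry = raw as Record<string, unknown>;
  const problems: string[] = [];

  for (const field of ["baseUrl", "api", "apiKey"]) {
    const value = entry[field];
    if (typeof value !== "string" || value.length === 0) {
      problems.push(`${name}: "${field}" must be a non-empty string`);
    }
  }
  if (!Array.isArray(entry["models"])) {
    problems.push(`${name}: "models" must be an array`);
  }

  // The text above says OpenAI-compatible base URLs must include /v1.
  const baseUrl = entry["baseUrl"];
  if (typeof baseUrl === "string" && !baseUrl.replace(/\/+$/, "").endsWith("/v1")) {
    problems.push(`${name}: "baseUrl" should end with /v1`);
  }
  return problems;
}

const sampleProvider: ProviderConfig = {
  baseUrl: "http://localhost:11434/v1",
  api: "openai-completions",
  apiKey: "ollama",
  models: [{ id: "llama3.1:8b" }],
};
// Logs an empty array because the sample entry has no problems.
console.log(checkProvider("ollama", sampleProvider));
```

Returning a list of problems rather than throwing on the first one lets you report every mistake in a provider entry at once.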
Minimal Ollama Configuration
For a standard Ollama installation running locally, you only need to specify the model IDs you want to expose. The discoverOllamaModels function in packages/web-ui/src/utils/model-discovery.ts automatically queries http://localhost:11434/api/tags to validate that the models support tool calling.
Create ~/.pi/agent/models.json with the following content:
{
  "providers": {
    "ollama": {
      "baseUrl": "http://localhost:11434/v1",
      "api": "openai-completions",
      "apiKey": "ollama",
      "models": [
        { "id": "llama3.1:8b" },
        { "id": "qwen2.5-coder:7b" }
      ]
    }
  }
}
Once saved, open the /model view in the pi-ai interface. The ModelSelector component in packages/web-ui/src/dialogs/ModelSelector.ts merges these entries with built-in providers, making llama3.1:8b and qwen2.5-coder:7b selectable options.
Full Model Configuration Options
To override default discovery behavior or customize how a model appears in the UI, specify the complete model object structure. The discovery code in model-discovery.ts uses these values to construct the Model objects consumed by the agent.
{
  "providers": {
    "ollama": {
      "baseUrl": "http://localhost:11434/v1",
      "api": "openai-completions",
      "apiKey": "ollama",
      "models": [
        {
          "id": "llama3.1:8b",
          "name": "Llama 3.1 8B (local)",
          "reasoning": false,
          "input": ["text"],
          "contextWindow": 128000,
          "maxTokens": 32000,
          "cost": {
            "input": 0,
            "output": 0,
            "cacheRead": 0,
            "cacheWrite": 0
          }
        }
      ]
    }
  }
}
Key fields explained:
- id – The exact model identifier passed to the server endpoint (e.g., llama3.1:8b)
- name – The display label shown in the model selector; defaults to the ID if omitted
- reasoning – Boolean flag enabling the thinking stream for models that support structured reasoning
- input – Array of supported input types: ["text"] or ["text", "image"] for multimodal models
- contextWindow – Maximum token context the model can process; discovery defaults to 128,000 for Ollama
- maxTokens – Upper limit for generated tokens in a single response
- cost – Pricing per million tokens; use 0 for all fields when running local models
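To make the defaults and the per-million-token pricing concrete, here is a small sketch of how a models entry might be expanded into a full object. The resolveModel and estimateCost helpers are hypothetical. Only the name fallback and the 128,000 context default come from the field descriptions above; the other defaults are assumptions chosen for the example.

```typescript
interface ModelCost {
  input: number; // price per million input tokens
  output: number; // price per million output tokens
  cacheRead: number;
  cacheWrite: number;
}

interface ResolvedModel {
  id: string;
  name: string;
  reasoning: boolean;
  input: string[];
  contextWindow: number;
  maxTokens: number;
  cost: ModelCost;
}

// A models[] entry may be a bare ID string or a partial object with an id.
type ModelEntry = string | (Partial<ResolvedModel> & { id: string });

// Hypothetical helper that fills in omitted fields.
function resolveModel(entry: ModelEntry): ResolvedModel {
  const partial: Partial<ResolvedModel> & { id: string } =
    typeof entry === "string" ? { id: entry } : entry;
  return {
    id: partial.id,
    name: partial.name ?? partial.id, // display label falls back to the ID
    reasoning: partial.reasoning ?? false, // assumed default
    input: partial.input ?? ["text"], // assumed default
    contextWindow: partial.contextWindow ?? 128000,
    maxTokens: partial.maxTokens ?? 32000, // assumed default
    cost: partial.cost ?? { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
  };
}

// Cost fields are per million tokens, so token counts are divided by 1e6.
function estimateCost(model: ResolvedModel, inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1_000_000) * model.cost.input +
    (outputTokens / 1_000_000) * model.cost.output
  );
}

const localModel = resolveModel("llama3.1:8b");
console.log(localModel.name, estimateCost(localModel, 50_000, 2_000));
```

With all cost fields set to 0, as recommended for local models, the estimate is always zero regardless of token counts.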
Configuring LiteLLM and vLLM Endpoints
For LiteLLM proxies, vLLM servers, or other OpenAI-compatible local inference engines, use the same configuration pattern with adjusted provider keys and base URLs. The discoverVLLMModels function specifically maps max_model_len to the contextWindow and applies token caps based on the server's reported limits.
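To connect a vLLM instance, list each model under the ID that the server reports at /v1/models. The ID below is illustrative; substitute the name of the model your vLLM server actually serves: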
{
  "providers": {
    "vllm": {
      "baseUrl": "http://localhost:8000/v1",
      "api": "openai-completions",
      "apiKey": "vllm",
      "models": [
        { "id": "gemini-2.0-flash-lite" }
      ]
    }
  }
}
A LiteLLM proxy exposes an OpenAI-compatible API (on port 4000 by default), so the configuration looks like this:
{
  "providers": {
    "litellm": {
      "baseUrl": "http://localhost:4000/v1",
      "api": "openai-completions",
      "apiKey": "sk-your-lite-llm-key",
      "models": [
        { "id": "gpt-4o-mini" },
        { "id": "claude-3-haiku" }
      ]
    }
  }
}
The discovery layer treats all these identically, querying the /v1/models endpoint (or Ollama-specific endpoints) to validate availability and capabilities.
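The availability check can be reproduced by hand: an OpenAI-compatible server returns its model list as a JSON object with a data array of entries that each carry an id. The sketch below compares that list with the IDs in your configuration. The modelsEndpoint and findMissingModels helpers are hypothetical names and are not part of pi-ai.

```typescript
// Shape of an OpenAI-compatible GET /v1/models response (only the fields used here).
interface ModelListResponse {
  data: Array<{ id: string }>;
}

// baseUrl already ends in /v1, so the listing lives at <baseUrl>/models.
function modelsEndpoint(baseUrl: string): string {
  return `${baseUrl.replace(/\/+$/, "")}/models`;
}

// Hypothetical helper: returns the configured IDs that the server does not report.
function findMissingModels(configuredIds: string[], response: ModelListResponse): string[] {
  const served = new Set(response.data.map((m) => m.id));
  return configuredIds.filter((id) => !served.has(id));
}

// Example using a hand-written response body instead of a live request.
const exampleBody: ModelListResponse = { data: [{ id: "gpt-4o-mini" }] };
console.log(modelsEndpoint("http://localhost:4000/v1"));
console.log(findMissingModels(["gpt-4o-mini", "claude-3-haiku"], exampleBody));
```

In this example the second call reports claude-3-haiku as missing, which is the kind of mismatch that would keep a configured model from working.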
How Model Discovery Works
The discovery pipeline centers on packages/web-ui/src/utils/model-discovery.ts, which exports provider-specific discovery functions:
- discoverOllamaModels(baseUrl) – Uses the ollama/browser SDK to fetch available tags, filters for models supporting the tools capability, and applies a 10× multiplier to calculate the contextWindow based on reported parameters
- discoverVLLMModels(baseUrl) – Queries the vLLM OpenAI-compatible model endpoint and maps max_model_len to context window constraints
- discoverLlamaCppModels and discoverLMStudioModels – Handle GGUF-based local servers
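As an illustration of the vLLM mapping, the sketch below turns one entry from a vLLM model listing into a context window and an output cap. It is not the code in model-discovery.ts: the fromVllmCard name, the fallback context size, and the output cap are all assumptions made for the example.

```typescript
// Subset of a vLLM model card as returned by its model listing endpoint.
interface VllmModelCard {
  id: string;
  max_model_len?: number;
}

interface DiscoveredModel {
  id: string;
  contextWindow: number;
  maxTokens: number;
}

// Assumed values for illustration only; pi-ai's real fallbacks and caps may differ.
const FALLBACK_CONTEXT = 8192;
const OUTPUT_CAP = 4096;

function fromVllmCard(card: VllmModelCard): DiscoveredModel {
  // max_model_len becomes the context window when the server reports it.
  const contextWindow = card.max_model_len ?? FALLBACK_CONTEXT;
  return {
    id: card.id,
    contextWindow,
    // Never allow more output tokens than the context window can hold.
    maxTokens: Math.min(contextWindow, OUTPUT_CAP),
  };
}

console.log(fromVllmCard({ id: "example-model", max_model_len: 32768 }));
```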
When CustomProvidersStore.reload() detects a file change in ~/.pi/agent/models.json, it triggers discoverModels() for each configured provider. The results merge with built-in models in the ModelSelector UI component at packages/web-ui/src/dialogs/ModelSelector.ts.
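The merge step can be pictured as follows. This is a sketch under stated assumptions rather than the ModelSelector implementation: the ListedModel type and mergeModels function are hypothetical, and the rule that a custom entry replaces a built-in entry with the same provider and ID is an assumption.

```typescript
interface ListedModel {
  provider: string;
  id: string;
  name: string;
}

// Hypothetical merge: entries are keyed by "provider/id", and a custom entry
// replaces a built-in one with the same key. The real ModelSelector may differ.
function mergeModels(builtIn: ListedModel[], custom: ListedModel[]): ListedModel[] {
  const byKey = new Map<string, ListedModel>();
  for (const model of [...builtIn, ...custom]) {
    byKey.set(`${model.provider}/${model.id}`, model);
  }
  // Map keeps insertion order, so built-in entries stay ahead of new custom ones.
  return Array.from(byKey.values());
}

const mergedExample = mergeModels(
  [{ provider: "openai", id: "gpt-4o", name: "GPT-4o" }],
  [{ provider: "ollama", id: "llama3.1:8b", name: "Llama 3.1 8B (local)" }],
);
console.log(mergedExample.map((m) => `${m.provider}/${m.id}`));
```

Keying on provider plus ID keeps two providers that serve a model under the same ID from overwriting each other.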
Step-by-Step Setup and Verification
1. Create the configuration directory and file:
mkdir -p ~/.pi/agent
cat > ~/.pi/agent/models.json <<'EOF'
{
  "providers": {
    "ollama": {
      "baseUrl": "http://localhost:11434/v1",
      "api": "openai-completions",
      "apiKey": "ollama",
      "models": [
        { "id": "llama3.1:8b" },
        { "id": "qwen2.5-coder:7b" }
      ]
    }
  }
}
EOF
2. Verify discovery programmatically:
If you need to test the discovery logic outside the UI, import the discovery function directly from the source:
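import { discoverOllamaModels } from "./packages/web-ui/src/utils/model-discovery.js";

// Note: this call passes the Ollama host without the /v1 suffix used in models.json.
(async () => {
  const models = await discoverOllamaModels("http://localhost:11434");
  console.log("Discovered models:", models.map((m) => m.id));
})().catch((err) => console.error("Discovery failed:", err));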
3. Force reload via the internal API:
While the file watcher handles most updates, you can manually trigger a reload in a Node environment:
import { CustomProvidersStore } from "./packages/web-ui/src/storage/stores/custom-providers-store.js";
await new CustomProvidersStore().reload();
4. Confirm in the UI:
Navigate to the /model tab. The selector should display your custom models alongside built-in options like GPT-4o and Claude. If a model is missing, check that the Ollama server is running and that the model ID matches exactly what ollama list returns.
Summary
- Configuration location: Store all custom provider settings in ~/.pi/agent/models.json.
- Provider flexibility: Use "ollama", "vllm", "llama.cpp", "lmstudio", or custom keys for LiteLLM proxies.
- Required fields: Every provider needs baseUrl, api (typically "openai-completions"), apiKey (placeholder for local servers), and a models array.
- Automatic reload: The CustomProvidersStore watches the file and updates the ModelSelector UI without requiring an application restart.
- Deep configuration: Override contextWindow, maxTokens, reasoning flags, and cost structures to customize how local models behave compared to cloud APIs.
Frequently Asked Questions
Does pi-ai support LiteLLM as a provider?
Yes, LiteLLM exposes an OpenAI-compatible API that works with the "openai-completions" API type. Configure it like any other provider by setting the baseUrl to your LiteLLM proxy address (e.g., http://localhost:4000/v1) and providing the actual API key issued by your LiteLLM instance. The discovery layer will query the LiteLLM /v1/models endpoint to populate the selector.
Why does Ollama require an API key field if it doesn't use authentication?
The models.json schema requires an apiKey field for all providers to maintain type consistency across cloud and local services. For Ollama, you can enter any placeholder string such as "ollama" or "none". The discoverOllamaModels function does not send this value to the local server, as Ollama does not implement API key validation.
How do I configure multimodal inputs for local vision models?
Set the input array to ["text", "image"] in your model configuration. When the ModelSelector in packages/web-ui/src/dialogs/ModelSelector.ts builds the model list, it checks this property to determine whether to enable image upload buttons in the chat interface. Ensure your local server (Ollama with LLaVA, or vLLM with vision support) actually serves the model with image processing capabilities.
What happens if the local server is offline when pi-ai starts?
The discoverOllamaModels and related functions catch connection errors gracefully. If the server is unreachable, the discovery layer returns an empty array for that provider, and its models do not appear in the selector. When the server later comes online, models.json itself has not changed, so the file watcher does not fire. To trigger a fresh discovery cycle, either touch the models.json file or call CustomProvidersStore.reload().
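The empty-array behavior can be sketched as a wrapper around any discovery function. The Discover type and discoverOrEmpty helper below are hypothetical and only illustrate the pattern; the actual error handling inside model-discovery.ts may be structured differently.

```typescript
type Discover = (baseUrl: string) => Promise<string[]>;

// Hypothetical wrapper: any failure (for example a refused connection)
// is logged and turned into an empty model list instead of propagating.
async function discoverOrEmpty(discover: Discover, baseUrl: string): Promise<string[]> {
  try {
    return await discover(baseUrl);
  } catch (err) {
    console.warn(`Discovery failed for ${baseUrl}:`, err);
    return [];
  }
}

// Stand-in for a real discovery function talking to an offline server.
const offlineDiscover: Discover = async () => {
  throw new Error("connect ECONNREFUSED");
};

discoverOrEmpty(offlineDiscover, "http://localhost:11434").then((ids) => {
  // The list is empty because the connection error was caught above.
  console.log("models found:", ids.length);
});
```

Because the failure is swallowed rather than retried, nothing re-runs discovery on its own, which is why a manual reload is needed once the server is back.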