# How ModelManager Implements Smart Model Selection with Fallback Logic in Open Notebook

> Discover how ModelManager in Open Notebook uses tiered fallback logic for smart model selection, prioritizing linked credentials and environment variables for robust AI integration.

- Repository: [Luis Novo/open-notebook](https://github.com/lfnovo/open-notebook)
- Tags: deep-dive
- Published: 2026-06-07

---

**ModelManager in [`open_notebook/ai/models.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/ai/models.py) resolves AI models through a tiered fallback system that prioritizes linked credentials, falls back to environment variables, and gracefully returns `None` when defaults are missing.**

The `ModelManager` class in the open-notebook repository automates AI model resolution for workflows. By combining database-backed defaults with credential-aware provider configuration, it delivers smart model selection with fallback logic that keeps applications running even when user configuration is incomplete.

## ModelManager Architecture

`ModelManager` is defined in [`open_notebook/ai/models.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/ai/models.py) and serves as the central service that determines which AI model a workflow should use. It orchestrates data from the `DefaultModels` singleton, the `Model` domain records, and the credential system to produce a ready-to-use Esperanto model instance. SurrealDB access is handled through the `repo_query` and `ensure_record_id` utilities in [`open_notebook/database/repository.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/database/repository.py).

## The Smart Model Selection Pipeline

The smart selection behavior follows a coordinated pipeline that translates a high-level request into a concrete provider client.

### Loading Default Model IDs

When a workflow requests a model, `ModelManager.get_default_model()` reads the singleton `DefaultModels` record (lines 73–84 in [`open_notebook/ai/models.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/ai/models.py)). This record stores IDs such as `default_chat_model`, `large_context_model`, and other category-specific defaults. Storing defaults in a single record allows the UI to expose one configurable default per model type without scattering configuration across multiple tables.
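
The lookup can be pictured as a mapping from a requested model type to a field on the defaults record. The following is a minimal sketch, not the repository's code: a plain dataclass stands in for the `DefaultModels` record, and the `DEFAULT_FIELDS` mapping, the `default_embedding_model` field name, and the `default_id_for` helper are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DefaultModelsSketch:
    """Stand-in for the DefaultModels singleton record (illustrative only)."""

    default_chat_model: Optional[str] = None
    large_context_model: Optional[str] = None
    default_embedding_model: Optional[str] = None  # hypothetical field name


# Hypothetical mapping from a requested model type to the field storing its ID.
DEFAULT_FIELDS = {
    "chat": "default_chat_model",
    "large_context": "large_context_model",
    "embedding": "default_embedding_model",
}


def default_id_for(defaults: DefaultModelsSketch, model_type: str) -> Optional[str]:
    """Return the configured model ID for a type, or None when unset or unknown."""
    field_name = DEFAULT_FIELDS.get(model_type)
    if field_name is None:
        return None
    return getattr(defaults, field_name)


defaults = DefaultModelsSketch(default_chat_model="model:abc")
print(default_id_for(defaults, "chat"))
print(default_id_for(defaults, "embedding"))
```

Keeping every default on one record means a settings screen only has to read and write a single object, which is the benefit the paragraph above describes.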

### Resolving the Concrete Model Record

`ModelManager.get_model()` fetches the `Model` object from SurrealDB by its record ID, using `repo_query` and `ensure_record_id` from [`open_notebook/database/repository.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/database/repository.py) to handle database communication. During resolution, it validates that the stored `type` field is one of the supported categories: `language`, `embedding`, `speech_to_text`, or `text_to_speech`. This validation guarantees that only known model types are instantiated downstream.
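
The type check can be sketched in a few lines. This is an illustration under assumptions: the four category names come from the text above, while the `SUPPORTED_MODEL_TYPES` constant, the `validate_model_type` helper, and the choice of `ValueError` are hypothetical and may not match what the repository actually raises.

```python
# The four categories named in the text; the constant name is hypothetical.
SUPPORTED_MODEL_TYPES = frozenset(
    {"language", "embedding", "speech_to_text", "text_to_speech"}
)


def validate_model_type(model_type: str) -> str:
    """Return the type unchanged if supported, otherwise raise ValueError."""
    if model_type not in SUPPORTED_MODEL_TYPES:
        raise ValueError(f"Invalid model type: {model_type!r}")
    return model_type


print(validate_model_type("embedding"))
```

Rejecting unknown types before any provider client is built keeps a malformed database record from reaching the factory layer.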

### Building the Provider Configuration

Provider configuration follows a priority-based fallback chain. If the `Model` links to a `Credential`, `model.get_credential_obj()` loads the stored secret and calls `credential.to_esperanto_config()`. The credential-to-Esperanto mapping lives in [`open_notebook/domain/credential.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/domain/credential.py).

If no credential exists or loading fails, `ModelManager` calls `provision_provider_keys(model.provider)` from [`open_notebook/ai/key_provider.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/ai/key_provider.py) to pull the required API keys from environment variables. This credential-to-environment-variable cascade ensures the system remains functional even when users have not configured stored credentials.

### Instantiating the Esperanto Model

Once configuration is resolved, `ModelManager` delegates instantiation to the appropriate `AIFactory` method in [`open_notebook/ai/provision.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/ai/provision.py), such as `create_language`, `create_embedding`, or `create_speech_to_text`, depending on `model.type`. The provider name is normalized via `model.provider.replace("_", "-")` before being passed to the factory.
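
The normalize-then-dispatch step can be sketched with a lookup table. This is a hedged illustration: the three functions below are local stand-ins that only echo their arguments, and their signatures are not the real `AIFactory` signatures. `FACTORY_METHODS` and `build_model` are hypothetical names; only the `replace("_", "-")` normalization is taken from the text.

```python
from typing import Any, Callable, Dict, Tuple

Built = Tuple[str, str, str, Dict[str, Any]]


# Stand-ins for the factory methods; they only record what they were asked to build.
def create_language(provider: str, name: str, config: Dict[str, Any]) -> Built:
    return ("language", provider, name, config)


def create_embedding(provider: str, name: str, config: Dict[str, Any]) -> Built:
    return ("embedding", provider, name, config)


def create_speech_to_text(provider: str, name: str, config: Dict[str, Any]) -> Built:
    return ("speech_to_text", provider, name, config)


# Hypothetical dispatch table keyed by model type.
FACTORY_METHODS: Dict[str, Callable[[str, str, Dict[str, Any]], Built]] = {
    "language": create_language,
    "embedding": create_embedding,
    "speech_to_text": create_speech_to_text,
}


def build_model(model_type: str, provider: str, name: str, config: Dict[str, Any]) -> Built:
    """Normalize the provider name, then call the factory for the model type."""
    normalized = provider.replace("_", "-")  # same normalization as in the text
    try:
        create = FACTORY_METHODS[model_type]
    except KeyError:
        raise ValueError(f"Unsupported model type: {model_type!r}") from None
    return create(normalized, name, config)


print(build_model("language", "openai_compatible", "my-model", {"temperature": 0.2}))
```

A table keeps the type-to-factory relationship in one place, so supporting another category means adding one entry rather than another conditional branch.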

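Esperanto internally caches the instantiated model objects, so `ModelManager` does not need its own cache. Subsequent calls for the same model reuse the cached provider instance instead of constructing a new client, although resolving the `Model` record itself still goes through the database lookup described above.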

## Fallback Logic for Model Credentials and Defaults

The fallback strategies in `ModelManager` are designed to keep workflows running when configuration is incomplete. Each layer of the fallback chain has a specific responsibility, from secure credential storage to environment-based key provisioning.

### Credential to Environment Variable Cascade

The fallback logic follows a strict priority:

1. **Credential linked to model** — `model.get_credential_obj()` attempts to load the stored credential.
2. **Environment variables** — If the credential is missing or corrupt, `provision_provider_keys()` loads keys from environment variables.
3. **Raised error** — If both sources fail, a `ConfigurationError` is raised so the problem surfaces immediately rather than producing silent failures.
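
The three tiers above can be sketched as a single function. This is a minimal illustration under stated assumptions, not the repository's implementation: `ConfigurationError` here is a local stand-in class, `ENV_KEYS` is a hypothetical provider-to-variable mapping, and `resolve_config` with its `load_credential` callback is an invented helper that takes the place of `model.get_credential_obj()` and `provision_provider_keys()`.

```python
import os
from typing import Any, Callable, Dict, Mapping, Optional


class ConfigurationError(Exception):
    """Local stand-in for the error type named in the text."""


# Hypothetical mapping from provider name to the environment variable holding its key.
ENV_KEYS = {"openai": "OPENAI_API_KEY", "anthropic": "ANTHROPIC_API_KEY"}


def resolve_config(
    provider: str,
    load_credential: Callable[[], Optional[Dict[str, Any]]],
    env: Mapping[str, str] = os.environ,
) -> Dict[str, Any]:
    """Try the linked credential, then environment variables, then fail loudly."""
    # Tier 1: credential linked to the model.
    try:
        config = load_credential()
    except Exception:
        config = None  # a corrupt credential is treated like a missing one
    if config:
        return config

    # Tier 2: environment variables.
    env_name = ENV_KEYS.get(provider)
    api_key = env.get(env_name) if env_name else None
    if api_key:
        return {"api_key": api_key}

    # Tier 3: nothing usable was found.
    raise ConfigurationError(f"No credential or environment key for {provider!r}")


print(resolve_config("openai", lambda: None, env={"OPENAI_API_KEY": "from-env"}))
```

Passing the environment as a parameter is a choice made for this sketch so that each tier can be exercised without touching the real process environment.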

### Missing Default Handling

When a default model ID is absent from the `DefaultModels` record, the manager logs a clear warning via `logger.warning` and returns `None`. Callers such as the chat graph can then decide to use a hard-coded fallback or prompt the user to configure a default. This pattern prevents crashes during initial setup.
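
From the caller's side, the pattern looks roughly like the sketch below. It is illustrative only: `get_default_model_sketch` is a hypothetical stand-in for the manager method, the standard `logging` module is used in place of whatever logger the repository configures, and the `answer` coroutine is an invented caller.

```python
import asyncio
import logging
from typing import Optional

logger = logging.getLogger("model_manager_sketch")


async def get_default_model_sketch(
    configured_id: Optional[str], model_type: str
) -> Optional[str]:
    """Return a placeholder model, or None (with a warning) when no ID is configured."""
    if not configured_id:
        logger.warning("No default model configured for type %r", model_type)
        return None
    return f"<model {configured_id}>"


async def answer(configured_id: Optional[str]) -> str:
    """Hypothetical caller that decides what to do when no default exists."""
    model = await get_default_model_sketch(configured_id, "chat")
    if model is None:
        return "Please configure a default chat model in settings."
    return f"answering with {model}"


print(asyncio.run(answer(None)))
print(asyncio.run(answer("model:abc")))
```

Returning `None` instead of raising leaves the policy decision with the caller, which can prompt for configuration, substitute another model, or abort.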

## Large-Context Override

Workloads that need to process very long prompts can request `ModelManager.get_default_model("large_context")`. The manager looks up the `large_context_model` field in the `DefaultModels` singleton. If that field is set, the large-context model supersedes the regular chat model. This enables automatic selection of a model with a bigger context window without adding conditional logic to individual workflows.
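
A workflow might pick the lookup key with a small helper like the one below. This is a sketch under assumptions: the `default_key_for` helper and the character-count cutoff are hypothetical, and the source does not say how a workload decides that a prompt is long enough to warrant the large-context model.

```python
# Hypothetical cutoff; the real criterion for "very long" is not given in the text.
LARGE_PROMPT_CHARS = 400_000


def default_key_for(prompt: str, cutoff: int = LARGE_PROMPT_CHARS) -> str:
    """Choose which default to request based on a rough prompt-length check."""
    return "large_context" if len(prompt) > cutoff else "chat"


print(default_key_for("Summarize this paragraph."))
print(default_key_for("x" * 11, cutoff=10))
```

The returned key would then be passed to `get_default_model`, keeping the length check in one helper rather than repeating it across workflows.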

## Practical Usage Examples

The following examples demonstrate how to interact with `ModelManager` from application code. Each call is asynchronous and supports provider-specific overrides such as `temperature` and `max_tokens`.

```python
import asyncio

from open_notebook.ai.models import model_manager


async def main() -> None:
    # 1. Get the default chat model (the configured default_chat_model).
    #    Extra keyword arguments are passed through as overrides.
    chat_model = await model_manager.get_default_model("chat", temperature=0.7)

    # 2. Get a specific model by its DB record ID, with per-call overrides.
    custom_model = await model_manager.get_model(
        "model:open_notebook:custom-llama-2",
        temperature=0.5,
        max_tokens=1024,
    )

    # 3. Request the speech-to-text model; falls back to env vars if no credential is linked.
    stt = await model_manager.get_speech_to_text()

    # 4. Retrieve the embedding model; returns None if no default is configured.
    emb = await model_manager.get_embedding_model()
    if emb is None:
        raise RuntimeError("Embedding model not configured")


if __name__ == "__main__":
    asyncio.run(main())
```

## Summary

- `ModelManager` centralizes smart model selection with fallback logic in [`open_notebook/ai/models.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/ai/models.py) by reading the `DefaultModels` singleton and resolving concrete `Model` records from SurrealDB.
- The credential-to-environment-variable cascade in [`open_notebook/ai/key_provider.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/ai/key_provider.py) and [`open_notebook/domain/credential.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/domain/credential.py) ensures provider keys are discovered even when stored credentials are absent.
- Model types are validated against supported categories before instantiation, preventing invalid provider configurations.
- Esperanto caches instances internally, so `ModelManager` remains stateless while still delivering fast repeated access.
- Missing defaults return `None` with a logged warning, letting callers decide how to handle incomplete configuration.

## Frequently Asked Questions

### What happens if a model has no credential and no environment variables are set?

If the model lacks a linked credential and `provision_provider_keys()` cannot find the required environment variables, `ModelManager` raises a `ConfigurationError`. This early failure surfaces the missing configuration before any inference request is attempted.

### How does ModelManager choose between a default chat model and a large-context model?

When a workflow calls `get_default_model("large_context")`, the manager checks the `large_context_model` field in the `DefaultModels` record. If that field contains a valid model ID, it supersedes the standard chat default. Workloads that do not request the large-context variant continue to receive the regular `default_chat_model`.

### Where does the actual API key priority logic live?

The priority logic is split across two files. Credential loading and translation to Esperanto configuration happen in [`open_notebook/domain/credential.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/domain/credential.py) via `to_esperanto_config()`. The environment-variable fallback is implemented in [`open_notebook/ai/key_provider.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/ai/key_provider.py) via `provision_provider_keys()`. `ModelManager` orchestrates both paths in [`open_notebook/ai/models.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/ai/models.py).

### Does ModelManager cache model instances itself?

No. `ModelManager` relies on Esperanto's internal caching mechanism. After the first instantiation through `AIFactory` in [`open_notebook/ai/provision.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/ai/provision.py), subsequent requests for the same model reuse the cached object instead of building a new provider client. The lookup of the `Model` record that precedes instantiation is a separate step and is not covered by that cache.