# How ModelManager Provisions Language, Embedding, and Speech Models in Open-Notebook

> Discover how ModelManager provisions language, embedding, and speech models in Open-Notebook. Learn to transform database records into live AI objects, resolving credentials and normalizing providers.

- Repository: [Luis Novo/open-notebook](https://github.com/lfnovo/open-notebook)
- Tags: how-to-guide
- Published: 2026-06-07

---

**ModelManager transforms database records from the `model` table into live AI objects by resolving credentials from either stored records or environment variables, normalizing provider names, and delegating construction to Esperanto's AIFactory.**

Open-Notebook stores AI model definitions in SurrealDB and uses the `ModelManager` class to bridge static configuration with runtime execution. This architecture separates provider credentials and model metadata from application logic, enabling dynamic provisioning of language, embedding, and speech-to-text capabilities.

## The ModelManager Provisioning Flow

The provisioning process in [`open_notebook/ai/models.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/ai/models.py) follows a five-step pipeline that converts database records into ready-to-use model instances.

### Step 1: Database Record Retrieval

Provisioning begins when `ModelManager.get_model()` fetches the model record by its SurrealDB identifier. The method queries the `model` table via `await Model.get(model_id)` to retrieve the provider configuration, model type, and optional credential reference.

### Step 2: Type Validation and Credential Resolution

The manager validates that the record's `type` field contains one of the supported values: `language`, `embedding`, `speech_to_text`, or `text_to_speech`. Credentials resolve through one of two paths:

**Primary path:** If the model references a `Credential` record, `model.get_credential_obj()` returns the stored credential, which converts to Esperanto configuration via `credential.to_esperanto_config()`.

**Fallback path:** When credentials are missing or invalid, `ModelManager` invokes `provision_provider_keys(model.provider)` from [`open_notebook/ai/key_provider.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/ai/key_provider.py). This function populates `os.environ` with API keys stored in the database, allowing provider libraries to authenticate through standard environment variables.
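The two paths can be sketched in isolation as follows. This is a minimal illustration, not the project's code: `FakeCredential`, `FakeModel`, `resolve_config`, and the `provision` callback are hypothetical stand-ins for the real `Credential`, `Model`, and `provision_provider_keys`, and the real methods may be asynchronous.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class FakeCredential:
    """Hypothetical stand-in for a stored Credential record."""
    api_key: str

    def to_esperanto_config(self) -> dict:
        return {"api_key": self.api_key}


@dataclass
class FakeModel:
    """Hypothetical stand-in for a Model record."""
    provider: str
    credential: Optional[FakeCredential] = None

    def get_credential_obj(self) -> Optional[FakeCredential]:
        return self.credential


def resolve_config(model: FakeModel, provision: Callable[[str], None]) -> dict:
    """Return the credential config, or trigger the environment fallback."""
    credential = model.get_credential_obj()
    if credential is not None:
        # Primary path: use the stored credential directly.
        return credential.to_esperanto_config()
    # Fallback path: the provisioning hook is expected to populate os.environ.
    provision(model.provider)
    return {}


provisioned = []  # records which providers hit the fallback path
with_cred = resolve_config(FakeModel("openai", FakeCredential("sk-test")), provisioned.append)
without_cred = resolve_config(FakeModel("openai"), provisioned.append)
```

In this sketch the fallback returns an empty configuration because authentication is expected to happen through environment variables rather than explicit config values.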

### Step 3: Configuration Merging

Any runtime parameters passed to `get_model()` (such as `temperature`, `max_tokens`, or `model_name`) merge into the configuration dictionary, overriding stored defaults while preserving provider-specific settings.
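A merge with these semantics can be illustrated with plain dictionaries. The helper name `merge_config` and the sample keys are assumptions made for this example; the actual merge logic inside `get_model()` may differ in detail.

```python
def merge_config(stored: dict, **overrides) -> dict:
    """Return a new dict in which runtime overrides win over stored values."""
    merged = dict(stored)      # copy so the stored defaults stay untouched
    merged.update(overrides)   # runtime keyword arguments take precedence
    return merged


# Hypothetical stored settings: one default and one provider-specific value.
stored = {"temperature": 0.2, "base_url": "https://example.invalid/v1"}
merged = merge_config(stored, temperature=0.7, max_tokens=1024)
```

Here `temperature` is overridden, `max_tokens` is added, and the provider-specific `base_url` passes through unchanged.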

### Step 4: Provider Name Normalization

Database entries store provider names with underscores (e.g., `openai_azure`), but Esperanto expects hyphenated formats. The manager applies `model.provider.replace("_", "-")` to ensure compatibility before factory invocation.
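The normalization itself is a single string replacement. The wrapper function name below is hypothetical and exists only to make the behavior easy to check.

```python
def normalize_provider(name: str) -> str:
    """Convert a stored provider name to the hyphenated form."""
    return name.replace("_", "-")


azure = normalize_provider("openai_azure")
plain = normalize_provider("openai")  # names without underscores are unchanged
```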

### Step 5: Factory Instantiation

Depending on the validated `model.type`, `ModelManager` delegates to the appropriate `AIFactory` method in Esperanto:

- **Language models:** `AIFactory.create_language()` returns a `LanguageModel` instance.
- **Embedding models:** `AIFactory.create_embedding()` yields an `EmbeddingModel`.
- **Speech-to-text:** `AIFactory.create_speech_to_text()` produces a `SpeechToTextModel`.
- **Text-to-speech:** `AIFactory.create_text_to_speech()` generates a `TextToSpeechModel`.

The factory caches underlying provider clients, making subsequent instantiations lightweight.
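One way to express this kind of type-based dispatch is a lookup table from type string to factory method. The sketch below uses a hypothetical `StubFactory` instead of Esperanto's real `AIFactory`; only the four method names are taken from the list above, while the `(provider, model_name, config)` argument shape and the `build_model` helper are assumptions for illustration.

```python
class StubFactory:
    """Hypothetical stand-in; each method returns a tuple instead of a model."""

    @staticmethod
    def create_language(provider, model_name, config=None):
        return ("language", provider, model_name, config or {})

    @staticmethod
    def create_embedding(provider, model_name, config=None):
        return ("embedding", provider, model_name, config or {})

    @staticmethod
    def create_speech_to_text(provider, model_name, config=None):
        return ("speech_to_text", provider, model_name, config or {})

    @staticmethod
    def create_text_to_speech(provider, model_name, config=None):
        return ("text_to_speech", provider, model_name, config or {})


def build_model(factory, model_type, provider, model_name, config):
    """Pick the factory method that matches the validated model type."""
    creators = {
        "language": factory.create_language,
        "embedding": factory.create_embedding,
        "speech_to_text": factory.create_speech_to_text,
        "text_to_speech": factory.create_text_to_speech,
    }
    if model_type not in creators:
        raise ValueError(f"Unsupported model type: {model_type}")
    return creators[model_type](provider, model_name, config)


built = build_model(StubFactory, "embedding", "openai-azure", "example-model", {})
```

A lookup table keeps the set of supported types in one place, so validation and dispatch cannot drift apart.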

## Helper Methods for Default Models

Beyond explicit model IDs, `ModelManager` provides convenience accessors that resolve default configurations from the `DefaultModels` record. These methods include:

- `get_embedding_model()` – Returns the system default embedding model.
- `get_speech_to_text()` – Retrieves the configured STT model.
- `get_text_to_speech()` – Fetches the TTS model.
- `get_default_model()` – Loads default language models for chat or completion tasks.

Each helper reads the corresponding `default_*_model` field from the database and delegates to `get_model()` with appropriate type checking.
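The helper pattern described here (read a default field, load the model, check its type) might look roughly like the sketch below. Everything in it is illustrative: the in-memory `DEFAULTS` and `MODELS` dictionaries stand in for database records, and the field name `default_embedding_model` is an assumed instance of the `default_*_model` pattern.

```python
import asyncio

# Hypothetical in-memory stand-ins for the DefaultModels and model records.
DEFAULTS = {"default_embedding_model": "model:emb-1"}
MODELS = {"model:emb-1": {"type": "embedding", "name": "example-embedding"}}


async def get_model(model_id: str) -> dict:
    """Stand-in for the real lookup; returns the stored record."""
    return MODELS[model_id]


async def get_default(field: str, expected_type: str):
    """Resolve a default model by field name and verify its type."""
    model_id = DEFAULTS.get(field)
    if model_id is None:
        return None  # no default configured for this field
    model = await get_model(model_id)
    if model["type"] != expected_type:
        raise TypeError(f"{field} is not a {expected_type} model")
    return model


embedding = asyncio.run(get_default("default_embedding_model", "embedding"))
missing = asyncio.run(get_default("default_chat_model", "language"))
```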

## Practical Implementation Examples

### Provisioning a Specific Language Model

```python
from open_notebook.ai.models import ModelManager

async def load_gpt4():
    manager = ModelManager()
    # Load by SurrealDB ID with runtime overrides
    llm = await manager.get_model(
        "model:openai:gpt-4",
        temperature=0.7,
        max_tokens=1024,
    )
    # llm is an Esperanto LanguageModel instance
    return llm
```

### Using Default Embedding Models

```python
from open_notebook.ai.models import ModelManager

async def get_embedding():
    manager = ModelManager()
    embedding = await manager.get_embedding_model()
    # embedding is an Esperanto EmbeddingModel ready for .embed() calls
    return embedding
```

### Fallback Credential Provisioning

```python
from open_notebook.ai.models import ModelManager

async def use_openai_chat():
    manager = ModelManager()
    # Automatically provisions env vars if no credential record exists
    chat = await manager.get_default_model("chat")
    return chat
```

### Speech-to-Text Transcription

```python
from open_notebook.ai.models import ModelManager

async def transcribe(audio_bytes):
    manager = ModelManager()
    stt = await manager.get_speech_to_text()
    text = await stt.transcribe(audio_bytes)
    return text
```

## Summary

- **ModelManager** in [`open_notebook/ai/models.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/ai/models.py) orchestrates the transformation of database model records into executable AI objects.
- The provisioning flow validates model types, resolves credentials via `get_credential_obj()` or falls back to `provision_provider_keys()` in [`open_notebook/ai/key_provider.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/ai/key_provider.py).
- Provider names undergo normalization (underscores to hyphens) before Esperanto factory invocation.
- **AIFactory** methods create concrete instances for language, embedding, speech-to-text, and text-to-speech models with client caching for performance.
- Convenience methods abstract default model resolution from the `DefaultModels` configuration.

## Frequently Asked Questions

### What happens if a model record lacks a credential?

When `model.get_credential_obj()` returns `None` or fails, `ModelManager` automatically calls `provision_provider_keys(model.provider)` from [`open_notebook/ai/key_provider.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/ai/key_provider.py). This function injects API keys from the database into `os.environ`, allowing the underlying provider library to authenticate via standard environment variable lookups.

### How does ModelManager handle provider name formatting differences?

Database records store provider names using underscores (such as `openai_azure`), while the Esperanto library expects hyphenated formats (`openai-azure`). During provisioning, `ModelManager` applies `model.provider.replace("_", "-")` to normalize the string before passing it to `AIFactory` methods, ensuring compatibility across storage and runtime conventions.

### Can I override model parameters when requesting an instance?

Yes. The `get_model()` method accepts arbitrary keyword arguments that merge into the configuration dictionary. You can pass runtime-specific values like `temperature`, `max_tokens`, or `model_name` to override stored defaults without modifying the database record, enabling dynamic behavior adjustments per invocation.

### What is the relationship between ModelManager and Esperanto?

ModelManager serves as the Open-Notebook-specific abstraction layer that handles database retrieval, credential resolution, and configuration preparation. It delegates actual model construction to **Esperanto's** `AIFactory`, which manages provider-specific client initialization and caching. This separation allows Open-Notebook to maintain database-first configuration while leveraging Esperanto's unified interface for language, embedding, and speech operations.