# How ModelManager Selects AI Models with Large Context Windows (>105K Tokens) in Open Notebook

> Learn how Open Notebook automatically selects AI models with large context windows over 105K tokens. Discover how ModelManager routes requests efficiently.

- Repository: [Luis Novo/open-notebook](https://github.com/lfnovo/open-notebook)
- Tags: deep-dive
- Published: 2026-06-06

---

**Open Notebook automatically detects requests exceeding 105,000 tokens and routes them to a configured large-context AI model by checking the token count in `provision_langchain_model` before querying `ModelManager` for the specialized default.**

In the `lfnovo/open-notebook` repository, oversized inputs are handled by a token-aware provisioning pipeline. The provisioning layer compares the token count against a fixed threshold, and `ModelManager` then resolves the large-context model from configuration rather than from hard-coded model names. Administrators can therefore assign any compatible model without modifying application code.

## Token Detection: The 105,000 Token Threshold

The detection logic begins in [`open_notebook/ai/provision.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/ai/provision.py), where the `provision_langchain_model` function measures incoming content before selecting an AI model. The system uses the `token_count` utility from [`open_notebook/utils/token_utils.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/utils/token_utils.py) to quantify the request size.

When the token count exceeds **105,000**, the function immediately switches to the large-context selection path:

```python
# open_notebook/ai/provision.py (lines 19-30)
tokens = token_count(content)          # count tokens in the request

if tokens > 105_000:                   # > 105k tokens → large-context path
    selection_reason = f"large_context (content has {tokens} tokens)"
    model = await model_manager.get_default_model("large_context", **kwargs)
elif model_id:                         # explicit model requested by the user
    ...
else:                                  # fall back to the normal default for the type
    ...
```

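The threshold is a literal in the code and the comparison is strict, so content of exactly 105,000 tokens stays on the normal path. Because this check comes before the `model_id` branch, content above the threshold routes to the large-context default even when the caller requested a specific model.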
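
The branch order in the excerpt can be sketched in isolation. The snippet below is a minimal illustration rather than the project's code: `choose_route` and `LARGE_CONTEXT_THRESHOLD` are hypothetical names, and the token count is passed in directly instead of being computed by `token_count`.

```python
from typing import Optional

# Hypothetical constant mirroring the literal 105_000 in provision.py
LARGE_CONTEXT_THRESHOLD = 105_000


def choose_route(tokens: int, model_id: Optional[str] = None) -> str:
    """Return the selection path a request would take (illustrative only)."""
    if tokens > LARGE_CONTEXT_THRESHOLD:
        # Checked first, so it wins even when a model_id is supplied
        return "large_context"
    if model_id:
        return "explicit"
    return "default"


print(choose_route(200_000, model_id="my-model"))  # large_context
print(choose_route(105_000))                       # default (comparison is strict)
print(choose_route(5_000, model_id="my-model"))    # explicit
```

The first call shows the practical consequence of the ordering: an explicit `model_id` is ignored once the content crosses the threshold.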

## ModelManager Selection Logic

When the token threshold is breached, `ModelManager` retrieves the appropriate model ID from the database configuration. The selection logic resides in [`open_notebook/ai/models.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/ai/models.py), specifically within the `get_default_model` method.

The `DefaultModels` class stores the large-context model identifier as an optional field:

```python
# open_notebook/ai/models.py (line 66)
large_context_model: Optional[str] = None
```

During runtime, `ModelManager` handles the `"large_context"` model type by reading this field and constructing the corresponding Esperanto language model:

```python
# open_notebook/ai/models.py (lines 246-250)
elif model_type == "large_context":
    model_id = defaults.large_context_model
...
return await self.get_model(model_id, **kwargs)
```

The actual model instantiation is delegated to `get_model`, which creates a cached Esperanto LLM instance ready for LangChain integration.
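
To make the dispatch concrete, here is a self-contained sketch of the same pattern. It is not the project's implementation: `DefaultModelsSketch` and `ModelManagerSketch` are hypothetical stand-ins, the `default_chat_model` field name is an assumption, and a plain dict takes the place of an Esperanto model.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class DefaultModelsSketch:
    """Hypothetical stand-in for DefaultModels; only two fields shown."""
    default_chat_model: Optional[str] = None   # assumed field name
    large_context_model: Optional[str] = None  # field named in the article


class ModelManagerSketch:
    """Hypothetical stand-in for ModelManager."""

    def __init__(self, defaults: DefaultModelsSketch) -> None:
        self.defaults = defaults

    async def get_model(self, model_id: str, **kwargs) -> dict:
        # The real method builds an Esperanto model; a dict stands in here
        return {"id": model_id, **kwargs}

    async def get_default_model(self, model_type: str, **kwargs) -> Optional[dict]:
        if model_type == "chat":
            model_id = self.defaults.default_chat_model
        elif model_type == "large_context":
            model_id = self.defaults.large_context_model
        else:
            model_id = None
        if not model_id:
            return None  # nothing configured for this type
        return await self.get_model(model_id, **kwargs)


manager = ModelManagerSketch(DefaultModelsSketch(large_context_model="model-xyz"))
resolved = asyncio.run(manager.get_default_model("large_context", temperature=0.2))
print(resolved)  # {'id': 'model-xyz', 'temperature': 0.2}
```

The model name only ever appears in configuration data, which is what lets the large-context handler be swapped without touching the dispatch code.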

## Error Handling and Fallback Behavior

If no large-context model has been configured in `DefaultModels`, the system logs a warning and ultimately raises a `ConfigurationError`. This prevents the application from attempting to process oversized content with models that lack sufficient context windows. The error originates in `provision_langchain_model` when it cannot retrieve a valid model for the large-context type.
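
The failure path can be illustrated with a small sketch. Everything here is an assumption made for the example: `ConfigurationError` is redefined locally, `lookup_large_context_model` and `provision_sketch` are hypothetical helpers, and the error message is invented rather than copied from the project.

```python
import asyncio
from typing import Optional


class ConfigurationError(Exception):
    """Local stand-in for the project's ConfigurationError."""


async def lookup_large_context_model(configured_id: Optional[str]) -> Optional[str]:
    # Hypothetical helper: mimics a default lookup that yields None when unset
    return configured_id


async def provision_sketch(configured_id: Optional[str]) -> str:
    model = await lookup_large_context_model(configured_id)
    if model is None:
        # Fail loudly instead of sending oversized content to a smaller model
        raise ConfigurationError("No model configured for type 'large_context'")
    return model


try:
    asyncio.run(provision_sketch(None))
except ConfigurationError as exc:
    print(f"ConfigurationError: {exc}")
```

Callers that may receive very long inputs can catch this exception and prompt the administrator to set a large-context default.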

## Configuring the Large-Context Model

Administrators define the large-context model through the database-backed `DefaultModels` configuration. Unlike hard-coded selections, this approach allows any model supporting extended context windows to serve as the handler without code changes.

You can configure the large-context default via the REST API exposed in [`api/routers/models.py`](https://github.com/lfnovo/open-notebook/blob/main/api/routers/models.py):

```bash
curl -X PATCH http://localhost:5055/models/defaults \
  -H "Content-Type: application/json" \
  -d '{"large_context_model": "model-xyz"}'
```

After configuration, any request exceeding 105,000 tokens automatically routes to `"model-xyz"`.
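
The same request can be built from Python using only the standard library. This sketch assumes the endpoint, port, and payload shown in the curl example; `build_defaults_request` is a hypothetical helper, and the request is constructed without being sent so the snippet runs without a live server.

```python
import json
import urllib.request


def build_defaults_request(
    model_id: str, base_url: str = "http://localhost:5055"
) -> urllib.request.Request:
    """Build (but do not send) the PATCH request from the curl example."""
    payload = json.dumps({"large_context_model": model_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/models/defaults",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="PATCH",
    )


request = build_defaults_request("model-xyz")
print(request.get_method(), request.full_url)

# To send it against a running instance:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```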

## Practical Implementation Examples

### Example 1: Manual Provisioning Calls

To leverage the automatic selection in your own code, call `provision_langchain_model` with content and let the token detection handle model selection:

```python
from open_notebook.ai.provision import provision_langchain_model


async def build_model(text: str):
    # `text` holds a long article; content above 105,000 tokens
    # is routed to the large-context default automatically.
    model = await provision_langchain_model(
        content=text,
        model_id=None,          # no explicit model requested
        default_type="chat",    # normal fallback would be the chat default
        temperature=0.2,
    )
    # `model` is a LangChain-compatible LLM ready for use
    return model
```

### Example 2: Direct Access to Large-Context Defaults

For cases where you need to verify or manually retrieve the large-context configuration:

```python
from open_notebook.ai.models import model_manager


async def get_large_context_model():
    # Returns a LanguageModel, or None if no large-context default is configured
    large_ctx = await model_manager.get_default_model("large_context")
    return large_ctx
```

## Summary

- **Token Threshold**: Open Notebook uses a fixed **105,000-token** threshold in `provision_langchain_model` to trigger large-context routing.
- **Dynamic Selection**: `ModelManager` selects AI models by reading the `large_context_model` field from `DefaultModels`, avoiding hard-coded model names.
- **Configurable**: Administrators set the large-context model ID via the API, allowing any high-capacity model to handle oversized requests.
- **Explicit Failure**: If content exceeds the threshold and no large-context model is configured, the system raises `ConfigurationError` rather than sending the content to a model with an insufficient context window.

## Frequently Asked Questions

### What happens if no large-context model is configured?

If `DefaultModels.large_context_model` is `None` when a request exceeds 105,000 tokens, the system logs a warning and raises a `ConfigurationError`. This prevents processing failures that would occur if a standard-context model attempted to handle the oversized content.

### Can I change the 105,000 token threshold?

According to the source code in [`open_notebook/ai/provision.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/ai/provision.py), the threshold is hard-coded at **105_000** tokens. To modify this limit, you would need to edit the source code in the provisioning layer, as there is no runtime configuration parameter exposed for this value.
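
If you maintain a fork, one way to make the limit adjustable is to read it from the environment. The sketch below is a hypothetical local patch, not something Open Notebook supports: the `LARGE_CONTEXT_THRESHOLD` variable and the `large_context_threshold` function are invented for illustration.

```python
import os

DEFAULT_THRESHOLD = 105_000  # the value currently hard-coded in provision.py


def large_context_threshold() -> int:
    """Read a hypothetical LARGE_CONTEXT_THRESHOLD env var, else use the default."""
    raw = os.environ.get("LARGE_CONTEXT_THRESHOLD", "")
    try:
        value = int(raw)
    except ValueError:
        # Unset or non-numeric values fall back to the built-in default
        return DEFAULT_THRESHOLD
    return value if value > 0 else DEFAULT_THRESHOLD
```

The comparison in the provisioning function would then become `tokens > large_context_threshold()`, keeping the existing behavior whenever the variable is unset or invalid.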

### How does ModelManager cache the selected models?

`ModelManager.get_model` creates and caches Esperanto language model instances internally. Once retrieved from `DefaultModels.large_context_model` and instantiated, the model remains available for subsequent requests, reducing initialization overhead for repeated large-context operations.
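
The general idea of instance caching can be sketched as follows. This is an assumption-laden illustration, not the project's cache: `CachingManagerSketch` is hypothetical, it keys the cache on the model ID alone, and a bare `object()` stands in for an Esperanto model.

```python
import asyncio
from typing import Dict


class CachingManagerSketch:
    """Hypothetical manager that builds each model at most once."""

    def __init__(self) -> None:
        self._cache: Dict[str, object] = {}
        self.builds = 0  # counts how many times a model was constructed

    async def get_model(self, model_id: str) -> object:
        if model_id not in self._cache:
            self.builds += 1
            self._cache[model_id] = object()  # stand-in for an Esperanto model
        return self._cache[model_id]


async def demo() -> bool:
    manager = CachingManagerSketch()
    first = await manager.get_model("model-xyz")
    second = await manager.get_model("model-xyz")
    # The second call reuses the instance created by the first
    return first is second and manager.builds == 1


print(asyncio.run(demo()))  # True
```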

### Where is the token counting utility implemented?

The `token_count` function resides in [`open_notebook/utils/token_utils.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/utils/token_utils.py). This utility converts raw content strings into token quantities using the appropriate tokenizer for the target model family, ensuring accurate measurements before the 105,000-token comparison occurs.