# How the `token_count()` Function Estimates Token Usage and Detects Large Contexts in Open Notebook

> Learn how Open Notebook's `token_count()` function estimates token usage with tiktoken or a word-count fallback, and how the result triggers automatic selection of a large-context model when content exceeds 105,000 tokens.

- Repository: [Luis Novo/open-notebook](https://github.com/lfnovo/open-notebook)
- Tags: how-to-guide
- Published: 2026-06-06

---

**The `token_count()` function in Open Notebook estimates token consumption by first attempting exact tokenization via tiktoken's "o200k_base" encoder, falling back to a word-count heuristic multiplied by 1.3 when unavailable, and automatically triggers large-context model selection when content exceeds 105,000 tokens.**

Open Notebook relies on token estimation to manage language model costs and prevent context window overflow. The `token_count()` function serves as the central utility for this measurement across the codebase, and its counts drive everything from model selection to text chunking strategies. With tiktoken installed, the count is deterministic for a given input; the fallback path produces an approximation instead.

## Implementation Details of token_count()

The core implementation resides in **[`open_notebook/utils/token_utils.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/utils/token_utils.py)**. This utility provides a robust two-tier approach to token estimation.

### Exact Tokenization with tiktoken

When available, the function imports **tiktoken** and instantiates the *"o200k_base"* encoding. This yields an exact token count under that encoding, which OpenAI uses for its GPT-4o model family. For models from other providers, which use their own tokenizers, the result is a close estimate rather than an exact figure. The encoding is deterministic, so repeated calls on the same string return the same count.

### Fallback Heuristic for Offline Environments

If tiktoken cannot be imported, for example because the dependency is missing in an offline environment, the function degrades gracefully to a **word-count estimate multiplied by 1.3**. This heuristic approximates the typical token-to-word ratio of English text, providing reasonable estimates when an exact tokenizer is unavailable.
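The two tiers described above can be sketched as follows. This is a minimal illustration, not the project's actual source: the names `count_tokens` and `estimate_by_words` are hypothetical, and truncating the scaled word count with `int()` is an assumption about how the result is rounded.

```python
def estimate_by_words(text: str) -> int:
    """Fallback tier: whitespace-separated word count scaled by 1.3.

    Truncation via int() is an assumption, not confirmed from the source.
    """
    return int(len(text.split()) * 1.3)


def count_tokens(text: str) -> int:
    """Hypothetical stand-in for token_count(), for illustration only."""
    try:
        import tiktoken  # optional dependency
    except ImportError:
        # tiktoken is not installed: use the word-count heuristic
        return estimate_by_words(text)
    encoding = tiktoken.get_encoding("o200k_base")
    return len(encoding.encode(text))
```

Keeping the heuristic in its own function makes the fallback easy to test in isolation, regardless of whether tiktoken is installed.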

### Cost Estimation Integration

The same module exposes **`token_cost()`**, which converts the output of `token_count()` into estimated monetary costs. This allows the system to log and budget API expenses based on actual token consumption rather than rough character counts.
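As a rough illustration of that conversion, the sketch below turns a token count into a dollar estimate. The function name `estimate_cost` and its `usd_per_million_tokens` parameter are hypothetical and do not reflect the actual signature of `token_cost()`; real prices vary by provider and model.

```python
def estimate_cost(tokens: int, usd_per_million_tokens: float) -> float:
    """Convert a token count into an estimated cost in US dollars.

    Hypothetical helper: the price is supplied by the caller because
    it depends on the provider and model in use.
    """
    return tokens / 1_000_000 * usd_per_million_tokens


# Example: 500,000 tokens at an assumed price of $2.00 per million tokens
cost = estimate_cost(500_000, 2.0)
```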

## Large-Context Detection Logic

The primary production use of `token_count()` occurs in **[`open_notebook/ai/provision.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/ai/provision.py)**. Here, the function drives intelligent model routing based on content size.

When `provision_langchain_model()` receives content for processing, it immediately invokes `token_count(content)` at line 19. If the returned value exceeds **105,000 tokens**, the system logs a warning at line 26 and automatically selects the large-context model via `model_manager.get_default_model("large_context")`. This threshold prevents truncation errors by ensuring the system switches to models capable of handling extended prompts before processing begins.
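The routing decision reduces to a single comparison. The sketch below is a simplified illustration under stated assumptions: `choose_model_type` and `LARGE_CONTEXT_THRESHOLD` are hypothetical names, and the handling of an explicit `model_id` is omitted.

```python
LARGE_CONTEXT_THRESHOLD = 105_000  # threshold stated in this article


def choose_model_type(tokens: int, default_type: str) -> str:
    """Return the model type to request for a given token count.

    Hypothetical helper: content strictly above the threshold is routed
    to the "large_context" default; anything else keeps the caller's type.
    """
    if tokens > LARGE_CONTEXT_THRESHOLD:
        return "large_context"
    return default_type
```

Because the comparison is strict, content of exactly 105,000 tokens still uses the caller's default model type.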

## Token Counting Across the Codebase

Beyond model provisioning, `token_count()` maintains consistent size measurements throughout the data pipeline.

### Text Chunking

In **[`open_notebook/utils/chunking.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/utils/chunking.py)**, the chunker repeatedly queries `token_count()` to split documents into pieces that respect configured token budgets such as `CHUNK_SIZE`. This ensures no chunk exceeds the embedding model's input limits.
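One simple way to split text against a token budget is greedy packing: keep adding words to the current chunk until the next word would push it over the limit. The sketch below illustrates that idea only; it is not the algorithm used in `chunking.py`, and `chunk_by_budget` is a hypothetical name. The counter is passed in as a parameter so that any token-counting function can be used.

```python
from typing import Callable, List


def chunk_by_budget(
    text: str, max_tokens: int, count: Callable[[str], int]
) -> List[str]:
    """Greedily pack whitespace-separated words into chunks within a budget.

    A single word that exceeds the budget on its own still becomes
    its own (oversized) chunk, since it cannot be split further here.
    """
    chunks: List[str] = []
    current: List[str] = []
    for word in text.split():
        candidate = " ".join(current + [word])
        if current and count(candidate) > max_tokens:
            # Adding this word would exceed the budget: close the chunk
            chunks.append(" ".join(current))
            current = [word]
        else:
            current.append(word)
    if current:
        chunks.append(" ".join(current))
    return chunks


# Usage with a trivial word-based counter standing in for token_count()
pieces = chunk_by_budget("a b c d e", 2, lambda s: len(s.split()))
```

Re-counting the candidate on every word is simple but slow for long documents; a production chunker would more likely split on sentence or paragraph boundaries and count less often.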

### Embedding Pipeline

The embedding utility in **[`open_notebook/utils/embedding.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/utils/embedding.py)** records token sizes for each text chunk before vectorization. These measurements inform decisions about whether to embed content directly or require pre-splitting, optimizing both API usage and storage.

### Context Building

**[`open_notebook/utils/context_builder.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/utils/context_builder.py)** employs lazy evaluation, computing token counts only when not explicitly provided. This enables downstream components to perform size checks without redundant re-tokenization of previously processed documents.
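The compute-only-if-missing pattern can be sketched with a small dataclass. This is an assumption-laden illustration: `ContextItem` and `rough_token_count` are hypothetical names, and the word-count heuristic stands in for the real `token_count()` so the example has no external dependencies.

```python
from dataclasses import dataclass
from typing import Optional


def rough_token_count(text: str) -> int:
    """Stand-in counter using the 1.3x word-count heuristic."""
    return int(len(text.split()) * 1.3)


@dataclass
class ContextItem:
    """Hypothetical context entry that carries its own token count."""

    content: str
    token_count: Optional[int] = None

    def __post_init__(self) -> None:
        # Count only when the caller did not supply a value
        if self.token_count is None:
            self.token_count = rough_token_count(self.content)


computed = ContextItem("one two three four five six seven eight nine ten")
supplied = ContextItem("already measured upstream", token_count=42)
```

Here `supplied` keeps the count it was given, so content measured earlier in the pipeline is not tokenized a second time.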

## Practical Code Examples

The following examples demonstrate common usage patterns for the `token_count()` function.

Direct token counting:

```python
from open_notebook.utils.token_utils import token_count

text = "Hello, world! This is a short test."
print(token_count(text))  # prints the token count as an integer
```

Large-context model selection:

```python
from open_notebook.ai.provision import provision_langchain_model

async def get_model_for_content(content):
    # Automatically selects the large-context model if content exceeds 105k tokens
    return await provision_langchain_model(
        content,
        model_id=None,
        default_type="chat",
    )
```

Chunking by token budget:

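```python
from open_notebook.utils.chunking import chunk_text
from open_notebook.utils.token_utils import token_count

MAX_TOKENS = 2_000
# Placeholder input; substitute your own document text
long_document = "Open Notebook keeps research notes searchable. " * 2_000

# NOTE: the max_tokens keyword is illustrative; check chunk_text's actual signature
chunks = chunk_text(long_document, max_tokens=MAX_TOKENS)

# Verify all chunks respect the limit
assert all(token_count(c) <= MAX_TOKENS for c in chunks)
```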

## Summary

- The `token_count()` function resides in [`open_notebook/utils/token_utils.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/utils/token_utils.py) and serves as the single source of truth for token estimation across the Open Notebook codebase.
- It prioritizes exact counts via tiktoken's "o200k_base" encoding, falling back to a word-count heuristic (×1.3) when the library is unavailable.
- Large-context detection occurs in [`open_notebook/ai/provision.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/ai/provision.py), where content exceeding 105,000 tokens triggers automatic selection of a high-capacity model.
- The utility supports downstream operations including text chunking, embedding size validation, and cost estimation through `token_cost()`.

## Frequently Asked Questions

### What encoding does the token_count() function use?

The function attempts to use **tiktoken's "o200k_base" encoding** first, which corresponds to modern OpenAI model tokenization schemes. If tiktoken is not installed, it falls back to a statistical estimate based on word count.

### What happens if tiktoken is not installed?

When tiktoken is unavailable, the function calculates a **word count and multiplies by 1.3** to approximate token density. This heuristic provides reasonable estimates for English text without requiring external dependencies.

### Why is the large-context threshold set to 105,000 tokens?

The **105,000 token threshold** in [`open_notebook/ai/provision.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/ai/provision.py) acts as a safety buffer below common large-context model limits (such as 128k or 200k contexts). Against a 128,000-token window, for example, it leaves about 23,000 tokens of headroom for system prompts and response generation, which helps prevent truncation errors.

### How accurate is the fallback word-count heuristic?

The **1.3× multiplier** statistically approximates the average ratio of tokens to words in English prose. While less precise than tiktoken for code or non-English text, it provides sufficient accuracy for chunking decisions and rough cost estimation when exact tokenizers are unavailable.