How the `token_count()` Function Estimates Token Usage and Detects Large Contexts in Open Notebook

The token_count() function in Open Notebook estimates token consumption by first attempting exact tokenization with tiktoken's "o200k_base" encoding and falling back to a word count multiplied by 1.3 when tiktoken is unavailable. When the count for a piece of content exceeds 105,000 tokens, the provisioning code that calls token_count() automatically selects a large-context model.

Open Notebook relies on precise token estimation to manage language model costs and prevent context window overflow. The token_count() function serves as the central utility for this measurement across the entire codebase, providing deterministic token counts that drive everything from model selection to text chunking strategies.

Implementation Details of token_count()

The core implementation resides in open_notebook/utils/token_utils.py. This utility provides a robust two-tier approach to token estimation.

Exact Tokenization with tiktoken

When tiktoken is available, the function imports it and loads the "o200k_base" encoding. This yields an exact token count for models that use that encoding, such as OpenAI's GPT-4o family; for models with a different tokenizer the figure is an approximation. The output is deterministic, so repeated calls on the same string return the same count.

Fallback Heuristic for Offline Environments

If tiktoken cannot be imported, for example because the dependency is missing or the environment is offline, the function degrades gracefully to a word count multiplied by 1.3. The multiplier approximates the average number of tokens per word in English prose, so it gives reasonable estimates when an exact tokenizer is unavailable.
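The two tiers described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual source: the names count_tokens_sketch and word_count_estimate are hypothetical, truncating the estimate with int() is an assumption, and catching a broad Exception (so that a failed encoding download also triggers the fallback) is an assumption as well.

```python
def word_count_estimate(text: str, ratio: float = 1.3) -> int:
    """Fallback tier: whitespace-separated word count scaled by ~1.3 tokens per word."""
    return int(len(text.split()) * ratio)


def count_tokens_sketch(text: str, encoding_name: str = "o200k_base") -> int:
    """Exact tier first (tiktoken), word-count heuristic if that fails."""
    try:
        import tiktoken  # optional dependency, imported lazily

        encoding = tiktoken.get_encoding(encoding_name)
        return len(encoding.encode(text))
    except Exception:
        # Assumption: any failure (package missing, encoding file not
        # available) should degrade to the estimate instead of raising.
        return word_count_estimate(text)
```

Importing tiktoken inside the function keeps the dependency optional: a caller without the package installed still gets a usable number from the heuristic.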

Cost Estimation Integration

The same module exposes token_cost(), which converts the output of token_count() into estimated monetary costs. This allows the system to log and budget API expenses based on actual token consumption rather than rough character counts.
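As a rough illustration of how a token count becomes a cost figure, consider the sketch below. The signature of the real token_cost() is not shown here, so estimate_cost is a hypothetical stand-in, and the per-million-token rate is a placeholder value rather than any provider's actual price.

```python
def estimate_cost(token_total: int, usd_per_million_tokens: float) -> float:
    """Convert a token count into an estimated cost in US dollars."""
    return token_total / 1_000_000 * usd_per_million_tokens


# Placeholder rate, chosen only for illustration.
HYPOTHETICAL_RATE = 2.00
print(f"${estimate_cost(250_000, HYPOTHETICAL_RATE):.2f}")
```

Because the cost is linear in the token count, any error in the count (for example, from the word-based fallback) carries through proportionally to the estimate.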

Large-Context Detection Logic

The primary production use of token_count() occurs in open_notebook/ai/provision.py. Here, the function drives intelligent model routing based on content size.

When provision_langchain_model() receives content for processing, it immediately invokes token_count(content) at line 19. If the returned value exceeds 105,000 tokens, the system logs a warning at line 26 and automatically selects the large-context model via model_manager.get_default_model("large_context"). This threshold prevents truncation errors by ensuring the system switches to models capable of handling extended prompts before processing begins.
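The routing decision reduces to a single threshold comparison. The sketch below isolates that comparison under stated assumptions: pick_model_type and LARGE_CONTEXT_THRESHOLD are hypothetical names, and the real function goes on to resolve an actual model object through model_manager rather than returning a string.

```python
LARGE_CONTEXT_THRESHOLD = 105_000  # tokens, as described above


def pick_model_type(token_total: int, default_type: str = "chat") -> str:
    """Return "large_context" only when the count exceeds the threshold."""
    if token_total > LARGE_CONTEXT_THRESHOLD:
        return "large_context"
    return default_type
```

The comparison is strict, matching the description that content must exceed 105,000 tokens: a count of exactly 105,000 keeps the default model type.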

Token Counting Across the Codebase

Beyond model provisioning, token_count() maintains consistent size measurements throughout the data pipeline.

Text Chunking

In open_notebook/utils/chunking.py, the chunker repeatedly queries token_count() to split documents into pieces that respect configured token budgets such as CHUNK_SIZE. This ensures no chunk exceeds the embedding model's input limits.
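One simple way to split text against a token budget is a greedy, word-by-word pass that starts a new chunk whenever the next word would push the count over the limit. The sketch below shows that idea only; it is not the project's chunker. The function names are hypothetical, and the word-based estimator stands in for token_count() so the example runs without extra dependencies.

```python
from typing import Callable, List


def estimate_tokens(text: str) -> int:
    """Stand-in counter: word count scaled by 1.3 (an assumption for this sketch)."""
    return int(len(text.split()) * 1.3)


def chunk_by_token_budget(
    text: str,
    max_tokens: int,
    count: Callable[[str], int] = estimate_tokens,
) -> List[str]:
    """Greedily pack whole words into chunks of at most max_tokens."""
    chunks: List[str] = []
    current: List[str] = []
    for word in text.split():
        candidate = current + [word]
        # Flush the current chunk if adding this word would exceed the budget.
        # A lone word that is over budget still becomes its own chunk.
        if current and count(" ".join(candidate)) > max_tokens:
            chunks.append(" ".join(current))
            current = [word]
        else:
            current = candidate
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Re-counting the candidate on every word is quadratic in chunk length, which is acceptable for a sketch; a production chunker would more likely split on sentence or paragraph boundaries and count less often.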

Embedding Pipeline

The embedding utility in open_notebook/utils/embedding.py records token sizes for each text chunk before vectorization. These measurements inform decisions about whether to embed content directly or require pre-splitting, optimizing both API usage and storage.

Context Building

open_notebook/utils/context_builder.py employs lazy evaluation, computing token counts only when not explicitly provided. This enables downstream components to perform size checks without redundant re-tokenization of previously processed documents.
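The lazy pattern described above can be illustrated with a small data class that computes its token count only on first request and reuses any value supplied by the caller. ContextItem and estimate_tokens are hypothetical names for this sketch, and the word-based estimator again stands in for token_count().

```python
from dataclasses import dataclass
from typing import Optional


def estimate_tokens(text: str) -> int:
    """Stand-in counter: word count scaled by 1.3 (an assumption for this sketch)."""
    return int(len(text.split()) * 1.3)


@dataclass
class ContextItem:
    content: str
    token_count: Optional[int] = None  # may be supplied up front by the caller

    def tokens(self) -> int:
        """Return the cached count, computing it only if it is missing."""
        if self.token_count is None:
            self.token_count = estimate_tokens(self.content)
        return self.token_count
```

An item created with a known count never triggers tokenization, while an item created without one pays the cost once and caches the result for later size checks.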

Practical Code Examples

The following examples demonstrate common usage patterns for the token_count() function.

Direct token counting:

from open_notebook.utils.token_utils import token_count

text = "Hello, world! This is a short test."
print(token_count(text))   # exact count with tiktoken; a word-based estimate otherwise

Large-context model selection:

from open_notebook.ai.provision import provision_langchain_model

async def get_model_for_content(content):
    # Automatically selects the large-context model if content exceeds 105,000 tokens
    return await provision_langchain_model(
        content,
        model_id=None,
        default_type="chat",
    )

Chunking by token budget:

from open_notebook.utils.chunking import chunk_text
from open_notebook.utils.token_utils import token_count

MAX_TOKENS = 2_000
long_document = "..."  # placeholder: replace with the text to split
chunks = chunk_text(long_document, max_tokens=MAX_TOKENS)

# Verify that every chunk respects the limit
assert all(token_count(c) <= MAX_TOKENS for c in chunks)

Summary

  • The token_count() function resides in open_notebook/utils/token_utils.py and serves as the single source of truth for token estimation across the Open Notebook codebase.
  • It prioritizes exact counts via tiktoken's "o200k_base" encoding, falling back to a word-count heuristic (×1.3) when the library is unavailable.
  • Large-context detection occurs in open_notebook/ai/provision.py, where content exceeding 105,000 tokens triggers automatic selection of a high-capacity model.
  • The utility supports downstream operations including text chunking, embedding size validation, and cost estimation through token_cost().

Frequently Asked Questions

What encoding does the token_count() function use?

The function attempts to use tiktoken's "o200k_base" encoding first, which corresponds to modern OpenAI model tokenization schemes. If tiktoken is not installed, it falls back to a statistical estimate based on word count.

What happens if tiktoken is not installed?

When tiktoken is unavailable, the function calculates a word count and multiplies by 1.3 to approximate token density. This heuristic provides reasonable estimates for English text without requiring external dependencies.

Why is the large-context threshold set to 105,000 tokens?

The 105,000 token threshold in open_notebook/ai/provision.py acts as a safety buffer below common large-context model limits (such as 128k or 200k contexts). This ensures adequate headroom for system prompts and response generation while preventing truncation errors.

How accurate is the fallback word-count heuristic?

The 1.3× multiplier statistically approximates the average ratio of tokens to words in English prose. While less precise than tiktoken for code or non-English text, it provides sufficient accuracy for chunking decisions and rough cost estimation when exact tokenizers are unavailable.
