How ContextBuilder Assembles RAG Context for Chat Interactions in Open Notebook

ContextBuilder is the core RAG assembler in lfnovo/open-notebook that loads sources, notes, and insights, removes duplicates, sorts them by priority, trims them to a token budget, and returns a formatted dictionary ready for LLM prompting.

The ContextBuilder utility in the lfnovo/open-notebook repository is responsible for assembling Retrieval Augmented Generation (RAG) context for chat interactions. Implemented in open_notebook/utils/context_builder.py, it provides a deterministic, configurable workflow that gathers content from notebooks, sources, and notes. Understanding its internal mechanism helps developers optimize token usage and improve response quality when building chat features on top of this open-source knowledge base.

Initialization and Configuration

At lines 65–99, the ContextBuilder constructor stores all user-supplied keyword arguments—such as source_id, notebook_id, and include_insights—inside self.params. It also instantiates a ContextConfig object that supplies defaults like priority_weights and max_tokens. This configuration object governs every downstream decision about which content to pull and how aggressively to trim it.
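The constructor behavior described above can be sketched roughly as follows. This is a simplified illustration, not the repository's code: SketchContextConfig and SketchContextBuilder are hypothetical stand-ins, the include_insights default is assumed, and the rule that an explicit max_tokens keyword overrides the config value is also an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class SketchContextConfig:
    """Hypothetical stand-in for ContextConfig; field names follow the article."""
    sources: Dict[str, str] = field(default_factory=dict)
    notes: Dict[str, str] = field(default_factory=dict)
    include_insights: bool = True  # assumed default
    max_tokens: Optional[int] = None
    priority_weights: Dict[str, int] = field(
        default_factory=lambda: {"source": 100, "insight": 75, "note": 50}
    )


class SketchContextBuilder:
    """Keeps caller kwargs in self.params and resolves a config object."""

    def __init__(self, **kwargs: Any) -> None:
        self.params = kwargs
        # Use a caller-supplied config if present, otherwise fall back to defaults.
        self.context_config = kwargs.get("context_config") or SketchContextConfig()
        # Assumption: an explicit max_tokens kwarg overrides the config value.
        if kwargs.get("max_tokens") is not None:
            self.context_config.max_tokens = kwargs["max_tokens"]
        self.items: list = []


builder = SketchContextBuilder(notebook_id="notebook:1234", max_tokens=4000)
```

Keeping the raw keyword arguments in one dictionary is what lets later stages, including subclass hooks, read parameters the base class does not know about.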

Loading and Enriching Items During RAG Context Assembly

The assembly process begins when await builder.build() is called. The method, defined at lines 105–112, clears stale items and delegates to a series of specialized loaders that transform raw domain records into ContextItem objects. The loaders rely on methods defined in open_notebook/domain/source.py, open_notebook/domain/notebook.py, and open_notebook/domain/note.py (such as get_context and get_insights) to supply raw content.

Adding Source Context

_add_source_context (lines 142–184) loads the Source record and determines whether to fetch a short or long context based on the inclusion level specified in ContextConfig. It creates a ContextItem for the source itself and, when include_insights is True, creates additional ContextItem instances for each attached insight.
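A minimal sketch of that branching, under stated assumptions: FakeSource is a hypothetical in-memory stand-in for the Source record, its synchronous get_context signature is invented for illustration, and the mapping of "full content" to the long context (with everything else treated as short) is an assumption rather than confirmed behavior.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FakeSource:
    """Hypothetical stand-in for a Source record (not the real domain class)."""
    id: str
    summary: str
    full_text: str
    insights: List[str]

    def get_context(self, context_size: str = "short") -> str:
        # Return the long or short representation of the source.
        return self.full_text if context_size == "long" else self.summary


def source_items(source: FakeSource, level: str, include_insights: bool) -> List[Dict[str, str]]:
    # Assumption: only "full content" selects the long context.
    size = "long" if level == "full content" else "short"
    items = [{"id": source.id, "type": "source", "content": source.get_context(size)}]
    if include_insights:
        # One extra item per attached insight.
        for index, text in enumerate(source.insights):
            items.append(
                {"id": f"{source.id}:insight:{index}", "type": "insight", "content": text}
            )
    return items


src = FakeSource("source:abcde", "short summary", "the full text", ["insight one", "insight two"])
full_items = source_items(src, "full content", include_insights=True)
```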

Adding Notebook Context

_add_notebook_context (lines 210–250) loads the Notebook record, then iterates over the configured sources and notes defined in self.context_config.sources and self.context_config.notes. If no explicit configuration is provided, it falls back to automatically including all sources with insights and all notes with short content. Each retrieved entity is forwarded to the appropriate helper method for conversion into a ContextItem.
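The fallback rule can be illustrated with a small helper. The function name resolve_inclusion and its arguments are hypothetical, and treating "no explicit configuration" as "both mappings are empty" is an assumption made for this sketch.

```python
from typing import Dict, List, Tuple


def resolve_inclusion(
    configured_sources: Dict[str, str],
    configured_notes: Dict[str, str],
    notebook_source_ids: List[str],
    notebook_note_ids: List[str],
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (sources, notes) mappings of ID to inclusion level."""
    # Explicit configuration wins whenever the caller supplied any.
    if configured_sources or configured_notes:
        return dict(configured_sources), dict(configured_notes)
    # Fallback: every source with insights, every note with short content.
    sources = {source_id: "insights" for source_id in notebook_source_ids}
    notes = {note_id: "short content" for note_id in notebook_note_ids}
    return sources, notes


default_sources, default_notes = resolve_inclusion(
    {}, {}, ["source:1", "source:2"], ["note:5"]
)
```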

Adding Note Context

Analogous to sources, _add_note_context (lines 254–284) fetches the Note object, selects a short or long representation, and builds a ContextItem. The resulting item is then stored in the builder's internal collection for downstream processing.

Custom Parameter Hooks

For extensibility, _process_custom_params (lines 296–303) provides an override hook that subclasses can use to support extra keyword arguments. This allows developers to inject custom logic—such as custom_filters—without modifying the core builder implementation.
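The override pattern might look like the sketch below. BaseBuilder is a hypothetical, synchronous stand-in (the real builder is async), the point at which the hook runs inside build() is assumed, and the meaning given to custom_filters (a set of allowed item types) is invented purely for illustration.

```python
from typing import Any, Dict, List


class BaseBuilder:
    """Hypothetical, simplified stand-in for ContextBuilder."""

    def __init__(self, **kwargs: Any) -> None:
        self.params = kwargs
        self.items: List[Dict[str, str]] = []

    def build(self) -> List[Dict[str, str]]:
        # Pretend two items were loaded, then give subclasses a chance to react.
        self.items = [
            {"id": "source:1", "type": "source"},
            {"id": "note:7", "type": "note"},
        ]
        self._process_custom_params()
        return self.items

    def _process_custom_params(self) -> None:
        """Default hook does nothing; subclasses may override it."""


class FilteringBuilder(BaseBuilder):
    def _process_custom_params(self) -> None:
        # Assumed semantics: custom_filters is a set of item types to keep.
        allowed = self.params.get("custom_filters")
        if allowed:
            self.items = [item for item in self.items if item["type"] in allowed]


filtered = FilteringBuilder(custom_filters={"source"}).build()
```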

Deduplication, Prioritization, and Truncation for Chat RAG

Once all items are loaded, the pipeline moves to three critical quality-control stages.

Removing Duplicates

remove_duplicates (lines 351–363) collapses items that share the same ID. This ensures that a source or note referenced multiple times—directly and through a notebook—is only sent to the LLM once.
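A minimal sketch of ID-based deduplication is shown here. It operates on plain dictionaries rather than ContextItem objects, and keeping the first occurrence of each ID is an assumption; the article does not say which copy survives.

```python
from typing import Dict, List


def remove_duplicates(items: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep the first item seen for each ID, preserving input order."""
    seen = set()
    unique = []
    for item in items:
        if item["id"] not in seen:
            seen.add(item["id"])
            unique.append(item)
    return unique


loaded = [
    {"id": "source:1", "via": "direct"},
    {"id": "note:7", "via": "notebook"},
    {"id": "source:1", "via": "notebook"},  # same source reached through the notebook
]
deduped = remove_duplicates(loaded)
```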

Priority Sorting

prioritize (lines 315–319) sorts the item list by each ContextItem's priority value so that higher-priority items come first. Default weights are source = 100, insight = 75, and note = 50. When the token budget is tight, lower-priority items are dropped first, so sources outlast insights and insights outlast notes.
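The ordering step can be sketched as below, using the default weights quoted above. The dictionary-based items and the helper names are illustrative assumptions; the real method works on ContextItem objects.

```python
from typing import Dict, List

# Default weights as described in the article.
DEFAULT_WEIGHTS = {"source": 100, "insight": 75, "note": 50}


def with_priority(item: Dict[str, object], weights: Dict[str, int] = DEFAULT_WEIGHTS) -> Dict[str, object]:
    """Attach a priority based on the item's type (unknown types get 0)."""
    return {**item, "priority": weights.get(str(item["type"]), 0)}


def prioritize(items: List[Dict[str, object]]) -> List[Dict[str, object]]:
    """Highest priority first; sorted() is stable, so ties keep their order."""
    return sorted(items, key=lambda item: item["priority"], reverse=True)


raw = [
    {"id": "note:7", "type": "note"},
    {"id": "source:1:insight:0", "type": "insight"},
    {"id": "source:1", "type": "source"},
]
ordered = prioritize([with_priority(item) for item in raw])
```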

Enforcing the Token Budget

truncate_to_fit (lines 321–349) removes the lowest-priority items until the total token count fits under the max_tokens limit. The pipeline knows the exact size of each chunk because the ContextItem dataclass (lines 21–30) automatically computes token_count on instantiation via open_notebook.utils.token_utils.token_count. This guarantees accurate, deterministic trimming before the context reaches the model.
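The interplay between per-item token counts and truncation can be sketched as follows. SketchItem is a hypothetical stand-in for ContextItem, and approx_token_count is a whitespace-based placeholder, not the tokenizer behind open_notebook.utils.token_utils.token_count.

```python
from dataclasses import dataclass, field
from typing import List


def approx_token_count(text: str) -> int:
    # Placeholder tokenizer: counts whitespace-separated words.
    return len(text.split())


@dataclass
class SketchItem:
    id: str
    type: str
    content: str
    priority: int
    token_count: int = field(init=False)

    def __post_init__(self) -> None:
        # Computed once at construction, so later stages can trust it.
        self.token_count = approx_token_count(self.content)


def truncate_to_fit(items: List[SketchItem], max_tokens: int) -> List[SketchItem]:
    """Drop lowest-priority items until the total fits under max_tokens."""
    kept = sorted(items, key=lambda item: item.priority, reverse=True)
    total = sum(item.token_count for item in kept)
    while kept and total > max_tokens:
        dropped = kept.pop()  # last element has the lowest priority
        total -= dropped.token_count
    return kept


items = [
    SketchItem("note:7", "note", "four words of note", 50),
    SketchItem("source:1", "source", "three word source", 100),
    SketchItem("source:1:insight:0", "insight", "two words", 75),
]
fitted = truncate_to_fit(items, max_tokens=5)
```

In this sketch whole items are dropped rather than cut mid-text, which matches the item-level removal the section describes.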

Formatting the Final Response

_format_response (lines 367–416) groups the surviving items by type, aggregates token counts, and appends metadata such as the number of sources, notes, insights, and active config flags. The result is a plain dictionary matching the ContextResponse schema defined in api/models.py and used by FastAPI. This structure makes it trivial to attach to a prompt template or return directly to a client.
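A rough sketch of the grouping and aggregation step follows. The top-level keys mirror the response shape shown in the examples later in this article, but the metadata key names (source_count and so on) are assumptions and may differ from the real ContextResponse schema.

```python
from typing import Any, Dict, List, Optional


def format_response(items: List[Dict[str, Any]], notebook_id: Optional[str] = None) -> Dict[str, Any]:
    """Group items by type and attach totals plus simple metadata."""
    key_for_type = {"source": "sources", "note": "notes", "insight": "insights"}
    grouped: Dict[str, List[Dict[str, Any]]] = {"sources": [], "notes": [], "insights": []}
    for item in items:
        grouped[key_for_type[item["type"]]].append(item)

    response: Dict[str, Any] = {
        **grouped,
        "total_tokens": sum(item["token_count"] for item in items),
        # Hypothetical metadata keys, chosen for illustration only.
        "metadata": {
            "source_count": len(grouped["sources"]),
            "note_count": len(grouped["notes"]),
            "insight_count": len(grouped["insights"]),
        },
    }
    if notebook_id is not None:
        response["notebook_id"] = notebook_id
    return response


response = format_response(
    [
        {"id": "source:1", "type": "source", "token_count": 120},
        {"id": "source:1:insight:0", "type": "insight", "token_count": 40},
        {"id": "note:7", "type": "note", "token_count": 30},
    ],
    notebook_id="notebook:1234",
)
```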

Top-Level Entry Points for RAG Assembly

While developers can instantiate ContextBuilder directly, the module also exposes three high-level async helpers at lines 420–595. Each one wraps the class in a single call for a common chat workflow.

  • build_notebook_context(notebook_id, ...) – Assembles a full notebook-wide RAG payload.
  • build_source_context(source_id, ...) – Builds context for a single source with optional insights.
  • build_mixed_context(...) – Lets callers combine arbitrary source and note IDs under a custom ContextConfig.

Each helper instantiates ContextBuilder with the appropriate parameters and invokes await builder.build().
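The wrapping pattern can be illustrated with a self-contained sketch. SketchBuilder and build_notebook_context_sketch are hypothetical names, and the builder here simply echoes its parameters instead of loading any records.

```python
import asyncio
from typing import Any, Dict, Optional


class SketchBuilder:
    """Hypothetical stand-in whose build() just echoes its parameters."""

    def __init__(self, **kwargs: Any) -> None:
        self.params = kwargs

    async def build(self) -> Dict[str, Any]:
        return {
            "notebook_id": self.params.get("notebook_id"),
            "max_tokens": self.params.get("max_tokens"),
        }


async def build_notebook_context_sketch(
    notebook_id: str, max_tokens: Optional[int] = None
) -> Dict[str, Any]:
    # One-shot helper: construct the builder, run it, return the result.
    builder = SketchBuilder(notebook_id=notebook_id, max_tokens=max_tokens)
    return await builder.build()


result = asyncio.run(build_notebook_context_sketch("notebook:1234", max_tokens=4000))
```

Outside an async framework, asyncio.run is one way to drive such a helper from synchronous code; inside a FastAPI route the helper would simply be awaited.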

Practical Code Examples

Notebook-Wide Context with a Token Ceiling

from open_notebook.utils.context_builder import build_notebook_context

# Assemble RAG context for notebook "notebook:1234" with a 4,000-token ceiling

context = await build_notebook_context(
    notebook_id="notebook:1234",
    max_tokens=4000,
)

# `context` is a dict:
# {
#   "sources": [{...}, ...],
#   "notes":   [{...}, ...],
#   "insights": [{...}, ...],
#   "total_tokens": 3892,
#   "metadata": {...},
#   "notebook_id": "notebook:1234"
# }

This call triggers the complete flow described above, from loading through truncation.

Source-Only Context Including Insights

from open_notebook.utils.context_builder import build_source_context

ctx = await build_source_context(
    source_id="source:abcde",
    include_insights=True,      # pull insights as separate items
    max_tokens=2000,
)

# `ctx["insights"]` now holds a list of insight dicts, each with its own token count.

Mixed Custom Context from Selective IDs

from open_notebook.utils.context_builder import build_mixed_context

ctx = await build_mixed_context(
    source_ids=["source:1", "source:2"],
    note_ids=["note:7"],
    max_tokens=3000,
)

# The builder respects the custom `ContextConfig` (sources → insights only,
# notes → full content) and will drop the lowest-priority items if the token budget is exceeded.

Calling the FastAPI Context Endpoint

POST /api/context/notebooks/{notebook_id}
Content-Type: application/json

{
  "context_config": {
    "sources": { "source:1": "insights", "source:2": "full content" },
    "notes":   { "note:5": "full content" },
    "include_insights": true,
    "max_tokens": 3500
  }
}

The router implementation in api/routers/context.py (lines 12–115) parses this request, forwards the IDs to ContextBuilder, and returns a ContextResponse model that mirrors the dictionary produced by _format_response.

Summary

  • ContextBuilder lives in open_notebook/utils/context_builder.py and orchestrates a deterministic, multi-stage RAG pipeline.
  • It loads Source, Notebook, and Note records via _add_source_context, _add_notebook_context, and _add_note_context.
  • The deduplication pass prevents redundant content from reaching the LLM.
  • Priority weighting (sources > insights > notes) ensures the most relevant information survives token truncation.
  • ContextItem automatically calculates token counts via open_notebook.utils.token_utils.token_count, enabling precise budget enforcement in truncate_to_fit.
  • High-level helpers—build_notebook_context, build_source_context, and build_mixed_context—offer convenient async entry points for chat services.

Frequently Asked Questions

How does ContextBuilder decide between short and long content?

_add_source_context and _add_note_context inspect the inclusion level defined in ContextConfig for each specific source or note. If the config requests "short content", the builder loads the abbreviated representation; if it requests "full content" or "insights", it pulls the corresponding larger payload. When no explicit config exists, the notebook loader falls back to short content for notes and insights for sources.

What prevents the same source from being counted twice?

The remove_duplicates method (lines 351–363) collapses items that share the same unique ID. This is especially important when a source is referenced both directly by ID and indirectly through its parent notebook. The deduplication pass guarantees that the LLM prompt contains only one copy of each document.

How is the token limit enforced accurately?

Each ContextItem computes its exact token count at instantiation by calling token_count from open_notebook.utils.token_utils. After sorting by priority, truncate_to_fit (lines 321–349) iteratively drops the lowest-priority items until the running total is less than or equal to max_tokens. This design eliminates guesswork and keeps the context safely within the model's window.

Can I extend ContextBuilder with custom parameters?

Yes. The _process_custom_params hook (lines 296–303) is designed for subclassing. Developers can override this method to consume additional keyword arguments—such as custom_filters—without changing the base builder implementation, making the RAG pipeline extensible for specialized chat workflows.
