# How ContextBuilder Assembles RAG Context for Chat Interactions in Open Notebook

> Discover how ContextBuilder in lfnovo/open-notebook efficiently assembles RAG context. Learn its step-by-step process for loading, deduplicating, sorting, and trimming sources for LLM prompts.

- Repository: [Luis Novo/open-notebook](https://github.com/lfnovo/open-notebook)
- Tags: internals
- Published: 2026-06-06

---

**`ContextBuilder` is the core RAG assembler in `lfnovo/open-notebook` that loads sources, notes, and insights, removes duplicates, sorts them by priority, trims them to a token budget, and returns a formatted dictionary ready for LLM prompting.**

The **ContextBuilder** utility in the `lfnovo/open-notebook` repository is responsible for assembling Retrieval-Augmented Generation (RAG) context for chat interactions. Implemented in [`open_notebook/utils/context_builder.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/utils/context_builder.py), it provides a deterministic, configurable workflow that gathers content from notebooks, sources, and notes. Understanding how it works internally helps developers control token usage and improve response quality when building chat features on top of this open-source knowledge base.

## Initialization and Configuration

At lines 65–99, the **ContextBuilder** constructor stores all user-supplied keyword arguments—such as `source_id`, `notebook_id`, and `include_insights`—inside `self.params`. It also instantiates a **ContextConfig** object that supplies defaults like `priority_weights` and `max_tokens`. This configuration object governs every downstream decision about which content to pull and how aggressively to trim it.
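The constructor behavior can be pictured with a small stand-in. This is a minimal sketch, not the repository's code: `SketchBuilder` is a hypothetical name, the field list on `ContextConfig` is inferred from the names mentioned in this article, and the `max_tokens` default of `None` is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ContextConfig:
    """Illustrative stand-in; the real class may define different fields."""

    sources: Dict[str, str] = field(default_factory=dict)
    notes: Dict[str, str] = field(default_factory=dict)
    include_insights: bool = True
    max_tokens: Optional[int] = None  # assumed default: no budget
    priority_weights: Dict[str, int] = field(
        default_factory=lambda: {"source": 100, "insight": 75, "note": 50}
    )


class SketchBuilder:
    """Hypothetical builder that only mirrors the constructor step."""

    def __init__(self, **kwargs: Any) -> None:
        # Every keyword argument is stored untouched for the loaders to read.
        self.params: Dict[str, Any] = kwargs
        # Fall back to a default config when the caller supplies none.
        self.context_config: ContextConfig = (
            kwargs.get("context_config") or ContextConfig()
        )
        self.items: List[Any] = []


builder = SketchBuilder(notebook_id="notebook:1234", include_insights=True)
print(builder.params)
print(builder.context_config.priority_weights)
```

Keeping the raw keyword arguments in one dictionary is what lets later hooks read extra parameters without changes to the constructor signature.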

## Loading and Enriching Items During RAG Context Assembly

The assembly process begins when `await builder.build()` is invoked at lines 105–112. This method clears stale items and delegates to a series of specialized loaders that transform raw domain records into **ContextItem** objects. The loaders rely on methods defined in [`open_notebook/domain/source.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/domain/source.py), [`open_notebook/domain/notebook.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/domain/notebook.py), and [`open_notebook/domain/note.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/domain/note.py)—such as `get_context` and `get_insights`—to supply raw content.
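The "clear, then delegate" shape of `build()` might look roughly like the sketch below. The loader bodies are placeholders, `SketchBuilder` is a hypothetical name, and the `note_id` parameter is an assumption added for illustration; only `source_id` is named in the text above.

```python
import asyncio
from typing import Any, Dict, List


class SketchBuilder:
    """Hypothetical builder showing only the orchestration step."""

    def __init__(self, **kwargs: Any) -> None:
        self.params = kwargs
        self.items: List[Dict[str, str]] = []

    async def _add_source_context(self, source_id: str) -> None:
        # Placeholder: the real loader reads a Source record.
        self.items.append({"id": source_id, "type": "source"})

    async def _add_note_context(self, note_id: str) -> None:
        # Placeholder: the real loader reads a Note record.
        self.items.append({"id": note_id, "type": "note"})

    async def build(self) -> List[Dict[str, str]]:
        self.items = []  # drop anything left over from an earlier build
        if self.params.get("source_id"):
            await self._add_source_context(self.params["source_id"])
        if self.params.get("note_id"):
            await self._add_note_context(self.params["note_id"])
        return list(self.items)


async def demo():
    builder = SketchBuilder(source_id="source:1", note_id="note:7")
    first = await builder.build()
    second = await builder.build()  # a second call must not accumulate items
    return first, second


first, second = asyncio.run(demo())
print(second)
```

Resetting the item list at the top of `build()` is what makes repeated calls on the same builder return the same result.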

### Adding Source Context

`_add_source_context` (lines 142–184) loads the **Source** record and determines whether to fetch a *short* or *long* context based on the inclusion level specified in `ContextConfig`. It creates a `ContextItem` for the source itself and, when `include_insights` is `True`, creates additional `ContextItem` instances for each attached insight.
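As a rough illustration of that decision, the sketch below uses a fake source record. `FakeSource`, the `context_size` argument, the insight ID format, and the rule "only `full content` maps to the long form" are all assumptions made for this example, and the real loader is asynchronous.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FakeSource:
    """Hypothetical stand-in for the Source domain record."""

    id: str
    short: str
    long: str
    insights: List[str] = field(default_factory=list)

    def get_context(self, context_size: str = "short") -> str:
        return self.long if context_size == "long" else self.short


def add_source_context(
    source: FakeSource, inclusion_level: str, include_insights: bool
) -> List[Dict[str, str]]:
    # Assumption: only "full content" selects the long representation.
    size = "long" if inclusion_level == "full content" else "short"
    items = [
        {"id": source.id, "type": "source", "content": source.get_context(size)}
    ]
    if include_insights:
        # One extra item per attached insight.
        for index, text in enumerate(source.insights):
            items.append(
                {"id": f"{source.id}:insight:{index}", "type": "insight", "content": text}
            )
    return items


src = FakeSource("source:1", short="summary", long="full text", insights=["key point"])
full_items = add_source_context(src, "full content", include_insights=True)
short_items = add_source_context(src, "short content", include_insights=False)
print(full_items)
```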

### Adding Notebook Context

`_add_notebook_context` (lines 210–250) loads the **Notebook** record, then iterates over the configured sources and notes defined in `self.context_config.sources` and `self.context_config.notes`. If no explicit configuration is provided, it falls back to automatically including all sources with insights and all notes with short content. Each retrieved entity is forwarded to the appropriate helper method for conversion into a `ContextItem`.
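The fallback rule can be sketched as a small pure function. `resolve_inclusion` is a hypothetical helper written for this article, and the level strings are taken from the configuration examples shown later; the real method works on domain records rather than plain IDs.

```python
from typing import Dict, Iterable


def resolve_inclusion(
    configured: Dict[str, str], available_ids: Iterable[str], default_level: str
) -> Dict[str, str]:
    """Use the explicit mapping when present, else include everything."""
    if configured:
        return dict(configured)
    return {item_id: default_level for item_id in available_ids}


# No explicit source config: every source falls back to "insights".
sources = resolve_inclusion({}, ["source:1", "source:2"], "insights")

# Explicit note config: only the listed note is included.
notes = resolve_inclusion(
    {"note:5": "full content"}, ["note:5", "note:6"], "short content"
)
print(sources, notes)
```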

### Adding Note Context

Analogous to sources, `_add_note_context` (lines 254–284) fetches the **Note** object, selects a short or long representation, and builds a `ContextItem`. The resulting item is then stored in the builder's internal collection for downstream processing.

### Custom Parameter Hooks

For extensibility, `_process_custom_params` (lines 296–303) provides an override hook that subclasses can use to support extra keyword arguments. This allows developers to inject custom logic—such as `custom_filters`—without modifying the core builder implementation.
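A subclass override could look something like the following. This is a simplified, synchronous sketch: `BaseBuilder`, `FilteredBuilder`, the `seed_items` argument, and the shape of `custom_filters` are all hypothetical, and the point at which the real `build()` calls the hook is assumed.

```python
from typing import Any, Dict, List


class BaseBuilder:
    """Hypothetical base class with a no-op extension hook."""

    def __init__(self, **kwargs: Any) -> None:
        self.params = kwargs
        self.items: List[Dict[str, str]] = []

    def _process_custom_params(self) -> None:
        """Default hook does nothing; subclasses may override it."""

    def build(self) -> List[Dict[str, str]]:
        # `seed_items` stands in for the items the real loaders would add.
        self.items = [dict(item) for item in self.params.get("seed_items", [])]
        self._process_custom_params()
        return self.items


class FilteredBuilder(BaseBuilder):
    """Keeps only items of the type named in `custom_filters`."""

    def _process_custom_params(self) -> None:
        filters = self.params.get("custom_filters") or {}
        wanted = filters.get("type")
        if wanted:
            self.items = [item for item in self.items if item["type"] == wanted]


seed = [{"id": "source:1", "type": "source"}, {"id": "note:7", "type": "note"}]
unfiltered = BaseBuilder(seed_items=seed).build()
filtered = FilteredBuilder(seed_items=seed, custom_filters={"type": "note"}).build()
print(filtered)
```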

## Deduplication, Prioritization, and Truncation for Chat RAG

Once all items are loaded, the pipeline moves to three critical quality-control stages.

### Removing Duplicates

`remove_duplicates` (lines 351–363) collapses items that share the same ID. This ensures that a source or note referenced multiple times—directly and through a notebook—is only sent to the LLM once.
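One common way to implement this kind of ID-based deduplication is shown below. It is a sketch on plain dictionaries, and keeping the first occurrence of each ID is an assumption; the article does not say which copy the real method retains.

```python
from typing import Dict, List


def remove_duplicates(items: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep the first item seen for each ID, preserving order."""
    seen = set()
    unique = []
    for item in items:
        if item["id"] not in seen:
            seen.add(item["id"])
            unique.append(item)
    return unique


loaded = [
    {"id": "source:1", "via": "direct"},
    {"id": "note:7", "via": "notebook"},
    {"id": "source:1", "via": "notebook"},  # same source reached a second way
]
deduped = remove_duplicates(loaded)
print(deduped)
```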

### Priority Sorting

`prioritize` (lines 315–319) sorts the item list by each `ContextItem`'s `priority` value. Default weights are **source = 100**, **insight = 75**, and **note = 50**. Higher-priority items remain in context longer when the token budget is tight.
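The sort itself is simple to picture. The sketch below derives each item's priority from its type using the default weights quoted above, and it assumes descending order (highest priority first); the real method reads a `priority` attribute on each `ContextItem`.

```python
from typing import Dict, List

DEFAULT_WEIGHTS = {"source": 100, "insight": 75, "note": 50}


def prioritize(
    items: List[Dict[str, str]], weights: Dict[str, int] = DEFAULT_WEIGHTS
) -> List[Dict[str, str]]:
    """Return items sorted by weight, highest first; ties keep their order."""
    return sorted(items, key=lambda item: weights.get(item["type"], 0), reverse=True)


mixed = [
    {"id": "note:1", "type": "note"},
    {"id": "source:1", "type": "source"},
    {"id": "insight:1", "type": "insight"},
    {"id": "source:2", "type": "source"},
]
ordered = prioritize(mixed)
print([item["id"] for item in ordered])
```

Python's sort is stable, so items with equal weight stay in the order in which they were loaded.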

### Enforcing the Token Budget

`truncate_to_fit` (lines 321–349) removes the lowest-priority items until the total token count fits under the `max_tokens` limit. The pipeline knows the exact size of each chunk because the `ContextItem` dataclass (lines 21–30) automatically computes `token_count` on instantiation via `open_notebook.utils.token_utils.token_count`. This guarantees accurate, deterministic trimming before the context reaches the model.
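The interaction between per-item token counts and the budget can be sketched as follows. The whitespace-based `count_tokens` is a hypothetical stand-in for the project's tokenizer-backed `token_count`, and the `ContextItem` fields shown here are assumed for illustration.

```python
from dataclasses import dataclass, field
from typing import List


def count_tokens(text: str) -> int:
    """Hypothetical stand-in: counts whitespace-separated words."""
    return len(text.split())


@dataclass
class ContextItem:
    id: str
    type: str
    content: str
    priority: int = 0
    token_count: int = field(init=False)

    def __post_init__(self) -> None:
        # Computed once, at creation time, so later stages never re-count.
        self.token_count = count_tokens(self.content)


def truncate_to_fit(items: List[ContextItem], max_tokens: int) -> List[ContextItem]:
    """Drop the lowest-priority items until the total fits the budget."""
    kept = sorted(items, key=lambda item: item.priority, reverse=True)
    total = sum(item.token_count for item in kept)
    while kept and total > max_tokens:
        total -= kept.pop().token_count  # last element has the lowest priority
    return kept


items = [
    ContextItem("note:1", "note", "i j k l", priority=50),
    ContextItem("source:1", "source", "a b c d e", priority=100),
    ContextItem("insight:1", "insight", "f g h", priority=75),
]
kept = truncate_to_fit(items, max_tokens=8)
print([item.id for item in kept])
```

With a budget of 8 and item sizes of 5, 3, and 4 tokens, only the note has to be dropped, because the source and the insight together use exactly 8.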

## Formatting the Final Response

`_format_response` (lines 367–416) groups the surviving items by type, aggregates token counts, and appends metadata such as the number of sources, notes, insights, and active config flags. The result is a plain dictionary matching the `ContextResponse` schema defined in [`api/models.py`](https://github.com/lfnovo/open-notebook/blob/main/api/models.py) and used by FastAPI. This structure makes it trivial to attach to a prompt template or return directly to a client.
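A simplified version of that grouping step is sketched below. The top-level keys follow the example response shown later in this article, while the metadata key names (`source_count`, `note_count`, `insight_count`) are assumptions made for this example.

```python
from typing import Any, Dict, List, Optional

PLURAL = {"source": "sources", "note": "notes", "insight": "insights"}


def format_response(
    items: List[Dict[str, Any]], notebook_id: Optional[str] = None
) -> Dict[str, Any]:
    """Group items by type and attach totals and counts."""
    response: Dict[str, Any] = {"sources": [], "notes": [], "insights": []}
    for item in items:
        response[PLURAL[item["type"]]].append(item)
    response["total_tokens"] = sum(item["token_count"] for item in items)
    response["metadata"] = {
        "source_count": len(response["sources"]),
        "note_count": len(response["notes"]),
        "insight_count": len(response["insights"]),
    }
    if notebook_id is not None:
        response["notebook_id"] = notebook_id
    return response


result = format_response(
    [
        {"id": "source:1", "type": "source", "token_count": 5},
        {"id": "insight:1", "type": "insight", "token_count": 3},
    ],
    notebook_id="notebook:1234",
)
print(result["total_tokens"], result["metadata"])
```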

## Top-Level Entry Points for RAG Assembly

While developers can instantiate **ContextBuilder** directly, the module also exposes three high-level async helpers at lines 420–595. Each wraps the class in a single call for a common chat workflow.

- `build_notebook_context(notebook_id, ...)` – Assembles a full notebook-wide RAG payload.
- `build_source_context(source_id, ...)` – Builds context for a single source with optional insights.
- `build_mixed_context(...)` – Lets callers combine arbitrary source and note IDs under a custom `ContextConfig`.

Each helper instantiates **ContextBuilder** with the appropriate parameters and invokes `await builder.build()`.
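The wrapper pattern is small enough to show in full. In this sketch `SketchBuilder` and `build_notebook_context_sketch` are hypothetical names, and the dictionary returned by `build()` is a placeholder rather than the real response.

```python
import asyncio
from typing import Any, Dict, Optional


class SketchBuilder:
    """Hypothetical builder whose build() just echoes its parameters."""

    def __init__(self, **kwargs: Any) -> None:
        self.params = kwargs

    async def build(self) -> Dict[str, Any]:
        return {
            "notebook_id": self.params.get("notebook_id"),
            "max_tokens": self.params.get("max_tokens"),
        }


async def build_notebook_context_sketch(
    notebook_id: str, max_tokens: Optional[int] = None
) -> Dict[str, Any]:
    """One-shot helper: construct the builder, then await build()."""
    builder = SketchBuilder(notebook_id=notebook_id, max_tokens=max_tokens)
    return await builder.build()


response = asyncio.run(build_notebook_context_sketch("notebook:1234", max_tokens=4000))
print(response)
```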

## Practical Code Examples

### Notebook-Wide Context with a Token Ceiling

```python
import asyncio

from open_notebook.utils.context_builder import build_notebook_context


async def main() -> None:
    # Assemble RAG context for notebook "notebook:1234" with a 4,000-token ceiling
    context = await build_notebook_context(
        notebook_id="notebook:1234",
        max_tokens=4000,
    )

    # `context` is a dict:
    # {
    #   "sources": [{...}, ...],
    #   "notes":   [{...}, ...],
    #   "insights": [{...}, ...],
    #   "total_tokens": 3892,
    #   "metadata": {...},
    #   "notebook_id": "notebook:1234"
    # }
    print(context["total_tokens"])


# The helper is a coroutine, so it needs an event loop to run.
asyncio.run(main())
```

This call triggers the complete flow described above, from loading through truncation.

### Source-Only Context Including Insights

```python
import asyncio

from open_notebook.utils.context_builder import build_source_context


async def main() -> None:
    ctx = await build_source_context(
        source_id="source:abcde",
        include_insights=True,  # pull insights as separate items
        max_tokens=2000,
    )

    # `ctx["insights"]` now holds a list of insight dicts, each with its own token count.
    print(len(ctx["insights"]))


asyncio.run(main())
```

### Mixed Custom Context from Selective IDs

```python
import asyncio

from open_notebook.utils.context_builder import build_mixed_context


async def main() -> None:
    ctx = await build_mixed_context(
        source_ids=["source:1", "source:2"],
        note_ids=["note:7"],
        max_tokens=3000,
    )

    # The builder respects the custom `ContextConfig` (sources → insights only,
    # notes → full content) and will drop the lowest-priority items if the
    # token budget is exceeded.
    print(ctx["total_tokens"])


asyncio.run(main())
```

### Calling the FastAPI Context Endpoint

```http
POST /api/context/notebooks/{notebook_id}
Content-Type: application/json

{
  "context_config": {
    "sources": { "source:1": "insights", "source:2": "full content" },
    "notes":   { "note:5": "full content" },
    "include_insights": true,
    "max_tokens": 3500
  }
}
```

The router implementation in [`api/routers/context.py`](https://github.com/lfnovo/open-notebook/blob/main/api/routers/context.py) (lines 12–115) parses this request, forwards the IDs to **ContextBuilder**, and returns a `ContextResponse` model that mirrors the dictionary produced by `_format_response`.
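From Python, the same request could be prepared with the standard library alone. This sketch only constructs the request object and does not send it; `make_context_request` is a hypothetical helper, the base URL is a placeholder for wherever the API is served, and passing the notebook ID unencoded in the path is an assumption.

```python
import json
import urllib.request
from typing import Any, Dict


def make_context_request(
    base_url: str, notebook_id: str, context_config: Dict[str, Any]
) -> urllib.request.Request:
    """Build (but do not send) the POST request for the context endpoint."""
    body = json.dumps({"context_config": context_config}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/context/notebooks/{notebook_id}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = make_context_request(
    "http://localhost:8000",  # placeholder base URL
    "notebook:1234",
    {
        "sources": {"source:1": "insights", "source:2": "full content"},
        "notes": {"note:5": "full content"},
        "include_insights": True,
        "max_tokens": 3500,
    },
)
print(request.get_method(), request.full_url)
```

Sending it is then a matter of passing `request` to `urllib.request.urlopen` against a running server and decoding the JSON response.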

## Summary

- `ContextBuilder` lives in [`open_notebook/utils/context_builder.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/utils/context_builder.py) and orchestrates a deterministic, multi-stage RAG pipeline.
- It loads **Source**, **Notebook**, and **Note** records via `_add_source_context`, `_add_notebook_context`, and `_add_note_context`.
- The deduplication pass prevents redundant content from reaching the LLM.
- Priority weighting (**sources > insights > notes**) determines which items survive token truncation: lower-priority items are dropped first.
- `ContextItem` automatically calculates token counts via `open_notebook.utils.token_utils.token_count`, enabling precise budget enforcement in `truncate_to_fit`.
- High-level helpers—`build_notebook_context`, `build_source_context`, and `build_mixed_context`—offer convenient async entry points for chat services.

## Frequently Asked Questions

### How does ContextBuilder decide between short and long content?

`_add_source_context` and `_add_note_context` inspect the inclusion level defined in `ContextConfig` for each specific source or note. If the config requests `"short content"`, the builder loads the abbreviated representation; if it requests `"full content"` or `"insights"`, it pulls the corresponding larger payload. When no explicit config exists, the notebook loader falls back to short content for notes and insights for sources.

### What prevents the same source from being counted twice?

The `remove_duplicates` method (lines 351–363) collapses items that share the same unique ID. This is especially important when a source is referenced both directly by ID and indirectly through its parent notebook. The deduplication pass guarantees that the LLM prompt contains only one copy of each document.

### How is the token limit enforced accurately?

Each `ContextItem` computes its token count at instantiation by calling `token_count` from `open_notebook.utils.token_utils`. After sorting by priority, `truncate_to_fit` (lines 321–349) iteratively drops the lowest-priority items until the running total is less than or equal to `max_tokens`. Because trimming works from measured counts rather than estimates, the assembled context stays within the configured `max_tokens` budget.

### Can I extend ContextBuilder with custom parameters?

Yes. The `_process_custom_params` hook (lines 296–303) is designed for subclassing. Developers can override this method to consume additional keyword arguments—such as `custom_filters`—without changing the base builder implementation, making the RAG pipeline extensible for specialized chat workflows.