# How the Source Chat Workflow Enables Context-Aware Conversations in Open-Notebook

> Learn how Open Notebook’s source chat workflow uses LangGraph to build context-aware conversations, injecting source content directly into LLM prompts for better insights.

- Repository: [Luis Novo/open-notebook](https://github.com/lfnovo/open-notebook)
- Tags: how-to-guide
- Published: 2026-06-10

---

**The source chat workflow in Open-Notebook creates isolated conversational sessions bound to specific sources through a LangGraph-powered state machine, using async context builders to inject source content and insights directly into LLM prompts.**

Open-Notebook's **source-chat workflow** transforms static documents into interactive knowledge bases. By linking chat sessions directly to individual sources—whether PDFs, web pages, or videos—the system grounds AI responses in verified content rather than hallucinated knowledge. This architecture, implemented across the `lfnovo/open-notebook` repository, combines graph-based state management with async context retrieval to deliver precise, source-grounded conversations.

## Architecture Overview

The workflow operates through three tightly coupled layers that isolate concerns while maintaining state consistency:

| Layer | Responsibility | Key Files |
|-------|---------------|-----------|
| **API Layer** | HTTP endpoints for session management and message streaming | [`api/routers/source_chat.py`](https://github.com/lfnovo/open-notebook/blob/main/api/routers/source_chat.py) |
| **Graph Layer** | Conversational state, context assembly, and LLM invocation | [`open_notebook/graphs/source_chat.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/graphs/source_chat.py) |
| **Utility Layer** | Source retrieval, formatting, and model provisioning | [`open_notebook/utils/context_builder.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/utils/context_builder.py), [`open_notebook/ai/provision.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/ai/provision.py) |

This separation ensures that business logic remains testable while the graph layer handles complex state transitions.

## Creating Isolated Chat Sessions

Every source-specific conversation begins with a POST request to `/sources/{source_id}/chat/sessions`. The endpoint creates a `ChatSession` record and establishes a **refers_to** relationship to the target `Source` in the database:

```python

# From api/routers/source_chat.py (lines 94-114)

# Creates session with refers_to edge linking to source

session = ChatSession.create(
    source_id=source_id,
    user_id=current_user.id
)
session.add_relationship("refers_to", source)

```

The session ID serves as the **thread ID** for LangGraph's checkpointing system. This design guarantees that conversation history, context windows, and model state remain isolated per source-chat session. Even if multiple users chat with the same source simultaneously, their threads never collide.

## Building Source-Specific Context

When a user sends a message, the graph node `call_model_with_source_context` orchestrates context retrieval. The `ContextBuilder` utility fetches the source's full text, extracted insights, and metadata:

```python

# From open_notebook/graphs/source_chat.py (lines 66-73)

context_builder = ContextBuilder(
    source_id=state["source_id"],
    include_insights=True,
    include_metadata=True
)
context = await context_builder.build()

```

Notably, the context building runs in a **new asyncio event loop** to isolate database queries from the synchronous LangGraph execution. This prevents deadlock scenarios where the graph thread blocks waiting for async I/O. The returned dictionary contains:

- `sources`: The raw source content and identifiers
- `insights`: Extracted knowledge chunks from preprocessing
- `metadata`: Source attributes like title, author, and ingestion date
- `token_usage`: Running count for context window management

## Preparing the Prompt Template

Raw context data transforms into LLM-ready prompts through `_format_source_context` (lines 90-140 in [`source_chat.py`](https://github.com/lfnovo/open-notebook/blob/main/source_chat.py)). The formatter constructs a structured string containing source excerpts and insight summaries.

The `Prompter` class then renders the Jinja2 template `source_chat/system`:

```python

# Template variables passed to Prompter

{
    "source": source_model,
    "insights": [insight_models],
    "context": formatted_source_text,
    "context_indicators": {
        "sources": [...],
        "insights": [...],
        "notes": []
    }
}

```

This template approach allows customization of system prompts without code changes. Administrators can adjust how source content appears to the model—whether as direct quotes, summaries, or structured JSON—by modifying the template configuration.

## Model Provisioning and Invocation

The workflow provisions models through `provision_langchain_model` in [`open_notebook/ai/provision.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/ai/provision.py). This function selects the appropriate **Esperanto** provider (OpenAI, Anthropic, or local models) based on:

1. Session-specific model overrides
2. User preference settings
3. Default system configuration

```python

# From source_chat.py (lines 33-45)

model = provision_langchain_model(
    model_id=session.config.get("model"),
    temperature=0.7
)

```

Similar to context building, model invocation occurs in a **dedicated event loop** to avoid blocking the graph's main thread. The provisioned `ChatModel` receives the combined system prompt and conversation history, returning an `AIMessage` containing the assistant's response.

## Response Processing and Streaming

Before delivery, responses undergo cleaning:

- `extract_text_content` strips Markdown and formatting artifacts
- `clean_thinking_content` removes internal reasoning markers (e.g., `<thinking>` tags)
- Token usage metrics update the session's running counters

The API layer streams cleaned chunks to the client via Server-Sent Events, providing real-time feedback while maintaining the graph's transactional integrity.

## Summary

The source chat workflow in Open-Notebook achieves context-aware conversations through these architectural decisions:

- **Session isolation** via `refers_to` relationships and LangGraph thread IDs
- **Async context building** that prevents I/O blocking in the graph layer
- **Template-driven prompting** allowing customizable source presentation
- **Provider-agnostic model provisioning** supporting multiple LLM backends
- **Dedicated event loops** for database and model operations to eliminate deadlocks

This design scales from single-document Q&A to complex multi-source research workflows while maintaining deterministic state management.

## Frequently Asked Questions

### How does the source chat workflow maintain conversation history?

Each chat session receives a unique thread ID derived from the database `ChatSession` record. LangGraph's checkpointing system (as implemented in [`open_notebook/graphs/source_chat.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/graphs/source_chat.py)) persists the state graph between messages, including the message buffer and context indicators. This allows the conversation to reference previous exchanges while remaining bound to the original source context.

### What source types support the chat workflow?

The workflow accepts any source ingested into Open-Notebook's knowledge base. The `ContextBuilder` in [`open_notebook/utils/context_builder.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/utils/context_builder.py) handles PDFs, web pages, YouTube transcripts, and plain text uniformly. The system extracts raw text and insights during ingestion, then retrieves these structures during chat context assembly. Video sources utilize transcript data rather than visual analysis.

### Can I use different AI models for different source chats?

Yes. The `provision_langchain_model` function checks for a `model` key in the session configuration before falling back to system defaults. Users can override the model per-session through the API, enabling lightweight models for simple queries and larger models for complex analysis. All supported providers (OpenAI, Anthropic, local LLMs) work through the unified Esperanto interface.

### Why does the workflow use separate event loops for database calls?

The synchronous LangGraph execution engine risks deadlock when awaiting async database operations. By spawning new event loops for context building (`context_builder.build()`) and model invocation, the workflow isolates blocking I/O from the graph's main thread. This pattern, visible in lines 66-73 and 33-45 of [`source_chat.py`](https://github.com/lfnovo/open-notebook/blob/main/source_chat.py), ensures responsive state management regardless of database latency or model response times.