How the Source Chat Workflow Enables Context-Aware Conversations in Open-Notebook
The source chat workflow in Open-Notebook creates isolated conversational sessions bound to specific sources through a LangGraph-powered state machine, using async context builders to inject source content and insights directly into LLM prompts.
Open-Notebook's source-chat workflow transforms static documents into interactive knowledge bases. By linking chat sessions directly to individual sources—whether PDFs, web pages, or videos—the system grounds AI responses in verified content rather than hallucinated knowledge. This architecture, implemented across the lfnovo/open-notebook repository, combines graph-based state management with async context retrieval to deliver precise, source-grounded conversations.
Architecture Overview
The workflow operates through three tightly coupled layers that isolate concerns while maintaining state consistency:
| Layer | Responsibility | Key Files |
|---|---|---|
| API Layer | HTTP endpoints for session management and message streaming | api/routers/source_chat.py |
| Graph Layer | Conversational state, context assembly, and LLM invocation | open_notebook/graphs/source_chat.py |
| Utility Layer | Source retrieval, formatting, and model provisioning | open_notebook/utils/context_builder.py, open_notebook/ai/provision.py |
This separation ensures that business logic remains testable while the graph layer handles complex state transitions.
Creating Isolated Chat Sessions
Every source-specific conversation begins with a POST request to /sources/{source_id}/chat/sessions. The endpoint creates a ChatSession record and establishes a refers_to relationship to the target Source in the database:
# From api/routers/source_chat.py (lines 94-114)
# Creates session with refers_to edge linking to source
session = ChatSession.create(
source_id=source_id,
user_id=current_user.id
)
session.add_relationship("refers_to", source)
The session ID serves as the thread ID for LangGraph's checkpointing system. This design guarantees that conversation history, context windows, and model state remain isolated per source-chat session. Even if multiple users chat with the same source simultaneously, their threads never collide.
Building Source-Specific Context
When a user sends a message, the graph node call_model_with_source_context orchestrates context retrieval. The ContextBuilder utility fetches the source's full text, extracted insights, and metadata:
# From open_notebook/graphs/source_chat.py (lines 66-73)
context_builder = ContextBuilder(
source_id=state["source_id"],
include_insights=True,
include_metadata=True
)
context = await context_builder.build()
Notably, the context building runs in a new asyncio event loop to isolate database queries from the synchronous LangGraph execution. This prevents deadlock scenarios where the graph thread blocks waiting for async I/O. The returned dictionary contains:
sources: The raw source content and identifiersinsights: Extracted knowledge chunks from preprocessingmetadata: Source attributes like title, author, and ingestion datetoken_usage: Running count for context window management
Preparing the Prompt Template
Raw context data transforms into LLM-ready prompts through _format_source_context (lines 90-140 in source_chat.py). The formatter constructs a structured string containing source excerpts and insight summaries.
The Prompter class then renders the Jinja2 template source_chat/system:
# Template variables passed to Prompter
{
"source": source_model,
"insights": [insight_models],
"context": formatted_source_text,
"context_indicators": {
"sources": [...],
"insights": [...],
"notes": []
}
}
This template approach allows customization of system prompts without code changes. Administrators can adjust how source content appears to the model—whether as direct quotes, summaries, or structured JSON—by modifying the template configuration.
Model Provisioning and Invocation
The workflow provisions models through provision_langchain_model in open_notebook/ai/provision.py. This function selects the appropriate Esperanto provider (OpenAI, Anthropic, or local models) based on:
- Session-specific model overrides
- User preference settings
- Default system configuration
# From source_chat.py (lines 33-45)
model = provision_langchain_model(
model_id=session.config.get("model"),
temperature=0.7
)
Similar to context building, model invocation occurs in a dedicated event loop to avoid blocking the graph's main thread. The provisioned ChatModel receives the combined system prompt and conversation history, returning an AIMessage containing the assistant's response.
Response Processing and Streaming
Before delivery, responses undergo cleaning:
extract_text_contentstrips Markdown and formatting artifactsclean_thinking_contentremoves internal reasoning markers (e.g.,<thinking>tags)- Token usage metrics update the session's running counters
The API layer streams cleaned chunks to the client via Server-Sent Events, providing real-time feedback while maintaining the graph's transactional integrity.
Summary
The source chat workflow in Open-Notebook achieves context-aware conversations through these architectural decisions:
- Session isolation via
refers_torelationships and LangGraph thread IDs - Async context building that prevents I/O blocking in the graph layer
- Template-driven prompting allowing customizable source presentation
- Provider-agnostic model provisioning supporting multiple LLM backends
- Dedicated event loops for database and model operations to eliminate deadlocks
This design scales from single-document Q&A to complex multi-source research workflows while maintaining deterministic state management.
Frequently Asked Questions
How does the source chat workflow maintain conversation history?
Each chat session receives a unique thread ID derived from the database ChatSession record. LangGraph's checkpointing system (as implemented in open_notebook/graphs/source_chat.py) persists the state graph between messages, including the message buffer and context indicators. This allows the conversation to reference previous exchanges while remaining bound to the original source context.
What source types support the chat workflow?
The workflow accepts any source ingested into Open-Notebook's knowledge base. The ContextBuilder in open_notebook/utils/context_builder.py handles PDFs, web pages, YouTube transcripts, and plain text uniformly. The system extracts raw text and insights during ingestion, then retrieves these structures during chat context assembly. Video sources utilize transcript data rather than visual analysis.
Can I use different AI models for different source chats?
Yes. The provision_langchain_model function checks for a model key in the session configuration before falling back to system defaults. Users can override the model per-session through the API, enabling lightweight models for simple queries and larger models for complex analysis. All supported providers (OpenAI, Anthropic, local LLMs) work through the unified Esperanto interface.
Why does the workflow use separate event loops for database calls?
The synchronous LangGraph execution engine risks deadlock when awaiting async database operations. By spawning new event loops for context building (context_builder.build()) and model invocation, the workflow isolates blocking I/O from the graph's main thread. This pattern, visible in lines 66-73 and 33-45 of source_chat.py, ensures responsive state management regardless of database latency or model response times.
Have a question about this repo?
These articles cover the highlights, but your codebase questions are specific. Give your agent direct access to the source. Share this with your agent to get started:
curl -s "https://instagit.com/install.md" Maintain an open-source project? Get it listed too →