How the Insights Service Generates and Retrieves Source Insights in Open-Notebook
Open-Notebook's insights service transforms raw source material into AI-generated summaries by asynchronously processing content through a FastAPI backend, storing results in SurrealDB, and exposing them via REST endpoints consumed by React hooks.
The insights service in Open-Notebook automates the extraction of key takeaways from uploaded sources such as PDFs, web pages, and audio files. It combines a multi-provider LLM wrapper (Esperanto) with a SurrealDB backend to generate concise summaries on demand, then retrieves them through a type-safe API layer. Following this flow shows how the application stays responsive while processing large documents through external AI providers.
Architecture Overview
The service follows a three-tier asynchronous architecture:
- Frontend Layer: React hooks in frontend/src/lib/hooks/use-insights.ts orchestrate user requests and state management
- API Layer: FastAPI routers in api/routers/insights.py handle HTTP validation and routing
- Service Layer: api/insights_service.py contains the core business logic for LLM interaction and database persistence
All database operations use SurrealDB for document storage, while the Esperanto abstraction selects the optimal language model based on source characteristics and configured credentials.
The Generation Flow
When a user opens a source document, the system triggers a multi-step generation process:
Frontend Request Initiation
The useInsights hook calls fetchInsights from frontend/src/lib/api/insights.ts, which constructs a POST request to /api/insights with the source ID in the JSON body.
// frontend/src/lib/hooks/use-insights.ts
import { useMutation } from '@tanstack/react-query';
import { fetchInsights } from '@/lib/api/insights';

export function useInsights(sourceId: string) {
  // Mutation that asks the backend to generate an insight for this source
  const generateInsight = useMutation({
    mutationFn: () => fetchInsights({ sourceId }),
    // Updates Zustand store on success
  });
  return { generateInsight };
}
API Router Validation
The insights.py router validates the incoming payload using Pydantic schemas before delegating to the service layer:
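# api/routers/insights.py
from fastapi import APIRouter
from pydantic import BaseModel
# The service lives one package up (api/insights_service.py)
from ..insights_service import InsightsService

router = APIRouter()

class InsightRequest(BaseModel):
    # Assumed shape of the request schema; its definition is not shown here
    source_id: str

@router.post("/insights")
async def create_insight(payload: InsightRequest):
    """Trigger insight generation for a specific source."""
    await InsightsService.generate_insight(payload.source_id)
    return {"status": "queued"}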
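The handler above awaits generate_insight() before returning, so the "queued" response is only sent once generation has finished. If the goal is to respond immediately, one option is to schedule the work as a background task. The following is a minimal sketch using plain asyncio rather than FastAPI so that it runs standalone; generate_insight here is a hypothetical stand-in for the service method, and the background_tasks set is an assumption of this sketch, not something taken from the repository.

```python
import asyncio

# Records which sources have been processed (for demonstration only).
completed: list[str] = []

async def generate_insight(source_id: str) -> None:
    """Hypothetical stand-in for InsightsService.generate_insight."""
    await asyncio.sleep(0.01)  # simulates a slow LLM round trip
    completed.append(source_id)

# Keep references so pending tasks are not garbage-collected early.
background_tasks: set[asyncio.Task] = set()

async def create_insight(source_id: str) -> dict:
    """Schedule generation and return without waiting for it."""
    task = asyncio.create_task(generate_insight(source_id))
    background_tasks.add(task)
    task.add_done_callback(background_tasks.discard)
    return {"status": "queued"}

async def demo() -> tuple[dict, list[str], list[str]]:
    response = await create_insight("source:demo")
    before = list(completed)  # snapshot taken before generation finishes
    await asyncio.gather(*background_tasks)
    return response, before, list(completed)
```

In a FastAPI application the same effect is usually achieved with the framework's background task support or an external job queue; which of these the project uses is not stated here.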
LLM Processing and Storage
The InsightsService.generate_insight() method in api/insights_service.py performs three critical operations:
- Retrieval: Fetches the source document from SurrealDB using await db.select(Source, source_id)
- Generation: Extracts full_text and calls esperanto.chat() with a summarization prompt
- Persistence: Creates an Insight record linking the generated text to the source ID
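# api/insights_service.py
from datetime import datetime, timezone

# db, esperanto, Source, and Insight come from project modules not shown here

class InsightsService:
    @staticmethod
    async def generate_insight(source_id: str) -> None:
        """Generate AI summary and persist to database."""
        # 1. Retrieval: load the source record
        source = await db.select(Source, source_id)
        text = source.full_text
        # 2. Generation: ask the LLM for a summary
        prompt = f"Summarise the key takeaways from the following text:\n\n{text}"
        llm_response = await esperanto.chat(prompt, model="best")
        # 3. Persistence: store the result linked to the source
        await db.create(Insight(
            source_id=source_id,
            text=llm_response,
            created_at=datetime.now(timezone.utc),  # timezone-aware UTC timestamp
        ))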
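To make the three steps easier to experiment with, here is a self-contained sketch that replaces the database and the LLM with in-memory fakes. FakeDB, fake_chat, and the method names on them are hypothetical and exist only for this illustration; they are not the SurrealDB client or the Esperanto API.

```python
import asyncio
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Source:
    id: str
    full_text: str

@dataclass
class Insight:
    source_id: str
    text: str
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

class FakeDB:
    """In-memory stand-in for the database client (hypothetical API)."""

    def __init__(self) -> None:
        self.sources: dict[str, Source] = {}
        self.insights: list[Insight] = []

    async def select_source(self, source_id: str) -> Source:
        return self.sources[source_id]

    async def create_insight(self, insight: Insight) -> None:
        self.insights.append(insight)

async def fake_chat(prompt: str) -> str:
    """Stand-in for the LLM call; returns a placeholder, not a real summary."""
    return f"summary of {len(prompt)} prompt characters"

async def generate_insight(db: FakeDB, source_id: str) -> Insight:
    source = await db.select_source(source_id)        # 1. retrieval
    prompt = (
        "Summarise the key takeaways from the following text:\n\n"
        + source.full_text
    )
    text = await fake_chat(prompt)                    # 2. generation
    insight = Insight(source_id=source_id, text=text)
    await db.create_insight(insight)                  # 3. persistence
    return insight
```

Passing the database in as a parameter, rather than relying on a module-level client, is a choice made for this sketch so the flow can be exercised without any external service.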
The Retrieval Flow
Subsequent requests for existing insights follow a read-optimized path:
- The frontend calls GET /insights?sourceId=... through the use-insights hook
- api/routers/insights.py routes to InsightsService.get_insights(source_id)
- The service queries the Insight table using await db.query(Insight).filter(source_id=source_id).all()
- Results return as JSON and populate the React component state
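# api/routers/insights.py
from fastapi import Query

@router.get("/insights")
# alias maps the camelCase ?sourceId=... query parameter onto source_id
async def list_insights(source_id: str = Query(..., alias="sourceId")):
    """Retrieve all insights for a given source."""
    return await InsightsService.get_insights(source_id)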
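The read path amounts to a filter on source_id with no LLM call involved. A minimal sketch of that idea, using a hypothetical in-memory list in place of the Insight table (the INSIGHTS data and this get_insights signature are assumptions made for illustration):

```python
import asyncio
from dataclasses import dataclass, asdict

@dataclass
class Insight:
    source_id: str
    text: str

# Hypothetical in-memory table standing in for stored Insight records.
INSIGHTS = [
    Insight("source:a", "First summary"),
    Insight("source:b", "Unrelated summary"),
    Insight("source:a", "Second summary"),
]

async def get_insights(source_id: str) -> list[dict]:
    """Return stored insights for one source as JSON-ready dictionaries."""
    matches = [i for i in INSIGHTS if i.source_id == source_id]
    return [asdict(i) for i in matches]
```

An unknown source ID yields an empty list rather than an error, which is one reasonable convention for a list endpoint; whether the real service behaves this way is not stated here.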
Key Implementation Details
Provider Agnosticism: The Esperanto wrapper selects between OpenAI GPT-4o, Anthropic Claude, or other configured providers based on token limits and source size, which reduces vendor lock-in.
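Selection by token limit and source size can be pictured as follows. This is a sketch under stated assumptions, not Esperanto's actual logic: ModelOption, estimate_tokens, and pick_model are hypothetical names, the four-characters-per-token estimate is a rough heuristic, and the context sizes in the usage example are made up.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelOption:
    name: str            # illustrative label, not a real model identifier
    context_tokens: int  # assumed context window size

def estimate_tokens(text: str) -> int:
    """Rough heuristic: about four characters per token for English text."""
    return max(1, len(text) // 4)

def pick_model(text: str, options: list[ModelOption]) -> ModelOption:
    """Choose the smallest configured model whose window fits the text."""
    needed = estimate_tokens(text)
    for option in sorted(options, key=lambda o: o.context_tokens):
        if option.context_tokens >= needed:
            return option
    raise ValueError(f"no configured model fits ~{needed} tokens")

# Usage with made-up sizes: a short text fits the smaller option.
options = [ModelOption("large", 1000), ModelOption("small", 100)]
chosen = pick_model("x" * 40, options)
```

Preferring the smallest model that fits is only one possible policy; cost, latency, or quality could equally drive the choice.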
Asynchronous Processing: All service methods use async def with await for database and LLM calls, preventing UI blocking during long-running generation tasks.
Document Relationships: Insights reference their parent Source records in SurrealDB through source_id, enabling efficient filtered queries without text search overhead.
Type Safety: The TypeScript frontend uses generated types matching the Pydantic schemas in api/routers/insights.py, ensuring API contract consistency across the stack.
Summary
- The insights service uses a three-layer architecture: React hooks → FastAPI routers → Service layer with SurrealDB persistence
- Generation occurs in api/insights_service.py through generate_insight(), which retrieves source text, calls esperanto.chat(), and stores the result
- Retrieval happens via GET /insights handled by get_insights() in the same service file, returning filtered database records
- All operations are asynchronous, keeping the UI responsive while processing large documents through external LLM providers
- File locations: Frontend logic in frontend/src/lib/hooks/use-insights.ts and frontend/src/lib/api/insights.ts; backend logic in api/routers/insights.py and api/insights_service.py
Frequently Asked Questions
How does the insights service handle different file types like PDFs versus audio?
The service operates on extracted text rather than raw files. When a source is uploaded, separate extraction services (content-core) process PDFs, audio, or web pages into plain text stored in the Source.full_text field. The insights service in api/insights_service.py reads this standardized text representation, making it agnostic to the original file format.
What happens if the LLM provider is rate limited or fails?
The esperanto.chat() call in api/insights_service.py includes error handling and retry logic at the wrapper level. If the primary provider fails, Esperanto can fail over to backup credentials configured in the project settings. The async design lets the API return immediately while generation happens in the background, with the frontend polling or using webhooks to detect completion.
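The retry-then-failover pattern described in this answer can be sketched generically. The code below is an illustration, not Esperanto's implementation: chat_with_failover, flaky_primary, and backup are hypothetical names, and the retry count and delays are arbitrary.

```python
import asyncio
from typing import Awaitable, Callable, Optional, Sequence

ChatFn = Callable[[str], Awaitable[str]]

async def chat_with_failover(
    prompt: str,
    providers: Sequence[ChatFn],
    retries: int = 2,
    base_delay: float = 0.01,
) -> str:
    """Try each provider in order, retrying with exponential backoff."""
    last_error: Optional[Exception] = None
    for provider in providers:
        for attempt in range(retries):
            try:
                return await provider(prompt)
            except Exception as exc:  # a real client would catch narrower errors
                last_error = exc
                await asyncio.sleep(base_delay * 2 ** attempt)
    raise RuntimeError("all providers failed") from last_error

# Hypothetical providers used to exercise the helper.
calls: list[str] = []

async def flaky_primary(prompt: str) -> str:
    calls.append("primary")
    raise ConnectionError("rate limited")

async def backup(prompt: str) -> str:
    calls.append("backup")
    return "summary from backup"
```

Calling asyncio.run(chat_with_failover("text", [flaky_primary, backup])) tries the primary twice, then falls through to the backup.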
Can I customize the prompt used for generating insights?
Yes, the prompt template in InsightsService.generate_insight() can be modified in api/insights_service.py. The current implementation uses a hardcoded summarization prompt, but you can extend the Insight model to include custom prompts per source or project, passing them as parameters to esperanto.chat() instead of the default text.
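One way to make the prompt configurable is to pass an optional template alongside the source text. This is a minimal sketch; build_prompt, DEFAULT_TEMPLATE, and the {text} placeholder convention are assumptions introduced for this example, not part of the existing code.

```python
from typing import Optional

DEFAULT_TEMPLATE = (
    "Summarise the key takeaways from the following text:\n\n{text}"
)

def build_prompt(text: str, template: Optional[str] = None) -> str:
    """Fill the source text into a custom template, or the default one."""
    chosen = template or DEFAULT_TEMPLATE
    if "{text}" not in chosen:
        raise ValueError("template must contain a {text} placeholder")
    # str.replace avoids str.format errors when the source text has braces
    return chosen.replace("{text}", text)
```

The resulting string would then be passed to the chat call in place of the hardcoded prompt.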
Where are the generated insights stored and how long do they persist?
Insights persist as documents in SurrealDB with a schema linking them to their parent source via source_id. They remain in the database indefinitely unless explicitly deleted, allowing users to retrieve historical summaries instantly without re-invoking costly LLM calls. The get_insights() method queries these stored documents rather than regenerating them.