How the Insights Service Generates and Retrieves Source Insights in Open-Notebook

Open-Notebook's insights service transforms raw source material into AI-generated summaries by asynchronously processing content through a FastAPI backend, storing results in SurrealDB, and exposing them via REST endpoints consumed by React hooks.

The insights service in open-notebook automates the extraction of key takeaways from uploaded sources like PDFs, web pages, and audio files. It combines a multi-provider LLM wrapper (Esperanto) with a SurrealDB backend to generate concise summaries on demand, then retrieves them through a type-safe API layer. Understanding this flow reveals how the application maintains responsiveness while processing large documents through external AI providers.

Architecture Overview

The service follows a three-tier asynchronous architecture:

  1. Frontend Layer: React hooks in frontend/src/lib/hooks/use-insights.ts orchestrate user requests and state management
  2. API Layer: FastAPI routers in api/routers/insights.py handle HTTP validation and routing
  3. Service Layer: api/insights_service.py contains the core business logic for LLM interaction and database persistence

All database operations use SurrealDB for document storage, while the Esperanto abstraction selects the optimal language model based on source characteristics and configured credentials.

The Generation Flow

When a user opens a source document, the system triggers a multi-step generation process:

Frontend Request Initiation

The useInsights hook calls fetchInsights from frontend/src/lib/api/insights.ts, which constructs a POST request to /api/insights with the source ID in the JSON body.

// frontend/src/lib/hooks/use-insights.ts
import { useMutation, useQuery } from '@tanstack/react-query';
import { fetchInsights } from '@/lib/api/insights';

export function useInsights(sourceId: string) {
  const generateInsight = useMutation({
    mutationFn: () => fetchInsights({ sourceId }),
    // Updates Zustand store on success
  });
  
  return { generateInsight };
}
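For readers testing the endpoint outside the browser, the same request can be sketched in Python. This is only an illustration: the base URL is a placeholder, and the body field name source_id is an assumption that mirrors payload.source_id in the router. The request is built but not sent.

```python
import json
from urllib.request import Request

# Hypothetical base URL; adjust to wherever the API is actually served.
BASE_URL = "http://localhost:8000"


def build_insight_request(source_id: str) -> Request:
    """Build (but do not send) a POST request to /api/insights."""
    # The field name "source_id" is an assumption mirroring payload.source_id.
    body = json.dumps({"source_id": source_id}).encode("utf-8")
    return Request(
        url=f"{BASE_URL}/api/insights",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_insight_request("source:abc123")
print(request.get_method(), request.full_url)
```

Passing the returned object to urllib.request.urlopen would send it to a running server.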

API Router Validation

The insights.py router validates the incoming payload using Pydantic schemas before delegating to the service layer:


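# api/routers/insights.py

from fastapi import APIRouter
from pydantic import BaseModel

# insights_service.py sits in api/, one level above routers/.
from ..insights_service import InsightsService

router = APIRouter()


class InsightRequest(BaseModel):
    """Request body for insight generation."""
    source_id: str


@router.post("/insights")
async def create_insight(payload: InsightRequest):
    """Trigger insight generation for a specific source."""
    # As written, this awaits generation before the response is sent.
    await InsightsService.generate_insight(payload.source_id)
    return {"status": "queued"}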

LLM Processing and Storage

The InsightsService.generate_insight() method in api/insights_service.py performs three critical operations:

  • Retrieval: Fetches the source document from SurrealDB using await db.select(Source, source_id)
  • Generation: Extracts full_text and calls esperanto.chat() with a summarization prompt
  • Persistence: Creates an Insight record linking the generated text to the source ID

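# api/insights_service.py

from datetime import datetime, timezone

# db, Source, Insight, and esperanto are assumed to come from the project's
# own modules; their import paths are not shown in this article.


class InsightsService:
    @staticmethod
    async def generate_insight(source_id: str) -> None:
        """Generate AI summary and persist to database."""
        source = await db.select(Source, source_id)
        text = source.full_text

        prompt = f"Summarise the key takeaways from the following text:\n\n{text}"
        llm_response = await esperanto.chat(prompt, model="best")

        await db.create(Insight(
            source_id=source_id,
            text=llm_response,
            # Timezone-aware UTC timestamp (datetime.utcnow() is deprecated).
            created_at=datetime.now(timezone.utc),
        ))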

The Retrieval Flow

Subsequent requests for existing insights follow a read-optimized path:

  1. The frontend calls GET /insights?source_id=... through the use-insights hook
  2. api/routers/insights.py routes to InsightsService.get_insights(source_id)
  3. The service queries the Insight table using await db.query(Insight).filter(source_id=source_id).all()
  4. Results return as JSON and populate the React component state

# api/routers/insights.py

@router.get("/insights")
async def list_insights(source_id: str):
    """Retrieve all insights for a given source."""
    return await InsightsService.get_insights(source_id)
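The service-side get_insights() method is referenced but not shown in this article. A minimal Python sketch of the filtering step follows, with an in-memory list standing in for the SurrealDB table and a simplified Insight shape. Both are assumptions made for illustration, not the project's actual code.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Insight:
    """Stand-in for the project's Insight model (fields assumed)."""
    source_id: str
    text: str


# In-memory list standing in for the SurrealDB Insight table.
_INSIGHTS = [
    Insight("source:1", "First summary"),
    Insight("source:2", "Unrelated summary"),
    Insight("source:1", "Second summary"),
]


class InsightsService:
    @staticmethod
    async def get_insights(source_id: str) -> list[dict]:
        """Return stored insights for one source without calling the LLM."""
        matches = [i for i in _INSIGHTS if i.source_id == source_id]
        # Plain dicts serialise directly to JSON in the router response.
        return [{"source_id": i.source_id, "text": i.text} for i in matches]


result = asyncio.run(InsightsService.get_insights("source:1"))
print(result)
```

The read path never touches the LLM wrapper, which is what keeps repeat requests cheap.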

Key Implementation Details

Provider Agnosticism: The Esperanto wrapper automatically selects between OpenAI GPT-4o, Anthropic Claude, or other configured providers based on token limits and source size, eliminating vendor lock-in.
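To make the idea of size-based selection concrete, here is a hypothetical sketch. It is not Esperanto's actual logic: the provider names, context limits, and the four-characters-per-token estimate are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    name: str            # illustrative label, not a real model ID
    context_tokens: int  # made-up limit for the example
    configured: bool     # whether credentials are present


def estimate_tokens(text: str) -> int:
    """Very rough heuristic: about four characters per token."""
    return max(1, len(text) // 4)


def pick_provider(text: str, providers: list[Provider]) -> Provider:
    """Pick the smallest configured provider whose context fits the text."""
    needed = estimate_tokens(text)
    candidates = [
        p for p in providers if p.configured and p.context_tokens >= needed
    ]
    if not candidates:
        raise ValueError(f"no configured provider can fit ~{needed} tokens")
    return min(candidates, key=lambda p: p.context_tokens)


PROVIDERS = [
    Provider("small-model", 8_000, True),
    Provider("large-model", 128_000, True),
    Provider("unconfigured-model", 1_000_000, False),
]

short_choice = pick_provider("x" * 4_000, PROVIDERS)    # ~1,000 tokens
long_choice = pick_provider("x" * 200_000, PROVIDERS)   # ~50,000 tokens
print(short_choice.name, long_choice.name)
```

Choosing the smallest model that fits is one possible policy; a real deployment might instead weigh cost, latency, or output quality.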

Asynchronous Processing: All service methods use async def with await for database and LLM calls, so the server can keep handling other requests, and the UI stays responsive, during long-running generation tasks.
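Note that the router shown earlier awaits generation before it responds. One way to return before generation finishes is to schedule the work as a background task. The sketch below uses plain asyncio with a stand-in for the slow LLM call; the function names and the shared results dictionary are illustrative assumptions.

```python
import asyncio

results: dict[str, str] = {}


async def fake_generate(source_id: str) -> None:
    """Stand-in for the slow LLM call plus database write."""
    await asyncio.sleep(0.05)
    results[source_id] = f"summary for {source_id}"


async def create_insight(source_id: str, background: set) -> dict:
    """Schedule generation and respond before it finishes."""
    task = asyncio.create_task(fake_generate(source_id))
    # Keep a reference so the task is not garbage-collected mid-flight.
    background.add(task)
    task.add_done_callback(background.discard)
    return {"status": "queued"}


async def main() -> tuple[dict, bool, bool]:
    background: set = set()
    response = await create_insight("source:1", background)
    done_at_response = "source:1" in results   # generation has not run yet
    await asyncio.gather(*background)          # wait only for the demo
    done_after_wait = "source:1" in results
    return response, done_at_response, done_after_wait


response, done_at_response, done_after_wait = asyncio.run(main())
print(response, done_at_response, done_after_wait)
```

Inside FastAPI, the built-in BackgroundTasks dependency serves a similar purpose without manual task bookkeeping.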

Document Relationships: Insights maintain foreign key relationships to their parent Source records in SurrealDB, enabling efficient filtered queries without text search overhead.

Type Safety: The TypeScript frontend uses generated types matching the Pydantic schemas in api/routers/insights.py, ensuring API contract consistency across the stack.

Summary

  • The insights service uses a three-layer architecture: React hooks → FastAPI routers → Service layer with SurrealDB persistence
  • Generation occurs in api/insights_service.py through generate_insight(), which retrieves source text, calls esperanto.chat(), and stores the result
  • Retrieval happens via GET /insights handled by get_insights() in the same service file, returning filtered database records
  • All operations are asynchronous, keeping the UI responsive while processing large documents through external LLM providers
  • File locations: Frontend logic in frontend/src/lib/hooks/use-insights.ts and frontend/src/lib/api/insights.ts; backend logic in api/routers/insights.py and api/insights_service.py

Frequently Asked Questions

How does the insights service handle different file types like PDFs versus audio?

The service operates on extracted text rather than raw files. When a source is uploaded, separate extraction services (content-core) process PDFs, audio, or web pages into plain text stored in the Source.full_text field. The insights service in api/insights_service.py reads this standardized text representation, making it agnostic to the original file format.

What happens if the LLM provider is rate limited or fails?

The esperanto.chat() call in api/insights_service.py includes error handling and retry logic at the wrapper level. If the primary provider fails, Esperanto can fail over to backup credentials configured in the project settings. The async design ensures the API returns immediately while the generation happens in the background, with the frontend polling or using webhooks to detect completion.
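The general retry-then-fail-over pattern can be sketched as follows. This is a generic illustration, not Esperanto's implementation: chat_with_failover, ProviderError, and the two stand-in providers are hypothetical names.

```python
import asyncio


class ProviderError(Exception):
    """Raised by the stand-in providers below to simulate a failed call."""


async def chat_with_failover(prompt, providers, retries=2, base_delay=0.01):
    """Try each provider in order, retrying with exponential backoff."""
    last_error = None
    for call in providers:
        for attempt in range(retries):
            try:
                return await call(prompt)
            except ProviderError as exc:
                last_error = exc
                # Back off before the next attempt: 0.01s, then 0.02s.
                await asyncio.sleep(base_delay * 2 ** attempt)
    raise RuntimeError("all providers failed") from last_error


calls = {"primary": 0, "backup": 0}


async def primary(prompt: str) -> str:
    """Simulated provider that is always rate limited."""
    calls["primary"] += 1
    raise ProviderError("rate limited")


async def backup(prompt: str) -> str:
    """Simulated provider that succeeds."""
    calls["backup"] += 1
    return f"summary of: {prompt}"


answer = asyncio.run(chat_with_failover("some text", [primary, backup]))
print(answer, calls)
```

Here the primary is tried twice before the backup answers; if every provider fails, the caller gets a single RuntimeError that chains the last underlying error.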

Can I customize the prompt used for generating insights?

Yes, the prompt template in InsightsService.generate_insight() can be modified in api/insights_service.py. The current implementation uses a hardcoded summarization prompt, but you can extend the Insight model to include custom prompts per source or project, passing them as parameters to esperanto.chat() instead of the default text.
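One possible shape for that extension is a small helper that accepts an optional template and falls back to the default summarization prompt. The helper name build_prompt and the {text} placeholder convention are assumptions for this sketch.

```python
from typing import Optional

DEFAULT_TEMPLATE = "Summarise the key takeaways from the following text:\n\n{text}"


def build_prompt(text: str, template: Optional[str] = None) -> str:
    """Fill a prompt template, falling back to the default summary prompt."""
    chosen = template or DEFAULT_TEMPLATE
    if "{text}" not in chosen:
        raise ValueError("template must contain a {text} placeholder")
    # str.replace leaves any other braces in the template untouched,
    # unlike str.format, which would try to interpret them.
    return chosen.replace("{text}", text)


default_prompt = build_prompt("Quarterly report body")
custom_prompt = build_prompt(
    "Quarterly report body",
    'Return JSON like {"points": []} for:\n{text}',
)
print(custom_prompt)
```

The resulting string would then be passed to esperanto.chat() in place of the hardcoded prompt.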

Where are the generated insights stored and how long do they persist?

Insights persist as documents in SurrealDB with a schema linking them to their parent source via source_id. They remain in the database indefinitely unless explicitly deleted, allowing users to retrieve historical summaries instantly without re-invoking costly LLM calls. The get_insights() method queries these stored documents rather than regenerating them.
