# How the Insights Service Generates and Retrieves Source Insights in Open-Notebook

> Learn how Open-Notebook's insights service generates and retrieves source insights. Discover its FastAPI backend, SurrealDB storage, and REST API for AI-powered summaries.

- Repository: [Luis Novo/open-notebook](https://github.com/lfnovo/open-notebook)
- Tags: internals
- Published: 2026-06-10

---

**Open-Notebook's insights service transforms raw source material into AI-generated summaries by asynchronously processing content through a FastAPI backend, storing results in SurrealDB, and exposing them via REST endpoints consumed by React hooks.**

The **insights service** in [open-notebook](https://github.com/lfnovo/open-notebook) automates the extraction of key takeaways from uploaded sources like PDFs, web pages, and audio files. It combines a **multi-provider LLM wrapper** (Esperanto) with a SurrealDB backend to generate concise summaries on demand, then retrieves them through a type-safe API layer. Understanding this flow reveals how the application maintains responsiveness while processing large documents through external AI providers.

## Architecture Overview

The service follows a three-tier asynchronous architecture:

1. **Frontend Layer**: React hooks in [`frontend/src/lib/hooks/use-insights.ts`](https://github.com/lfnovo/open-notebook/blob/main/frontend/src/lib/hooks/use-insights.ts) orchestrate user requests and state management
2. **API Layer**: FastAPI routers in [`api/routers/insights.py`](https://github.com/lfnovo/open-notebook/blob/main/api/routers/insights.py) handle HTTP validation and routing
3. **Service Layer**: [`api/insights_service.py`](https://github.com/lfnovo/open-notebook/blob/main/api/insights_service.py) contains the core business logic for LLM interaction and database persistence

All database operations use **SurrealDB** for document storage, while the **Esperanto** abstraction selects the optimal language model based on source characteristics and configured credentials.

## The Generation Flow

When a user opens a source document, the system triggers a multi-step generation process:

### Frontend Request Initiation

The `useInsights` hook calls `fetchInsights` from [`frontend/src/lib/api/insights.ts`](https://github.com/lfnovo/open-notebook/blob/main/frontend/src/lib/api/insights.ts), which constructs a `POST` request to `/api/insights` with the source ID in the JSON body.

```typescript
// frontend/src/lib/hooks/use-insights.ts
import { useMutation } from '@tanstack/react-query';
import { fetchInsights } from '@/lib/api/insights';

export function useInsights(sourceId: string) {
  // The mutation wraps the POST request that triggers generation
  const generateInsight = useMutation({
    mutationFn: () => fetchInsights({ sourceId }),
    // Updates Zustand store on success
  });

  return { generateInsight };
}
```

### API Router Validation

The [`api/routers/insights.py`](https://github.com/lfnovo/open-notebook/blob/main/api/routers/insights.py) router validates the incoming payload using Pydantic schemas before delegating to the service layer:

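```python
# api/routers/insights.py
from fastapi import APIRouter
from pydantic import BaseModel

# Absolute import: the service module sits one level up, at api/insights_service.py
from api.insights_service import InsightsService

router = APIRouter()


class InsightRequest(BaseModel):
    """Request body schema (assumed shape: only the source ID is required)."""
    source_id: str


@router.post("/insights")
async def create_insight(payload: InsightRequest):
    """Trigger insight generation for a specific source."""
    # Awaiting here means the response is sent only after generation finishes;
    # schedule the call as a background task if the endpoint must return sooner.
    await InsightsService.generate_insight(payload.source_id)
    return {"status": "queued"}
```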
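
The handler returns `{"status": "queued"}`, which suggests the work is meant to continue after the response has been sent. The following is a minimal sketch of that pattern using only `asyncio`. The function names mirror the article's, but the bodies are stand-ins and `background_tasks` is a hypothetical helper; in a real FastAPI app, the framework's own `BackgroundTasks` dependency would be the more usual choice.

```python
import asyncio

completed: list[str] = []
background_tasks: set[asyncio.Task] = set()


async def generate_insight(source_id: str) -> None:
    """Stand-in for the real service call."""
    await asyncio.sleep(0.01)  # simulates the LLM round trip
    completed.append(source_id)


async def create_insight(source_id: str) -> dict:
    """Schedule generation and respond without waiting for it."""
    task = asyncio.create_task(generate_insight(source_id))
    background_tasks.add(task)  # keep a reference so the task is not garbage collected
    task.add_done_callback(background_tasks.discard)
    return {"status": "queued"}


async def main() -> tuple[dict, list[str]]:
    response = await create_insight("source:1")
    seen_at_response = list(completed)  # snapshot taken before generation finishes
    await asyncio.gather(*background_tasks)  # wait only so the demo exits cleanly
    return response, seen_at_response


response, seen_at_response = asyncio.run(main())
```

At the moment the response is produced, `completed` is still empty; the source ID appears there only once the background task has finished.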

### LLM Processing and Storage

The `InsightsService.generate_insight()` method in [`api/insights_service.py`](https://github.com/lfnovo/open-notebook/blob/main/api/insights_service.py) performs three critical operations:

- **Retrieval**: Fetches the source document from SurrealDB using `await db.select(Source, source_id)`
- **Generation**: Extracts `full_text` and calls `esperanto.chat()` with a summarization prompt
- **Persistence**: Creates an `Insight` record linking the generated text to the source ID

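```python
# api/insights_service.py
from datetime import datetime, timezone

# `db`, `esperanto`, `Source`, and `Insight` are assumed to be imported from
# elsewhere in the project; their import paths are not shown here.


class InsightsService:
    @staticmethod
    async def generate_insight(source_id: str) -> None:
        """Generate AI summary and persist to database."""
        # 1. Retrieval: load the source record
        source = await db.select(Source, source_id)
        text = source.full_text

        # 2. Generation: ask the LLM for a summary
        prompt = f"Summarise the key takeaways from the following text:\n\n{text}"
        llm_response = await esperanto.chat(prompt, model="best")

        # 3. Persistence: store the summary linked to its source
        await db.create(Insight(
            source_id=source_id,
            text=llm_response,
            created_at=datetime.now(timezone.utc),  # timezone-aware; utcnow() is deprecated
        ))
```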
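
To make the three steps easy to try without a database or an API key, here is a self-contained sketch that swaps both dependencies for in-memory stand-ins. `InMemoryDB`, `fake_chat`, and the dataclass fields are assumptions made for illustration and are not the project's actual classes.

```python
import asyncio
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Source:
    id: str
    full_text: str


@dataclass
class Insight:
    source_id: str
    text: str
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class InMemoryDB:
    """Hypothetical stand-in for the SurrealDB client."""

    def __init__(self) -> None:
        self.sources: dict[str, Source] = {}
        self.insights: list[Insight] = []

    async def select_source(self, source_id: str) -> Source:
        return self.sources[source_id]

    async def create_insight(self, insight: Insight) -> None:
        self.insights.append(insight)


async def fake_chat(prompt: str) -> str:
    """Hypothetical stand-in for the LLM call; reports the prompt length."""
    return f"summary of {len(prompt)} characters"


async def generate_insight(db: InMemoryDB, source_id: str) -> Insight:
    source = await db.select_source(source_id)  # 1. retrieval
    prompt = f"Summarise the key takeaways from the following text:\n\n{source.full_text}"
    summary = await fake_chat(prompt)  # 2. generation
    insight = Insight(source_id=source_id, text=summary)
    await db.create_insight(insight)  # 3. persistence
    return insight


db = InMemoryDB()
db.sources["source:1"] = Source(id="source:1", full_text="SurrealDB stores documents.")
result = asyncio.run(generate_insight(db, "source:1"))
```

Passing the database in as an argument, rather than reaching for a module-level client, is a choice made here so the flow can be exercised in isolation.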

## The Retrieval Flow

Subsequent requests for existing insights follow a read-optimized path:

1. The frontend calls `GET /insights?source_id=...` through the `useInsights` hook
2. [`api/routers/insights.py`](https://github.com/lfnovo/open-notebook/blob/main/api/routers/insights.py) routes to `InsightsService.get_insights(source_id)`
3. The service queries the `Insight` table using `await db.query(Insight).filter(source_id=source_id).all()`
4. Results return as JSON and populate the React component state

```python
# api/routers/insights.py

@router.get("/insights")
async def list_insights(source_id: str):
    """Retrieve all insights for a given source."""
    return await InsightsService.get_insights(source_id)
```
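
The body of `get_insights()` is not shown in this article. As a rough illustration of the filtering it performs, the sketch below uses a plain list in place of the `Insight` table; `INSIGHT_TABLE` and the record shape are hypothetical.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class Insight:
    source_id: str
    text: str


# Hypothetical in-memory table standing in for the stored `Insight` records.
INSIGHT_TABLE = [
    Insight("source:1", "First summary"),
    Insight("source:2", "Unrelated summary"),
    Insight("source:1", "Second summary"),
]


async def get_insights(source_id: str) -> list[dict]:
    """Return stored insights for one source as JSON-ready dictionaries."""
    return [
        {"source_id": row.source_id, "text": row.text}
        for row in INSIGHT_TABLE
        if row.source_id == source_id
    ]


rows = asyncio.run(get_insights("source:1"))
```

An unknown source ID yields an empty list rather than an error, which lets the frontend render an empty state without special handling.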

## Key Implementation Details

**Provider Agnosticism**: The **Esperanto** wrapper selects between OpenAI GPT-4o, Anthropic Claude, or other configured providers based on token limits and source size, which reduces vendor lock-in.
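
How Esperanto makes this choice internally is not covered here. The sketch below shows one plausible selection rule under stated assumptions: the provider names, context limits, and the four-characters-per-token estimate are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    name: str
    context_tokens: int  # illustrative limit, not a real model's figure
    configured: bool     # whether credentials are available


def estimate_tokens(text: str) -> int:
    """Crude heuristic: roughly four characters per token."""
    return max(1, len(text) // 4)


def pick_provider(text: str, providers: list[Provider]) -> Provider:
    """Pick the smallest configured provider whose context window fits the text."""
    needed = estimate_tokens(text)
    candidates = [p for p in providers if p.configured and p.context_tokens >= needed]
    if not candidates:
        raise ValueError(f"no configured provider can fit ~{needed} tokens")
    return min(candidates, key=lambda p: p.context_tokens)


PROVIDERS = [
    Provider("small-model", context_tokens=1_000, configured=True),
    Provider("large-model", context_tokens=100_000, configured=True),
    Provider("huge-model", context_tokens=1_000_000, configured=False),
]

short_choice = pick_provider("x" * 400, PROVIDERS)     # about 100 tokens
long_choice = pick_provider("x" * 40_000, PROVIDERS)   # about 10,000 tokens
```

Preferring the smallest window that fits is a cost-oriented choice; a quality-oriented rule would sort on a different key.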

**Asynchronous Processing**: All service methods use `async def` with `await` for database and LLM calls, so the server's event loop stays free to handle other requests during long-running generation tasks and the UI is not blocked while it waits.
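
A small experiment shows what this buys in practice. In the sketch below, `asyncio.sleep` stands in for the awaited database and LLM calls, and the function and document names are made up for the demonstration.

```python
import asyncio

events: list[str] = []


async def slow_generation(name: str, delay: float) -> None:
    events.append(f"start {name}")
    await asyncio.sleep(delay)  # stands in for the awaited DB and LLM calls
    events.append(f"end {name}")


async def main() -> None:
    # Both coroutines share one event loop; neither blocks the other.
    await asyncio.gather(
        slow_generation("long-document", 0.05),
        slow_generation("short-document", 0.01),
    )


asyncio.run(main())
```

The short document finishes first even though it was started second, because the event loop switches to it while the long one is waiting.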

**Document Relationships**: Each insight references its parent `Source` record in SurrealDB through its `source_id` field, enabling efficient filtered queries without text search overhead.

**Type Safety**: The TypeScript frontend uses generated types matching the Pydantic schemas in [`api/routers/insights.py`](https://github.com/lfnovo/open-notebook/blob/main/api/routers/insights.py), ensuring API contract consistency across the stack.

## Summary

- **The insights service** uses a three-layer architecture: React hooks → FastAPI routers → Service layer with SurrealDB persistence
- **Generation** occurs in [`api/insights_service.py`](https://github.com/lfnovo/open-notebook/blob/main/api/insights_service.py) through `generate_insight()`, which retrieves source text, calls `esperanto.chat()`, and stores the result
- **Retrieval** happens via `GET /insights` handled by `get_insights()` in the same service file, returning filtered database records
- **All operations are asynchronous**, keeping the UI responsive while processing large documents through external LLM providers
- **File locations**: Frontend logic in [`frontend/src/lib/hooks/use-insights.ts`](https://github.com/lfnovo/open-notebook/blob/main/frontend/src/lib/hooks/use-insights.ts) and [`frontend/src/lib/api/insights.ts`](https://github.com/lfnovo/open-notebook/blob/main/frontend/src/lib/api/insights.ts); backend logic in [`api/routers/insights.py`](https://github.com/lfnovo/open-notebook/blob/main/api/routers/insights.py) and [`api/insights_service.py`](https://github.com/lfnovo/open-notebook/blob/main/api/insights_service.py)

## Frequently Asked Questions

### How does the insights service handle different file types like PDFs versus audio?

The service operates on extracted text rather than raw files. When a source is uploaded, separate extraction services (content-core) process PDFs, audio, or web pages into plain text stored in the `Source.full_text` field. The insights service in [`api/insights_service.py`](https://github.com/lfnovo/open-notebook/blob/main/api/insights_service.py) reads this standardized text representation, making it agnostic to the original file format.

### What happens if the LLM provider is rate limited or fails?

The `esperanto.chat()` call in [`api/insights_service.py`](https://github.com/lfnovo/open-notebook/blob/main/api/insights_service.py) includes error handling and retry logic at the wrapper level. If the primary provider fails, Esperanto can fail over to backup credentials configured in the project settings. The async design lets the API return immediately while generation happens in the background, with the frontend polling or using webhooks to detect completion.
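
Esperanto's own retry code is not reproduced in this article. The sketch below illustrates the general retry-then-failover pattern in plain `asyncio`; `ProviderError`, `chat_with_failover`, and the two provider functions are hypothetical names, not part of Esperanto's API.

```python
import asyncio


class ProviderError(Exception):
    """Raised by a provider call that failed or was rate limited."""


async def chat_with_failover(prompt, providers, retries=2, base_delay=0.01):
    """Try each provider in order, retrying with exponential backoff."""
    last_error = None
    for provider in providers:
        for attempt in range(retries):
            try:
                return await provider(prompt)
            except ProviderError as exc:
                last_error = exc
                await asyncio.sleep(base_delay * 2 ** attempt)
    raise RuntimeError("all providers failed") from last_error


calls: list[str] = []


async def flaky_primary(prompt: str) -> str:
    calls.append("primary")
    raise ProviderError("rate limited")


async def backup(prompt: str) -> str:
    calls.append("backup")
    return f"summary: {prompt[:10]}"


answer = asyncio.run(
    chat_with_failover("Long text to summarise", [flaky_primary, backup])
)
```

Here the primary provider is tried twice before the backup is used. Only `ProviderError` triggers a retry, so programming errors still surface immediately.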

### Can I customize the prompt used for generating insights?

Yes, the prompt template in `InsightsService.generate_insight()` can be modified in [`api/insights_service.py`](https://github.com/lfnovo/open-notebook/blob/main/api/insights_service.py). The current implementation uses a hardcoded summarization prompt, but you can extend the `Insight` model to include custom prompts per source or project, passing them as parameters to `esperanto.chat()` instead of the default text.
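
One lightweight way to make the prompt configurable is to pass a template alongside the text. This is a sketch only: `DEFAULT_PROMPT` and `build_prompt` are hypothetical helpers, and the default wording is copied from the hardcoded prompt described above.

```python
from string import Template

DEFAULT_PROMPT = Template(
    "Summarise the key takeaways from the following text:\n\n$text"
)


def build_prompt(text: str, template: Template = DEFAULT_PROMPT) -> str:
    """Fill a prompt template with the source text."""
    return template.substitute(text=text)


# A per-source or per-project override would be passed in the same way.
custom = Template("List three open questions raised by this text:\n\n$text")

default_prompt = build_prompt("SurrealDB stores documents.")
custom_prompt = build_prompt("SurrealDB stores documents.", custom)
```

`string.Template` is used instead of an f-string so that templates can be stored as data, for example in a database field, and filled in later.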

### Where are the generated insights stored and how long do they persist?

Insights persist as documents in **SurrealDB** with a schema linking them to their parent source via `source_id`. They remain in the database indefinitely unless explicitly deleted, allowing users to retrieve historical summaries instantly without re-invoking costly LLM calls. The `get_insights()` method queries these stored documents rather than regenerating them.