
How OpenNotebook Handles Vector Embeddings: Storage and Semantic Search in `embedding_service.py`

June 6, 2026 · lfnovo/open-notebook

OpenNotebook routes embedding requests through api/embedding_service.py to a FastAPI router that validates the request, dispatches either an async background job or a synchronous domain-model call, and ultimately persists dense vectors in SurrealDB for later semantic retrieval via vector-distance queries.

The lfnovo/open-notebook repository implements a complete pipeline for vector-embedding storage and semantic search, turning raw sources and user-written notes into queryable dense vectors. At the heart of this pipeline sits api/embedding_service.py, a thin façade that forwards requests through the API client to the embedding router and the underlying domain models. Following a request through these layers shows how the application converts text into retrievable vectors and executes meaning-based lookups.

The Embedding Service Façade in `api/embedding_service.py`

The public entry point for all embedding operations is the EmbeddingService singleton defined in api/embedding_service.py. Its embed_content(self, item_id, item_type) method acts as a thin wrapper around the low-level API client.

According to the source code, this method delegates to api_client.embed_content() (lines 18‑23). The underlying client builds a JSON payload containing item_id, item_type, and an optional async_processing flag, then POSTs it to the /api/embed endpoint.

from api.embedding_service import embedding_service

# Embed the source with ID "src_123"
response = embedding_service.embed_content(item_id="src_123", item_type="source")
print(response)   # → {'success': True, 'item_id': 'src_123', …, 'command_id': 'cmd_9a…'}
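
As a rough illustration of the request described above, the following sketch builds the JSON payload and the POST request without sending it. The base URL and the helper name build_embed_request are assumptions made for this example; only the three payload fields and the /api/embed path come from the description above.

```python
import json
from urllib import request

# Assumption: the API is reachable at this address; adjust for your deployment.
API_BASE_URL = "http://localhost:5055"


def build_embed_request(item_id, item_type, async_processing=False):
    """Build (but do not send) a POST request for the /api/embed endpoint."""
    payload = {
        "item_id": item_id,
        "item_type": item_type,
        "async_processing": async_processing,
    }
    return request.Request(
        url=f"{API_BASE_URL}/api/embed",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_embed_request("src_123", "source")
print(req.get_method(), req.full_url)
print(json.loads(req.data))
```

Passing the prepared object to urllib.request.urlopen would send it; the sketch stops short of that so it can run without a server.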

Router Validation and Async or Sync Dispatch

The FastAPI router in api/routers/embedding.py consumes the EmbedRequest and enforces two critical prerequisites before any vector computation occurs. It verifies that an embedding model is configured and that the requested item_type is either "source" or "note".

Model availability: The router calls model_manager.get_embedding_model() and rejects the request if no embedding model is configured (lines 16‑22).
Type validation: The item_type field must be "source" or "note"; any other value is rejected (lines 26‑30).
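
The two checks above can be sketched in plain Python, independent of FastAPI. This is a minimal illustration, not the router's actual code: the exception class, the function name, and the error messages are all hypothetical.

```python
ALLOWED_ITEM_TYPES = {"source", "note"}


class EmbedRequestError(Exception):
    """Hypothetical error raised when an embed request fails a prerequisite."""


def validate_embed_request(item_type, embedding_model):
    # Prerequisite 1: an embedding model must be configured.
    if embedding_model is None:
        raise EmbedRequestError("No embedding model configured")
    # Prerequisite 2: only sources and notes can be embedded.
    if item_type not in ALLOWED_ITEM_TYPES:
        raise EmbedRequestError(f"Unsupported item_type: {item_type!r}")


# A valid request passes silently; an invalid one raises.
validate_embed_request("source", embedding_model=object())
try:
    validate_embed_request("notebook", embedding_model=object())
except EmbedRequestError as exc:
    print(exc)
```

In a FastAPI router, the same conditions would usually be reported as HTTP error responses rather than plain exceptions.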

Once validated, the router selects between async and synchronous processing paths based on the async_processing flag. Both paths ultimately return an EmbedResponse that includes item_id, item_type, a success flag, and the background command_id.

Async Background Processing

When async_processing=True (lines 33‑55), the router submits a background command (either embed_source or embed_note) via CommandService.submit_command_job. It returns immediately with an EmbedResponse containing a command_id that the caller can poll for status.
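
The submit-then-poll pattern can be illustrated with a toy in-memory queue. ToyCommandQueue is a hypothetical stand-in written for this example, not the project's CommandService, and the status names are assumptions.

```python
import uuid


class ToyCommandQueue:
    """In-memory stand-in for a background job queue."""

    def __init__(self):
        self.jobs = {}

    def submit(self, command_name, args):
        # Register the job and hand back an ID without doing the work yet.
        command_id = f"cmd_{uuid.uuid4().hex[:8]}"
        self.jobs[command_id] = {
            "command": command_name,
            "args": args,
            "status": "queued",
        }
        return command_id

    def run_pending(self):
        # A real worker would do this in the background.
        for job in self.jobs.values():
            if job["status"] == "queued":
                job["status"] = "completed"


queue = ToyCommandQueue()
command_id = queue.submit("embed_note", {"note_id": "note_456"})
print(queue.jobs[command_id]["status"])  # still queued when submit() returns
queue.run_pending()
print(queue.jobs[command_id]["status"])
```

The point of the pattern is that submit() returns before any embedding work has happened, which is why the caller needs the command ID.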

Synchronous Domain-Model Processing

When processing synchronously (lines 71‑99), the router loads the domain object via Source.get() or Note.get() and invokes the model’s embedding helper, either Source.vectorize() or Note.save(). "Synchronous" here describes only how the router calls the domain model: these helpers internally enqueue the same background command and surface the command_id right away, so the embedding work itself still completes in the background. The router then returns an EmbedResponse that bundles item_id, item_type, a success flag, and the command_id (lines 57‑64 and 97‑103).
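
A compact way to see that both branches produce the same response shape is the sketch below. The EmbedResponse dataclass mirrors the four fields named above; dispatch_embed and its two callables are hypothetical helpers introduced for illustration and do not reproduce the router's code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EmbedResponse:
    success: bool
    item_id: str
    item_type: str
    command_id: Optional[str] = None


COMMAND_FOR_TYPE = {"source": "embed_source", "note": "embed_note"}


def dispatch_embed(item_id, item_type, async_processing, submit_job, load_and_embed):
    """Pick a path, then wrap the resulting command ID in one response shape."""
    if async_processing:
        # Async path: hand the named command straight to the job queue.
        command_id = submit_job(COMMAND_FOR_TYPE[item_type], item_id)
    else:
        # Sync path: go through the domain object, which enqueues the job itself.
        command_id = load_and_embed(item_type, item_id)
    return EmbedResponse(True, item_id, item_type, command_id)


# Stub callables so the sketch runs without a database or job queue.
def fake_submit(name, item_id):
    return f"cmd_{name}_{item_id}"


def fake_domain_call(item_type, item_id):
    return f"cmd_{COMMAND_FOR_TYPE[item_type]}_{item_id}"


print(dispatch_embed("note_456", "note", True, fake_submit, fake_domain_call))
print(dispatch_embed("note_456", "note", False, fake_submit, fake_domain_call))
```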

from api.embedding_service import embedding_service

# Queue embedding; return immediately with a command ID.
# Assumes embed_content() forwards async_processing to the API client.
response = embedding_service.embed_content(
    item_id="note_456",
    item_type="note",
    async_processing=True,
)
print(response["command_id"])   # poll status later via /api/commands/{cmd_id}/status
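
Polling for completion might look like the sketch below. The status strings ("queued", "running", "completed", "failed") and the fetch_status callable are assumptions; in practice fetch_status would wrap a call to the command status endpoint mentioned in the comment above.

```python
import time


def wait_for_command(command_id, fetch_status, interval=0.5, timeout=30.0):
    """Poll fetch_status(command_id) until a terminal status or a timeout."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status(command_id)
        if status in {"completed", "failed"}:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Command {command_id} did not finish in {timeout}s")
        time.sleep(interval)


# Simulated status sequence so the sketch runs without a server.
statuses = iter(["queued", "running", "completed"])
final = wait_for_command("cmd_123", lambda _id: next(statuses), interval=0.0)
print(final)
```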

Vector Storage and Persistence in SurrealDB

The actual vector computation lives in the domain layer. In open_notebook/domain/notebook.py, Source.vectorize() pulls the raw text from the source record and passes it to the configured Esperanto embedding model through model_manager.get_embedding_model().embed(text).

The resulting dense vector is written directly into the SurrealDB record under a field such as embedding. Because the vector resides as a native database field, SurrealDB can execute native vector-similarity operations without moving data to an external store.

The Note.save() method behaves analogously. After persisting the note, it triggers embed_note, which computes the note’s embedding and stores it in the same SurrealDB-backed format.
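
To make the storage idea concrete, the sketch below keeps the vector in the same record as the content. Everything here is a stand-in: toy_embed is a deterministic hashing trick rather than a real embedding model, and the records dictionary plays the role of a database table.

```python
import hashlib


def toy_embed(text, dims=8):
    """Deterministic stand-in for an embedding model: hash tokens into buckets."""
    vector = [0.0] * dims
    for token in text.lower().split():
        digest = hashlib.md5(token.encode("utf-8")).hexdigest()
        vector[int(digest, 16) % dims] += 1.0
    return vector


records = {}  # in-memory stand-in for a database table


def save_with_embedding(record_id, text):
    # The vector lives in the same record as the content it describes.
    records[record_id] = {
        "id": record_id,
        "content": text,
        "embedding": toy_embed(text),
    }
    return records[record_id]


record = save_with_embedding("note:456", "vector search in notebooks")
print(len(record["embedding"]), sum(record["embedding"]))
```

Keeping the vector next to the content is what allows a single query to filter, rank, and return records without a join against a separate vector store.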

Semantic Search Retrieval Against Stored Embeddings

Retrieval is handled by api/search_service.py. The search service receives a plain-text query, transforms it into a dense vector using the same embedding model, and issues a SurrealDB vector-distance query against the stored embeddings.

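The underlying query relies on SurrealDB’s native vector functions, written here schematically as vector_distance(embedding, $query_vector); in SurrealQL these functions live under the vector:: namespace (for example vector::similarity::cosine and vector::distance::euclidean). The query typically bounds the result by a threshold so that it returns the most semantically similar sources or notes rather than relying on exact keyword matching.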

from api.search_service import SearchService

results = SearchService.semantic_search(query="deep learning advances", top_k=5)
for hit in results:
    print(hit["title"], hit["score"])
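
The ranking logic behind such a query can be sketched in pure Python. This example uses cosine similarity with a minimum-score cutoff as an assumed metric, and the record titles and three-dimensional vectors are made up for illustration; the database would perform the equivalent comparison internally.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_search(query_vector, records, top_k=5, min_score=0.2):
    """Score every record, drop weak matches, return the best top_k."""
    hits = []
    for rec in records:
        score = cosine_similarity(query_vector, rec["embedding"])
        if score >= min_score:
            hits.append({"title": rec["title"], "score": score})
    hits.sort(key=lambda hit: hit["score"], reverse=True)
    return hits[:top_k]


records = [
    {"title": "Deep learning survey", "embedding": [0.9, 0.1, 0.0]},
    {"title": "Gardening tips", "embedding": [0.0, 0.1, 0.9]},
    {"title": "Neural network notes", "embedding": [0.8, 0.3, 0.1]},
]

results = semantic_search([1.0, 0.2, 0.0], records, top_k=2)
for hit in results:
    print(hit["title"], round(hit["score"], 3))
```

A linear scan like this is fine for a handful of records; at scale, a vector index in the database replaces the loop.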

Summary

api/embedding_service.py provides a singleton façade that forwards embedding requests to the back-end via api/client.py.
api/routers/embedding.py validates requests, ensures an embedding model is configured, and branches between async background jobs and synchronous domain-model execution.
Source.vectorize() and Note.save() in open_notebook/domain/notebook.py compute dense vectors through the Esperanto model and persist them as native SurrealDB fields.
api/search_service.py converts query text into a vector and performs semantic retrieval using SurrealDB vector-distance queries.

Frequently Asked Questions

What is the role of `embedding_service.py` in the vector pipeline?

It acts as a thin singleton wrapper. The EmbeddingService.embed_content() method delegates to api_client.embed_content(), which POSTs the request to the /api/embed endpoint so the router and domain layers can handle validation and vectorization.

Does OpenNotebook process embeddings synchronously or asynchronously?

Both. The router in api/routers/embedding.py supports an async_processing flag. When True, it queues a background command via CommandService.submit_command_job. When False, it loads the domain model and triggers vectorize() or save(), which still return a command ID immediately while the work completes.

Where are the vector embeddings physically stored?

They are stored as native fields inside SurrealDB records. The domain models write the computed dense vectors directly into fields such as embedding on the source or note records, enabling the database to execute native vector-distance queries.

How does semantic search differ from keyword search in OpenNotebook?

Semantic search uses api/search_service.py to embed the query text into the same vector space as the stored content, then queries SurrealDB with a vector comparison, written schematically here as vector_distance(embedding, $query_vector). This retrieves results based on conceptual meaning rather than exact keyword matches.
