# How OpenNotebook Handles Vector Embeddings: Storage and Semantic Search in `embedding_service.py`

> Learn how OpenNotebook's embedding_service.py handles vector embeddings, storing dense vectors in SurrealDB and enabling semantic search with vector-distance queries.

- Repository: [Luis Novo/open-notebook](https://github.com/lfnovo/open-notebook)
- Tags: internals
- Published: 2026-06-06

---

**OpenNotebook routes embedding requests through [`api/embedding_service.py`](https://github.com/lfnovo/open-notebook/blob/main/api/embedding_service.py) to a FastAPI router that validates the request, dispatches either an async background job or a synchronous domain-model call, and ultimately persists dense vectors in SurrealDB for later semantic retrieval via vector-distance queries.**

The `lfnovo/open-notebook` repository implements a complete pipeline for vector embedding storage and semantic search, turning raw sources and user-written notes into queryable dense vectors. At the heart of this pipeline sits [`api/embedding_service.py`](https://github.com/lfnovo/open-notebook/blob/main/api/embedding_service.py), a thin façade that forwards requests through the API client to the embedding router and the underlying domain models. The sections below follow a request through each layer, from raw text to stored vector to meaning-based lookup.

## The Embedding Service Façade in [`api/embedding_service.py`](https://github.com/lfnovo/open-notebook/blob/main/api/embedding_service.py)

The public entry point for all embedding operations is the `EmbeddingService` singleton defined in [`api/embedding_service.py`](https://github.com/lfnovo/open-notebook/blob/main/api/embedding_service.py). Its `embed_content(self, item_id, item_type)` method acts as a thin wrapper around the low-level API client.

According to the source code, this method delegates to `api_client.embed_content()` (lines 18‑23). The underlying client builds a JSON payload containing `item_id`, `item_type`, and an optional `async_processing` flag, then POSTs it to the `/api/embed` endpoint.

```python
from api.embedding_service import embedding_service

# Embed the source with ID "src_123"
response = embedding_service.embed_content(item_id="src_123", item_type="source")
print(response)   # → {'success': True, 'item_id': 'src_123', …, 'command_id': 'cmd_9a…'}
```
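To make the delegation concrete, here is a minimal, self-contained sketch of the façade pattern described above. It is not the repository's code: `FakeApiClient` and `EmbeddingServiceSketch` are hypothetical stand-ins, and the fake client records the payload instead of sending an HTTP request.

```python
from typing import Any, Dict, List, Tuple


class FakeApiClient:
    """Hypothetical stand-in for the API client; records requests instead of sending them."""

    def __init__(self) -> None:
        self.sent: List[Tuple[str, Dict[str, Any]]] = []

    def embed_content(
        self, item_id: str, item_type: str, async_processing: bool = False
    ) -> Dict[str, Any]:
        # Build the JSON body described in the article and "POST" it to /api/embed.
        payload = {
            "item_id": item_id,
            "item_type": item_type,
            "async_processing": async_processing,
        }
        self.sent.append(("/api/embed", payload))
        return {"success": True, "item_id": item_id, "item_type": item_type}


class EmbeddingServiceSketch:
    """Thin façade: holds no logic of its own and only delegates to the client."""

    def __init__(self, client: FakeApiClient) -> None:
        self._client = client

    def embed_content(self, item_id: str, item_type: str) -> Dict[str, Any]:
        return self._client.embed_content(item_id=item_id, item_type=item_type)


client = FakeApiClient()
service = EmbeddingServiceSketch(client)
result = service.embed_content(item_id="src_123", item_type="source")
```

Keeping the façade this thin means callers depend on one stable method while the transport details stay inside the client.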

## Router Validation and Async or Sync Dispatch

The FastAPI router in [`api/routers/embedding.py`](https://github.com/lfnovo/open-notebook/blob/main/api/routers/embedding.py) consumes the `EmbedRequest` and enforces two critical prerequisites before any vector computation occurs. It verifies that an embedding model is configured and that the requested `item_type` is either `"source"` or `"note"`.

- **Model availability**: The router calls `model_manager.get_embedding_model()` and rejects the request if no embedding model is configured (lines 16‑22).
- **Type validation**: The `item_type` field must be `"source"` or `"note"`; any other value is rejected (lines 26‑30).
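
The two checks above can be sketched in plain Python. This is an illustration under stated assumptions, not the router's implementation: `validate_embed_request`, `EmbedValidationError`, and `is_valid` are hypothetical names, and the real router presumably reports failures as HTTP errors rather than a custom exception.

```python
from typing import Optional

ALLOWED_ITEM_TYPES = {"source", "note"}


class EmbedValidationError(ValueError):
    """Hypothetical error type used only in this sketch."""


def validate_embed_request(item_type: str, embedding_model: Optional[object]) -> None:
    # Check 1: an embedding model must be configured.
    if embedding_model is None:
        raise EmbedValidationError("No embedding model configured")
    # Check 2: only sources and notes can be embedded.
    if item_type not in ALLOWED_ITEM_TYPES:
        raise EmbedValidationError(f"Unsupported item_type: {item_type!r}")


def is_valid(item_type: str, embedding_model: Optional[object]) -> bool:
    """Convenience wrapper that turns the exception into a boolean."""
    try:
        validate_embed_request(item_type, embedding_model)
        return True
    except EmbedValidationError:
        return False


model = object()  # placeholder for a configured embedding model
checks = [is_valid("source", model), is_valid("notebook", model), is_valid("note", None)]
```

Running both checks before any work is queued keeps invalid requests from ever reaching the background worker.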

Once validated, the router selects between async and synchronous processing paths based on the `async_processing` flag. Both paths ultimately return an `EmbedResponse` that includes `item_id`, `item_type`, a success flag, and the background `command_id`.

### Async Background Processing

When `async_processing=True` (lines 33‑55), the router submits a background command—either `embed_source` or `embed_note`—via `CommandService.submit_command_job`. It immediately returns an `EmbedResponse` containing a `command_id` the caller can poll for status.
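
The submit-then-poll flow can be illustrated with an in-memory queue. Everything here is a hypothetical stand-in: `InMemoryCommandQueue` is not `CommandService`, the status strings are invented for the sketch, and only the command names `embed_source` and `embed_note` come from the description above.

```python
import uuid
from typing import Any, Dict, List, Tuple


class InMemoryCommandQueue:
    """Hypothetical stand-in for a background command service."""

    def __init__(self) -> None:
        self.jobs: List[Tuple[str, str, Dict[str, Any]]] = []
        self._status: Dict[str, str] = {}

    def submit(self, command_name: str, args: Dict[str, Any]) -> str:
        command_id = f"cmd_{uuid.uuid4().hex[:8]}"
        self.jobs.append((command_id, command_name, args))
        self._status[command_id] = "queued"
        return command_id

    def complete(self, command_id: str) -> None:
        # In a real system a worker would flip this after finishing the job.
        self._status[command_id] = "completed"

    def status(self, command_id: str) -> str:
        return self._status[command_id]


def embed_async(queue: InMemoryCommandQueue, item_id: str, item_type: str) -> Dict[str, Any]:
    # Pick the command by item type, enqueue it, and return without waiting.
    command_name = "embed_source" if item_type == "source" else "embed_note"
    command_id = queue.submit(command_name, {"item_id": item_id})
    return {
        "success": True,
        "item_id": item_id,
        "item_type": item_type,
        "command_id": command_id,
    }


queue = InMemoryCommandQueue()
response = embed_async(queue, "note_456", "note")
status_before = queue.status(response["command_id"])
queue.complete(response["command_id"])  # simulate the worker finishing
status_after = queue.status(response["command_id"])
```

The caller gets a `command_id` back immediately and checks on it later, so a slow embedding model never blocks the HTTP request.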

### Synchronous Domain-Model Processing

When `async_processing` is false (lines 71‑99), the router loads the domain object via `Source.get()` or `Note.get()` and invokes the model’s embedding helper, `Source.vectorize()` or `Note.save()`. The path is synchronous only in the sense that the router calls the domain model directly: these helpers internally enqueue the same background command and surface its `command_id` right away. The router then returns an `EmbedResponse` that bundles `item_id`, `item_type`, a success flag, and the `command_id` (lines 57‑64 and 97‑103).

```python
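# Assumption: the client singleton is importable from api/client.py as `api_client`.
from api.client import api_client

# The façade's embed_content(item_id, item_type) does not accept async_processing,
# so pass the flag to the underlying client, which includes it in the payload.
response = api_client.embed_content(
    item_id="note_456",
    item_type="note",
    async_processing=True,
)
print(response["command_id"])   # poll status later via /api/commands/{cmd_id}/status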
```

## Vector Storage and Persistence in SurrealDB

The actual vector computation lives in the domain layer. In [`open_notebook/domain/notebook.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/domain/notebook.py), `Source.vectorize()` pulls the raw text from the source record and passes it to the configured Esperanto embedding model through `model_manager.get_embedding_model().embed(text)`.

The resulting dense vector is written directly into the SurrealDB record under a field such as `embedding`. Because the vector resides as a native database field, SurrealDB can execute native vector-similarity operations without moving data to an external store.

The `Note.save()` method behaves analogously. After persisting the note, it triggers `embed_note`, which computes the note’s embedding and stores it in the same SurrealDB-backed format.
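
The compute-then-persist step can be sketched without a database or a real model. In this illustration a plain dictionary stands in for SurrealDB, `toy_embed` is a deterministic hash-based placeholder (not an Esperanto model, and it carries no semantic meaning), and the `full_text` field name is an assumption; only the `embedding` field name follows the text above.

```python
import hashlib
from typing import Dict, List


def toy_embed(text: str, dim: int = 8) -> List[float]:
    """Deterministic placeholder embedding: maps hash bytes to floats in [0, 1]."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()  # 32 bytes, so dim <= 32
    return [digest[i] / 255.0 for i in range(dim)]


# Dictionary standing in for a table of database records.
records: Dict[str, Dict[str, object]] = {
    "source:src_123": {"full_text": "Transformers use attention.", "embedding": None},
}


def vectorize(record_id: str, store: Dict[str, Dict[str, object]]) -> List[float]:
    # Read the raw text, embed it, and write the vector back onto the same record.
    record = store[record_id]
    vector = toy_embed(str(record["full_text"]))
    record["embedding"] = vector
    return vector


vector = vectorize("source:src_123", records)
```

Storing the vector on the record itself is what lets the same database that holds the text also answer similarity queries.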

## Semantic Search Retrieval Against Stored Embeddings

Retrieval is handled by [`api/search_service.py`](https://github.com/lfnovo/open-notebook/blob/main/api/search_service.py). The search service receives a plain-text query, transforms it into a dense vector using the same embedding model, and issues a SurrealDB vector-distance query against the stored embeddings.

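The underlying query ranks records with a vector-distance expression, written here as `vector_distance(embedding, $query_vector)`. SurrealDB's built-in vector functions live under the `vector::` namespace (for example `vector::similarity::cosine` and `vector::distance::euclidean`), so the literal function name in the repository may differ. The query typically bounds the distance by a threshold so that it returns the most semantically similar sources or notes rather than relying on exact keyword matching.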

```python
from api.search_service import SearchService

results = SearchService.semantic_search(query="deep learning advances", top_k=5)
for hit in results:
    print(hit["title"], hit["score"])
```
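
For intuition about what a thresholded similarity query computes, the ranking can be reproduced in pure Python. This sketch uses cosine similarity over hand-written three-dimensional vectors; the function names, the `min_score` parameter, and the sample records are all hypothetical, and in OpenNotebook the equivalent work happens inside the database.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def semantic_search(
    query_vector: Sequence[float],
    records: List[Dict[str, object]],
    top_k: int = 5,
    min_score: float = 0.2,
) -> List[Dict[str, object]]:
    # Score every record, drop those below the threshold, keep the best top_k.
    scored = [
        {"title": r["title"], "score": cosine_similarity(query_vector, r["embedding"])}
        for r in records
    ]
    hits = [h for h in scored if h["score"] >= min_score]
    hits.sort(key=lambda h: h["score"], reverse=True)
    return hits[:top_k]


records = [
    {"title": "Deep learning survey", "embedding": [0.9, 0.1, 0.0]},
    {"title": "Gardening tips", "embedding": [0.0, 0.1, 0.9]},
    {"title": "Neural network notes", "embedding": [0.8, 0.3, 0.1]},
]
results = semantic_search([1.0, 0.2, 0.0], records, top_k=2)
titles = [hit["title"] for hit in results]
```

Here the gardening record points in a nearly orthogonal direction to the query, so it falls below the threshold, while the two machine-learning records are returned in order of similarity.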

## Summary

- [`api/embedding_service.py`](https://github.com/lfnovo/open-notebook/blob/main/api/embedding_service.py) provides a singleton façade that forwards embedding requests to the back-end via [`api/client.py`](https://github.com/lfnovo/open-notebook/blob/main/api/client.py).
- [`api/routers/embedding.py`](https://github.com/lfnovo/open-notebook/blob/main/api/routers/embedding.py) validates requests, ensures an embedding model is configured, and branches between async background jobs and synchronous domain-model execution.
- `Source.vectorize()` and `Note.save()` in [`open_notebook/domain/notebook.py`](https://github.com/lfnovo/open-notebook/blob/main/open_notebook/domain/notebook.py) compute dense vectors through the Esperanto model and persist them as native SurrealDB fields.
- [`api/search_service.py`](https://github.com/lfnovo/open-notebook/blob/main/api/search_service.py) converts query text into a vector and performs semantic retrieval using SurrealDB vector-distance queries.

## Frequently Asked Questions

### What is the role of [`embedding_service.py`](https://github.com/lfnovo/open-notebook/blob/main/api/embedding_service.py) in the vector pipeline?

It acts as a thin singleton wrapper. The `EmbeddingService.embed_content()` method delegates to `api_client.embed_content()`, which POSTs the request to the `/api/embed` endpoint so the router and domain layers can handle validation and vectorization.

### Does OpenNotebook process embeddings synchronously or asynchronously?

Both. The router in [`api/routers/embedding.py`](https://github.com/lfnovo/open-notebook/blob/main/api/routers/embedding.py) supports an `async_processing` flag. When `True`, it queues a background command via `CommandService.submit_command_job`. When `False`, it loads the domain model and triggers `vectorize()` or `save()`, which still return a command ID immediately while the work completes.

### Where are the vector embeddings physically stored?

They are stored as native fields inside SurrealDB records. The domain models write the computed dense vectors directly into fields such as `embedding` on the `source` or `note` records, enabling the database to execute native vector-distance queries.

### How does semantic search differ from keyword search in OpenNotebook?

Semantic search uses [`api/search_service.py`](https://github.com/lfnovo/open-notebook/blob/main/api/search_service.py) to embed the query text into the same vector space as the stored content, then queries SurrealDB with a vector-distance expression such as `vector_distance(embedding, $query_vector)`. This retrieves results based on conceptual meaning rather than exact keyword matches.