
How to Perform Vector Search for Semantically Similar Content Across Notebooks

June 7, 2026 · lfnovo/open-notebook

Send a POST request to the /search endpoint with "type": "vector" to query embedded notebook content. The API calls generate_embedding() to vectorize your query, then runs the fn::vector_search function in SurrealDB and returns results ranked by cosine similarity.

The open-source Open Notebook project (lfnovo/open-notebook) provides semantic search that goes beyond keyword matching. Because dense vector embeddings are stored alongside your content, vector search can surface conceptually related information across sources and notes even when the query and the content use different terminology.

Prerequisites for Vector Search

Before executing semantic queries, ensure your environment meets two requirements. First, configure an embedding model via the Models configuration section or environment variables (OPEN_NOTEBOOK_EMBEDDING_MODEL). Second, verify that the target content has been vectorized. Embeddings are generated automatically when sources are created with embed=True, or manually by calling source.vectorize(), which is implemented in commands/embedding_commands.py.

How Vector Search Works in Open Notebook

The vector search pipeline operates through three coordinated layers: query vectorization, SurrealDB storage, and similarity execution.

Query Embedding Generation

When you submit a vector search request, the system invokes generate_embedding() from open_notebook/utils/embedding.py (lines 9-74). This function tokenizes your query text, automatically chunks content exceeding CHUNK_SIZE, obtains embeddings from the configured provider via ModelManager, and mean-pools chunk embeddings into a single normalized vector suitable for comparison.
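The pooling step described above can be illustrated with a small standalone sketch. It is not the project's mean_pool_embeddings() implementation; the function name mean_pool and the choice to L2-normalize after averaging are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def mean_pool(chunk_embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Average chunk vectors element-wise, then scale the result to unit length.

    Hypothetical helper for illustration; not the project's own function.
    """
    if not chunk_embeddings:
        raise ValueError("need at least one chunk embedding")
    dim = len(chunk_embeddings[0])
    if any(len(vec) != dim for vec in chunk_embeddings):
        raise ValueError("all chunk embeddings must share one dimension")

    count = len(chunk_embeddings)
    # Element-wise mean across all chunk vectors.
    pooled = [sum(vec[i] for vec in chunk_embeddings) / count for i in range(dim)]

    # L2-normalize so vectors of different magnitudes compare fairly.
    norm = math.sqrt(sum(x * x for x in pooled))
    return [x / norm for x in pooled] if norm > 0 else pooled


# Two toy 2-dimensional "chunk embeddings".
pooled = mean_pool([[1.0, 0.0], [0.0, 1.0]])
pooled_norm = math.sqrt(sum(x * x for x in pooled))
```

Averaging keeps the pooled vector in the same space as the individual chunk vectors, so a long query can still be compared against stored embeddings of the same dimension.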

SurrealDB Vector Storage

Every source and note record in SurrealDB maintains an embedding field populated during the vectorization process. The embed_source command in commands/embedding_commands.py orchestrates this storage, ensuring that high-dimensional vectors persist alongside metadata for efficient retrieval.

The Search Execution Pipeline

The core search logic resides in vector_search() within open_notebook/domain/notebook.py (lines 50-78). This function builds a SurrealQL call to fn::vector_search, passing the query vector, result limit, content type filters (source, note), and an optional minimum_score threshold. The fn:: prefix marks a user-defined SurrealDB function rather than a built-in one, so it must be defined in the database schema before the query can run. The repo_query utility executes the call asynchronously against SurrealDB and returns records ranked by cosine similarity.
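To make the ranking concrete, here is a minimal in-memory sketch of cosine-similarity ranking with a result limit and a minimum score. It only illustrates the kind of computation the database query performs; the names cosine_similarity and rank_by_similarity and the toy record ids are hypothetical, and the real work happens inside SurrealDB.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: Sequence[float],
    records: Dict[str, Sequence[float]],
    limit: int,
    minimum_score: float,
) -> List[Tuple[str, float]]:
    """Score every record, drop low scores, and return the best `limit` matches."""
    scored = [(rid, cosine_similarity(query, emb)) for rid, emb in records.items()]
    kept = [pair for pair in scored if pair[1] >= minimum_score]
    kept.sort(key=lambda pair: pair[1], reverse=True)  # highest similarity first
    return kept[:limit]


# Toy 2-dimensional embeddings keyed by made-up record ids.
records = {
    "source:a": [1.0, 0.0],
    "note:b": [0.6, 0.8],
    "source:c": [0.0, 1.0],
}
ranked = rank_by_similarity([1.0, 0.0], records, limit=2, minimum_score=0.25)
```

In this toy data the record orthogonal to the query falls below the threshold and is dropped, which is the effect minimum_score has on real results.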

Performing Vector Search via the REST API

The FastAPI router in api/routers/search.py exposes vector search through the /search endpoint. Submit a POST request with the search type set to "vector":

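import httpx

BASE_URL = "http://localhost:5055"  # adjust to wherever your API is served

payload = {
    "type": "vector",
    "keyword": "neural network architecture patterns",
    "results": 10,          # maximum number of matches to return
    "source": True,         # include sources
    "note": True,           # include notes
    "minimum_score": 0.25,  # drop matches below this similarity
}

# Embedding the query can take a moment, so allow more than httpx's default timeout.
response = httpx.post(f"{BASE_URL}/search", json=payload, timeout=30.0)
response.raise_for_status()  # fail loudly on 4xx/5xx instead of parsing an error body
data = response.json()

# The body may be a bare list or an object wrapping the list under "results";
# handle both, since the exact response shape is not shown here.
matches = data.get("results", []) if isinstance(data, dict) else data

for match in matches:
    print(f"{match['id']}: {match.get('title', 'Untitled')} (score: {match.get('score')})")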

The endpoint validates the request against the SearchRequest model, routes vector queries through the domain layer, and returns serialized matches containing content metadata and similarity scores.
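Catching obviously bad input before the request leaves your client can save a round trip. The sketch below builds a payload with the same fields as the request shown earlier and applies a few basic checks. The helper build_vector_search_payload is hypothetical, and its checks are assumptions; they are not the validation rules of the server's SearchRequest model.

```python
from typing import Any, Dict


def build_vector_search_payload(
    keyword: str,
    results: int = 10,
    source: bool = True,
    note: bool = True,
    minimum_score: float = 0.25,
) -> Dict[str, Any]:
    """Assemble a vector search payload after simple client-side checks.

    Hypothetical helper: the checks below are illustrative assumptions,
    not a copy of the server's validation.
    """
    if not keyword.strip():
        raise ValueError("keyword must not be empty")
    if results < 1:
        raise ValueError("results must be at least 1")
    if not (source or note):
        raise ValueError("enable at least one of source or note")

    return {
        "type": "vector",
        "keyword": keyword,
        "results": results,
        "source": source,
        "note": note,
        "minimum_score": minimum_score,
    }


payload = build_vector_search_payload("neural network architecture patterns")
```

The returned dictionary can be passed as the json argument of the POST request.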

Running Vector Search Directly in Python

For background jobs or internal tooling, bypass the HTTP layer and invoke the domain function directly:

import asyncio
from open_notebook.domain.notebook import vector_search

async def semantic_query():
    matches = await vector_search(
        keyword="distributed systems consensus algorithms",
        results=5,
        source=True,
        note=False,
        minimum_score=0.3
    )
    return matches

results = asyncio.run(semantic_query())

This approach runs the same generate_embedding() and SurrealDB query pipeline without HTTP overhead. It still requires the database connection and embedding model configuration that the API server uses.

Configuring Embedding Generation and Chunking

The embedding utility also handles inputs too large to embed in one call. When the text exceeds CHUNK_SIZE, generate_embedding() splits it into chunks, embeds each chunk separately, and then applies mean_pool_embeddings() to aggregate the vectors. Configure chunk size and model selection through open_notebook/ai/models.py and open_notebook/ai/provision.py, which interface with the Esperanto library to support embedding providers such as OpenAI.
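The splitting step can be sketched as follows. This is a simplified illustration, not the project's chunking code: it counts whitespace-separated words rather than model tokens, and both the split_into_chunks name and the tiny CHUNK_SIZE value are assumptions chosen to keep the example readable.

```python
from typing import List

# Illustrative value only; the project's real CHUNK_SIZE is not shown here
# and is measured in tokens, not words.
CHUNK_SIZE = 4


def split_into_chunks(text: str, chunk_size: int = CHUNK_SIZE) -> List[str]:
    """Split text into consecutive chunks of at most `chunk_size` words."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    words = text.split()
    return [
        " ".join(words[start:start + chunk_size])
        for start in range(0, len(words), chunk_size)
    ]


chunks = split_into_chunks("one two three four five six seven eight nine")
```

Each chunk would then be embedded separately, and the resulting vectors averaged into one query embedding.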

Summary

Vector search in Open Notebook relies on stored embeddings in SurrealDB to find semantically similar content across sources and notes.
The generate_embedding() function in open_notebook/utils/embedding.py handles query vectorization, including automatic chunking and mean-pooling for large texts.
vector_search() in open_notebook/domain/notebook.py calls the user-defined fn::vector_search function in SurrealDB with optional filters for content types and minimum similarity scores.
The /search endpoint in api/routers/search.py provides RESTful access, accepting JSON payloads to toggle between text and vector search modes.
Embeddings are generated during content ingestion via embed_source commands and persisted alongside records for efficient retrieval.

Frequently Asked Questions

What embedding models support vector search in Open Notebook?

Open Notebook supports any embedding model available through the Esperanto library abstraction layer, including OpenAI's text-embedding-3-small and text-embedding-3-large. Configure your preferred model in open_notebook/ai/models.py or through environment variables.

How does the system handle queries that exceed the model's token limit?

The generate_embedding() function automatically tokenizes input and compares against CHUNK_SIZE. When queries exceed this threshold, the function splits text into chunks, generates embeddings for each chunk, then uses mean_pool_embeddings() to average the vectors into a single representative embedding.

Can I restrict vector search results to specific content types?

Yes. The vector_search() function accepts boolean parameters source and note that map to the SurrealQL query filters. When calling the REST API, include "source": true or "note": false in your JSON payload to include or exclude specific record types from results.

Where are vector embeddings physically stored?

Embeddings persist as array fields within SurrealDB records for each source and note. The repo_query utility executes SurrealQL statements that call the user-defined fn::vector_search function, so cosine similarity is computed inside the database engine rather than in application memory.
