SurrealDB Schema Migration and Relationship Modeling in Open-Notebook

Open-Notebook implements a versioned schema migration system using SurrealDB's native graph relationships, managed through synchronous and asynchronous Python wrappers that automatically execute on API startup.

The lfnovo/open-notebook project stores all application data in SurrealDB, a graph-oriented database that supports first-class relationships. The codebase maintains schema evolution through a structured migration process while leveraging SurrealDB's RELATION type to model connections between sources, notes, and notebooks.

How SurrealDB Schema Migrations Work in Open-Notebook

The migration system splits responsibilities between a synchronous wrapper and an asynchronous engine, ensuring the FastAPI application can initialize migrations without handling async complexity directly.

Migration Architecture Overview

In open_notebook/database/migrate.py, the MigrationManager class provides a synchronous interface that wraps the async implementation. It instantiates AsyncMigrationManager and exposes three key methods: get_current_version(), needs_migration, and run_migration_up().

The underlying async logic lives in open_notebook/database/async_migrate.py. Here, the AsyncMigration class loads individual SurrealQL files, strips comments, and stores the executable SQL. The AsyncMigrationRunner executes these migrations sequentially, while AsyncMigrationManager coordinates the full migration list from the open_notebook/database/migrations/ directory.

Version Tracking in _sbl_migrations

SurrealDB maintains migration state in a dedicated table named _sbl_migrations. The system tracks versions as integers, with helper functions managing state transitions:

  • bump_version() inserts a new record representing the next version number after successful migration
  • lower_version() removes the latest entry when rolling back (1_down.surrealql)

AsyncMigrationManager queries this table through get_latest_version() and get_all_versions() to determine pending migrations before execution.

Automatic Migration on API Startup

The FastAPI entry point at api/main.py instantiates AsyncMigrationManager during application startup and calls run_migration_up(). If the database schema matches the latest migration version, the manager logs that no work is required, ensuring idempotent startup behavior.

from open_notebook.database.migrate import MigrationManager

mgr = MigrationManager()
if mgr.needs_migration:
    print(f"Current DB version: {mgr.get_current_version()}")
    mgr.run_migration_up()
    print(f"After migration DB version: {mgr.get_current_version()}")
else:
    print("Database schema is up-to-date.")

Modeling Relationships as Graph Edges

Open-Notebook leverages SurrealDB's graph capabilities through typed RELATION tables that function as native edge records between entity nodes.

Native RELATION Tables

The initial migration file open_notebook/database/migrations/1.surrealql defines two critical relationship tables using SurrealQL's DEFINE TABLE with TYPE RELATION:

DEFINE TABLE reference TYPE RELATION FROM source TO notebook;
DEFINE TABLE artifact TYPE RELATION FROM note TO notebook;

These declarations create first-class edge records:

  • reference: Links a source (PDF or web page) to one or more notebook records
  • artifact: Connects a note (generated summary) to its parent notebook

Because these are native SurrealDB relations, you can traverse them efficiently using the in and out fields, filter directionally, and query them as standard tables.

Cascading Deletes and Data Integrity

The schema implements database-level events to maintain graph integrity. When a source record is deleted, the source_delete event automatically removes associated source_embedding and source_insight records, preventing orphaned nodes in the knowledge graph.

Practical Implementation Examples

Creating relationships requires inserting records into the defined relation tables using the in and out fields to specify direction:

from open_notebook.database.repository import repo_query

async def link_source_to_notebook(source_id: str, notebook_id: str):
    await repo_query(
        "CREATE reference SET out = $source, in = $notebook;",
        {"source": source_id, "notebook": notebook_id}
    )

Querying incoming relationships filters by the in field to find all sources attached to a specific notebook:

async def list_notebook_sources(notebook_id: str):
    result = await repo_query(
        "SELECT out FROM reference WHERE in = $nb;",
        {"nb": notebook_id}
    )
    return [r["out"] for r in result]

Summary

  • Open-Notebook uses a two-layer migration system: MigrationManager provides synchronous convenience methods while AsyncMigrationManager in async_migrate.py handles the actual SurrealQL execution.
  • Migration state persists in the _sbl_migrations table with integer versioning, supporting both upgrades via bump_version() and rollbacks via lower_version().
  • The API automatically runs pending migrations on startup through api/main.py, ensuring the schema is current before handling requests.
  • Relationships are modeled as native SurrealDB RELATION tables (reference and artifact) with explicit FROM and TO constraints, enabling efficient graph traversal and maintaining referential integrity through database events.

Frequently Asked Questions

How does Open-Notebook handle database schema versioning?

Open-Notebook tracks schema versions in a dedicated SurrealDB table called _sbl_migrations. The AsyncMigrationManager queries this table to determine the current version and compares it against available migration files in open_notebook/database/migrations/. When run_migration_up() is called, it executes only the migrations that have not yet been recorded, then calls bump_version() to log the new version number.

What is the difference between migrate.py and async_migrate.py?

The migrate.py file contains MigrationManager, a synchronous wrapper that shields the FastAPI application from async complexity. The async_migrate.py file contains the actual migration engine, including AsyncMigration for loading SurrealQL files and AsyncMigrationRunner for executing them. This separation allows the API startup code to call simple synchronous methods while the underlying implementation handles async database operations.

How are many-to-many relationships implemented between notebooks and sources?

Relationships are implemented as edge tables with TYPE RELATION. The reference table uses DEFINE TABLE reference TYPE RELATION FROM source TO notebook to create a native graph edge. Each relationship record contains in (the notebook) and out (the source) fields, allowing bidirectional traversal. This approach avoids foreign key joins and enables efficient graph queries directly in SurrealQL.

What happens if a migration fails during API startup?

If a migration fails during execution in api/main.py, the error propagates from AsyncMigrationRunner and prevents the FastAPI application from completing startup. This fail-fast behavior ensures the database schema never reaches an inconsistent partial state. The migration system does not implement automatic rollback on failure; administrators must manually execute the corresponding _down.surrealql file to revert changes before restarting the application.

Have a question about this repo?

These articles cover the highlights, but your codebase questions are specific. Give your agent direct access to the source. Share this with your agent to get started:

Share the following with your agent to get started:
curl -s "https://instagit.com/install.md"

Works with
Claude Codex Cursor VS Code OpenClaw Any MCP Client

Maintain an open-source project? Get it listed too →