SurrealDB Schema Migration and Relationship Modeling in Open-Notebook
Open-Notebook implements a versioned schema migration system using SurrealDB's native graph relationships, managed through synchronous and asynchronous Python wrappers that automatically execute on API startup.
The lfnovo/open-notebook project stores all application data in SurrealDB, a graph-oriented database that supports first-class relationships. The codebase maintains schema evolution through a structured migration process while leveraging SurrealDB's RELATION type to model connections between sources, notes, and notebooks.
How SurrealDB Schema Migrations Work in Open-Notebook
The migration system splits responsibilities between a synchronous wrapper and an asynchronous engine, ensuring the FastAPI application can initialize migrations without handling async complexity directly.
Migration Architecture Overview
In open_notebook/database/migrate.py, the MigrationManager class provides a synchronous interface that wraps the async implementation. It instantiates AsyncMigrationManager and exposes three key methods: get_current_version(), needs_migration, and run_migration_up().
The underlying async logic lives in open_notebook/database/async_migrate.py. Here, the AsyncMigration class loads individual SurrealQL files, strips comments, and stores the executable SQL. The AsyncMigrationRunner executes these migrations sequentially, while AsyncMigrationManager coordinates the full migration list from the open_notebook/database/migrations/ directory.
Version Tracking in _sbl_migrations
SurrealDB maintains migration state in a dedicated table named _sbl_migrations. The system tracks versions as integers, with helper functions managing state transitions:
bump_version()inserts a new record representing the next version number after successful migrationlower_version()removes the latest entry when rolling back (1_down.surrealql)
AsyncMigrationManager queries this table through get_latest_version() and get_all_versions() to determine pending migrations before execution.
Automatic Migration on API Startup
The FastAPI entry point at api/main.py instantiates AsyncMigrationManager during application startup and calls run_migration_up(). If the database schema matches the latest migration version, the manager logs that no work is required, ensuring idempotent startup behavior.
from open_notebook.database.migrate import MigrationManager
mgr = MigrationManager()
if mgr.needs_migration:
print(f"Current DB version: {mgr.get_current_version()}")
mgr.run_migration_up()
print(f"After migration DB version: {mgr.get_current_version()}")
else:
print("Database schema is up-to-date.")
Modeling Relationships as Graph Edges
Open-Notebook leverages SurrealDB's graph capabilities through typed RELATION tables that function as native edge records between entity nodes.
Native RELATION Tables
The initial migration file open_notebook/database/migrations/1.surrealql defines two critical relationship tables using SurrealQL's DEFINE TABLE with TYPE RELATION:
DEFINE TABLE reference TYPE RELATION FROM source TO notebook;
DEFINE TABLE artifact TYPE RELATION FROM note TO notebook;
These declarations create first-class edge records:
- reference: Links a
source(PDF or web page) to one or morenotebookrecords - artifact: Connects a
note(generated summary) to its parentnotebook
Because these are native SurrealDB relations, you can traverse them efficiently using the in and out fields, filter directionally, and query them as standard tables.
Cascading Deletes and Data Integrity
The schema implements database-level events to maintain graph integrity. When a source record is deleted, the source_delete event automatically removes associated source_embedding and source_insight records, preventing orphaned nodes in the knowledge graph.
Practical Implementation Examples
Creating relationships requires inserting records into the defined relation tables using the in and out fields to specify direction:
from open_notebook.database.repository import repo_query
async def link_source_to_notebook(source_id: str, notebook_id: str):
await repo_query(
"CREATE reference SET out = $source, in = $notebook;",
{"source": source_id, "notebook": notebook_id}
)
Querying incoming relationships filters by the in field to find all sources attached to a specific notebook:
async def list_notebook_sources(notebook_id: str):
result = await repo_query(
"SELECT out FROM reference WHERE in = $nb;",
{"nb": notebook_id}
)
return [r["out"] for r in result]
Summary
- Open-Notebook uses a two-layer migration system:
MigrationManagerprovides synchronous convenience methods whileAsyncMigrationManagerinasync_migrate.pyhandles the actual SurrealQL execution. - Migration state persists in the
_sbl_migrationstable with integer versioning, supporting both upgrades viabump_version()and rollbacks vialower_version(). - The API automatically runs pending migrations on startup through
api/main.py, ensuring the schema is current before handling requests. - Relationships are modeled as native SurrealDB
RELATIONtables (referenceandartifact) with explicitFROMandTOconstraints, enabling efficient graph traversal and maintaining referential integrity through database events.
Frequently Asked Questions
How does Open-Notebook handle database schema versioning?
Open-Notebook tracks schema versions in a dedicated SurrealDB table called _sbl_migrations. The AsyncMigrationManager queries this table to determine the current version and compares it against available migration files in open_notebook/database/migrations/. When run_migration_up() is called, it executes only the migrations that have not yet been recorded, then calls bump_version() to log the new version number.
What is the difference between migrate.py and async_migrate.py?
The migrate.py file contains MigrationManager, a synchronous wrapper that shields the FastAPI application from async complexity. The async_migrate.py file contains the actual migration engine, including AsyncMigration for loading SurrealQL files and AsyncMigrationRunner for executing them. This separation allows the API startup code to call simple synchronous methods while the underlying implementation handles async database operations.
How are many-to-many relationships implemented between notebooks and sources?
Relationships are implemented as edge tables with TYPE RELATION. The reference table uses DEFINE TABLE reference TYPE RELATION FROM source TO notebook to create a native graph edge. Each relationship record contains in (the notebook) and out (the source) fields, allowing bidirectional traversal. This approach avoids foreign key joins and enables efficient graph queries directly in SurrealQL.
What happens if a migration fails during API startup?
If a migration fails during execution in api/main.py, the error propagates from AsyncMigrationRunner and prevents the FastAPI application from completing startup. This fail-fast behavior ensures the database schema never reaches an inconsistent partial state. The migration system does not implement automatic rollback on failure; administrators must manually execute the corresponding _down.surrealql file to revert changes before restarting the application.
Have a question about this repo?
These articles cover the highlights, but your codebase questions are specific. Give your agent direct access to the source. Share this with your agent to get started:
curl -s "https://instagit.com/install.md" Maintain an open-source project? Get it listed too →