# Architecture of Turso's Sync Engine for Turso Cloud: A Deep Dive into the Rust Implementation

> Explore the architecture of Turso's sync engine. Discover how this Rust implementation achieves conflict-free, full-duplex sync between SQLite and Turso Cloud with its layered approach.

- Repository: [Turso Database/turso](https://github.com/tursodatabase/turso)
- Tags: architecture
- Published: 2026-06-22

---

**Turso's sync engine is a modular Rust implementation that provides full-duplex, conflict-free synchronization between local SQLite databases and Turso Cloud through a layered architecture separating orchestration, operations, and I/O concerns.**

The sync engine lives in the `sync/` workspace of the [tursodatabase/turso](https://github.com/tursodatabase/turso) repository and serves as the replication backbone for Turso Cloud instances. Written in pure Rust, it coordinates the bidirectional flow of Write-Ahead Log (WAL) frames between embedded SQLite instances and remote Turso Cloud databases, ensuring offline-first capabilities with seamless cloud synchronization.

## Sync Engine Architecture Overview

The architecture of Turso's sync engine deliberately separates concerns into three orthogonal layers to maximize testability and language interoperability:

- **Engine Core**: Orchestrates the sync lifecycle including bootstrap, pull, push, and checkpoint operations via `DatabaseSyncEngine` in [`sync/engine/src/database_sync_engine.rs`](https://github.com/tursodatabase/turso/blob/main/sync/engine/src/database_sync_engine.rs).
- **Operations Layer**: Implements low-level primitives for HTTP API communication, WAL frame manipulation, and CDC log management in [`sync/engine/src/database_sync_operations.rs`](https://github.com/tursodatabase/turso/blob/main/sync/engine/src/database_sync_operations.rs).
- **I/O Abstraction**: Provides an injectable backend for HTTP and filesystem operations through `SyncEngineIo` in [`sync/sdk-kit/src/sync_engine_io.rs`](https://github.com/tursodatabase/turso/blob/main/sync/sdk-kit/src/sync_engine_io.rs), enabling testability and multi-language bindings.

This separation allows the same core logic to power the Rust CLI, JavaScript SDK, and Python bindings without modification to the synchronization algorithms.

## Core Components and Source Files

### DatabaseSyncEngine: The Orchestrator

Located in [`sync/engine/src/database_sync_engine.rs`](https://github.com/tursodatabase/turso/blob/main/sync/engine/src/database_sync_engine.rs), the `DatabaseSyncEngine` struct serves as the central control plane. It holds the I/O implementation (`Arc<dyn turso_core::IO>`), sync-specific I/O wrappers (`SyncEngineIoStats<IO>`), and file paths for the main database, WAL, revert database, and metadata files.

Key lifecycle methods include:

- **`create_db()`**: Entry point used by the CLI and SDK bindings that reads the local metadata file (`<db>-info`), bootstraps empty databases, and initializes `DatabaseStorage`.
- **`pull_changes_from_remote()`**: Executes `wait_changes_from_remote()` followed by `apply_changes_from_remote()`, performing passive checkpoints and frame reconciliation.
- **`push_changes_to_remote()`**: Streams local CDC rows in batches via `send_push_batch()`, posting to the HRANA `/batch` endpoint.
- **`checkpoint()`**: Consolidates the revert database by copying pages newer than the last checkpoint watermark and truncating the main WAL.

Configuration is handled through `DatabaseSyncEngineOpts` (lines 39-78), which specifies remote URL, encryption keys, batch sizes, and protocol versions.

### SyncOperationCtx and Low-Level Primitives

The `SyncOperationCtx` struct in [`sync/engine/src/database_sync_operations.rs`](https://github.com/tursodatabase/turso/blob/main/sync/engine/src/database_sync_operations.rs) (lines 20-28) bundles execution context including the coroutine (`coro: &Coro<Ctx>`), I/O stats collector, and remote connection details. It provides the `http` convenience method (lines 45-62) that automatically injects the `x-turso-encryption-key` header for authenticated requests.

This file implements the heavy lifting for synchronization:

- **`wal_pull_to_file_v1()`** (lines 293-376): Downloads remote WAL frames as protobuf-encoded pages.
- **`apply_changes_from_remote()`**: Re-applies remote frames via `wal_apply_from_file()` and replays local changes that occurred after the last pull.
- **`wal_push_http()`**: Streams local WAL frames to the `/push-wal` endpoint.

### I/O Abstraction Layer

The `SyncEngineIo` trait in [`sync/sdk-kit/src/sync_engine_io.rs`](https://github.com/tursodatabase/turso/blob/main/sync/sdk-kit/src/sync_engine_io.rs) decouples the engine from direct network access. The concrete implementation `SyncEngineIoQueue<TBytes>` queues requests (`SyncEngineIoRequest`) and completions (`SyncEngineIoCompletion<TBytes>`), allowing language bindings to handle actual HTTP and filesystem operations.

This design makes the engine testable via stubbed queues and language-agnostic—JavaScript bindings in [`bindings/javascript/sync/src/lib.rs`](https://github.com/tursodatabase/turso/blob/main/bindings/javascript/sync/src/lib.rs) pop items from this queue and forward them to the host environment's native networking stacks.

### Types and Metadata Management

[`sync/engine/src/types.rs`](https://github.com/tursodatabase/turso/blob/main/sync/engine/src/types.rs) defines the data structures flowing between layers:

- **`DatabaseMetadata`**: JSON-serialized state stored in `<db>-info` files, tracking client UID, last synced revision, and configuration.
- **`DatabasePullRevision`**: Supports `Legacy` (generation + frame number) and `V1` (revision string) protocol variants.
- **`SyncEngineStats`**: Atomic counters for CDC operations, WAL sizes, and network bytes.

## High-Level Sync Control Flow

The synchronization process follows a deterministic five-stage pipeline:

1. **Database Creation**: `DatabaseSyncEngine::create_db()` opens or creates the local database, initializes CDC tracking via `main_tape`, and loads persisted metadata from the JSON metadata file.

2. **Bootstrap**: For empty databases, `bootstrap_db()` builds a `SyncOperationCtx` and calls `db_bootstrap_http()` → `wal_pull_to_file()` to download the complete remote state as protobuf-encoded pages.

3. **Pull Cycle**: `pull_changes_from_remote()` waits for remote changes via `wait_changes_from_remote()`, then executes `apply_changes_from_remote()` which:
   - Runs `checkpoint_passive()` to commit synced frames
   - Rolls back uncommitted local frames
   - Re-applies remote frames via `wal_apply_from_file()`
   - Replays local changes made during the sync window

4. **Push Cycle**: `push_changes_to_remote()` uses `DatabaseReplayGenerator` to translate CDC rows into parameterized SQL statements, applies optional transformations via `apply_transformation()` (lines 1198-1218), and posts batches to the remote `/batch` endpoint.

5. **Checkpoint**: The `checkpoint()` method copies newer pages from the main WAL to the revert database, ensuring the revert DB reflects a consistent rollback state, then truncates the main WAL accordingly.

## WAL Handling and Replication Protocol

The engine operates directly on SQLite WAL frames, respecting `WAL_FRAME_HEADER` and `WAL_FRAME_SIZE` boundaries defined in [`database_sync_operations.rs`](https://github.com/tursodatabase/turso/blob/main/database_sync_operations.rs).

**Remote Frame Application**: During pull operations, `wal_pull_to_file()` writes raw page bytes to temporary files before `wal_apply_from_file()` feeds them to a `DatabaseWalSession`. Frame boundaries are strictly validated through assertions on `WAL_FRAME_SIZE`.

**Local Frame Capture**: The push mechanism reads frames from `wal_session::WalSession` and streams them via HTTP POST to `/push-wal`, ensuring atomic batch delivery.

**Conflict Resolution**: The engine maintains a *revert database*—a shadow copy of the main database updated during checkpoints. If synchronization conflicts arise, the revert DB provides a consistent rollback point without corrupting the main database file.

## Practical Implementation Examples

### Initializing the Sync Engine in Rust

```rust
use turso_sync_engine::database_sync_engine::{DatabaseSyncEngine, DatabaseSyncEngineOpts};
use turso_core::IO;
use std::sync::Arc;

async fn start_sync(io: Arc<dyn IO>, db_path: &str) -> turso_sync_engine::Result<()> {
    let opts = DatabaseSyncEngineOpts {
        remote_url: Some("https://my.turso.cloud".into()),
        client_name: "my-client".into(),
        tables_ignore: vec!["sqlite_sequence".into()],
        use_transform: false,
        wal_pull_batch_size: 1000,
        long_poll_timeout: Some(std::time::Duration::from_secs(30)),
        protocol_version_hint: turso_sync_engine::types::DatabaseSyncEngineProtocolVersion::V1,
        bootstrap_if_empty: true,
        reserved_bytes: 0,
        db_opts: Default::default(),
        partial_sync_opts: None,
        remote_encryption_key: None,
        push_operations_threshold: None,
        pull_bytes_threshold: None,
    };

    let sync_io = sync_sdk_kit::SyncEngineIoQueue::<Vec<u8>>::new();
    let engine = DatabaseSyncEngine::create_db(&Coro::default(), io, sync_io, db_path, opts).await?;
    
    engine.sync(&Coro::default()).await
}

```

### Pulling Changes via JavaScript Bindings

```javascript
import { SyncEngine } from "@tursodatabase/sync";

const engine = await SyncEngine.create({
  dbPath: "/tmp/my.db",
  remoteUrl: "https://my.turso.cloud",
  clientName: "js-client",
  protocolVersion: "v1",
  bootstrapIfEmpty: true,
});

await engine.pullChanges();

```

The binding in [`bindings/javascript/sync/src/lib.rs`](https://github.com/tursodatabase/turso/blob/main/bindings/javascript/sync/src/lib.rs) forwards these calls to `DatabaseSyncEngine::pull_changes_from_remote()`.

### Implementing Custom Transformations

```rust
use turso_sync_engine::types::{DatabaseRowTransformResult, DatabaseTapeRowChange};
use turso_sync_engine::database_sync_operations::apply_transformation;

async fn filter_logs(
    ctx: &SyncOperationCtx<'_, impl SyncEngineIo, MyCtx>,
    changes: &[DatabaseTapeRowChange],
    generator: &DatabaseReplayGenerator,
) -> turso_sync_engine::Result<Vec<DatabaseRowTransformResult>> {
    let mut mutations = Vec::new();
    for change in changes {
        let info = generator.replay_info(ctx.coro, change).await?;
        mutations.push(generator.create_mutation(&info, change)?);
    }

    let completion = ctx.io.transform(mutations)?;
    let transformed = wait_all_results(ctx.coro, &completion, None).await?;

    Ok(transformed
        .into_iter()
        .enumerate()
        .map(|(idx, res)| {
            if let DatabaseRowTransformResult::Keep = res {
                if changes[idx].table_name == "logs"
                    && matches!(changes[idx].change, DatabaseTapeRowChangeType::Insert { .. })
                {
                    DatabaseRowTransformResult::Skip
                } else {
                    DatabaseRowTransformResult::Keep
                }
            } else {
                res
            }
        })
        .collect())
}

```

Enable transformations by setting `use_transform: true` in `DatabaseSyncEngineOpts`.

### Manual Checkpoint Operations

```rust
engine.checkpoint(&Coro::default()).await?;

```

This executes `checkpoint_passive()` and rebuilds the revert database from pages newer than the last checkpoint watermark (lines 876-917 in [`database_sync_engine.rs`](https://github.com/tursodatabase/turso/blob/main/database_sync_engine.rs)).

## Summary

- **Turso's sync engine** uses a three-layer architecture (orchestration, operations, I/O) to synchronize SQLite databases with Turso Cloud while maintaining strict separation of concerns.
- **Core files**: [`database_sync_engine.rs`](https://github.com/tursodatabase/turso/blob/main/database_sync_engine.rs) controls the lifecycle, [`database_sync_operations.rs`](https://github.com/tursodatabase/turso/blob/main/database_sync_operations.rs) handles WAL frames and HTTP communication, and [`sync_engine_io.rs`](https://github.com/tursodatabase/turso/blob/main/sync_engine_io.rs) abstracts I/O for multi-language support.
- **Protocol support**: The engine supports both legacy and V1 revision protocols, tracking state via JSON metadata files (`<db>-info`) and `DatabaseMetadata` structs.
- **WAL-based replication** streams protobuf-encoded frames with strict boundary validation and maintains a revert database for atomic rollback safety.
- **Language bindings** leverage the `SyncEngineIoQueue` abstraction to expose Rust core functionality to JavaScript, Python, and other environments without duplicating synchronization logic.

## Frequently Asked Questions

### How does Turso's sync engine handle conflict resolution?

The engine implements a **last-write-wins** strategy with deterministic conflict resolution. During the pull cycle in `apply_changes_from_remote()`, local uncommitted frames are rolled back before remote frames are applied. The **revert database** maintains a consistent pre-sync state via `checkpoint()` operations, allowing applications to roll back to a known good configuration if conflicts require manual intervention.

### What is the role of the revert database in Turso sync?

The revert database serves as a **shadow checkpoint** of the main database. During `checkpoint()` operations in [`database_sync_engine.rs`](https://github.com/tursodatabase/turso/blob/main/database_sync_engine.rs) (lines 876-917), pages newer than the last watermark are copied from the main WAL to the revert database. This creates a consistent snapshot that can be restored without corrupting the main database file, providing atomic rollback capabilities during synchronization failures.

### How does the I/O abstraction layer enable multi-language support?

The `SyncEngineIo` trait in [`sync/sdk-kit/src/sync_engine_io.rs`](https://github.com/tursodatabase/turso/blob/main/sync/sdk-kit/src/sync_engine_io.rs) defines a queue-based interface where requests (`SyncEngineIoRequest`) and completions (`SyncEngineIoCompletion`) are queued rather than executed directly. Language bindings in [`bindings/javascript/sync/src/lib.rs`](https://github.com/tursodatabase/turso/blob/main/bindings/javascript/sync/src/lib.rs) pop these queue items and execute them using the host environment's native networking and filesystem APIs, allowing the Rust core to remain platform-agnostic while supporting JavaScript, Python, and other languages.

### What protocol does Turso use to synchronize WAL frames?

Turso uses a **protobuf-based protocol** over HTTP/HTTPS. The engine downloads remote frames via endpoints like `/pull-updates` and uploads local changes via `/push-wal` and the HRANA `/batch` endpoint. Frame data is encoded as `PageData` protobuf messages in `wal_pull_to_file_v1()` (lines 293-376), with the protocol version (`Legacy` or `V1`) determined by `DatabaseSyncEngineOpts`.