How Turso's WAL (Write-Ahead Log) Implementation Works: A Deep Dive into SQLite-Compatible Storage
Turso implements a fully-featured SQLite-compatible Write-Ahead Log using a WalFile struct that coordinates durable writes, crash recovery, and checkpointing through vectored I/O, atomic read-marks, and a custom 64-bit TursoRwLock primitive.
The primary source code for Turso's WAL implementation lives in the tursodatabase/turso repository, specifically within core/storage/wal.rs. This implementation provides the durability guarantees required for SQLite-compatible databases while managing concurrent access across multiple connections through a sophisticated coordination layer.
Core Architecture and Data Structures
Turso's WAL architecture centers on the WalFile struct, which implements the Wal trait and orchestrates all write-ahead logging operations. The design separates shared state from coordination logic, enabling both single-process and future distributed configurations.
WalFileShared and State Management
The WalFileShared struct (core/storage/wal.rs:2712) maintains the in-process state including the maximum frame number, backfill progress, rolling checksums, and lock states. This structure is accessed by all connections within a process and contains the authoritative metadata for the WAL file.
Coordination Abstractions
The WalCoordination trait (core/storage/wal.rs:664) abstracts the backend responsible for inter-process or intra-process coordination. It provides methods for atomic snapshot publishing, lock acquisition, and frame caching. The default implementation, InProcessWalCoordination (core/storage/wal.rs:889), handles single-process coordination using the TursoRwLock primitive.
The TursoRwLock Primitive
Concurrency control relies on TursoRwLock (core/storage/wal.rs:742-L774), a 64-bit atomic lock that packs a 32-bit value, a reader count, and a writer bit into a single atomic word. Because the whole lock state fits in one 8-byte value, each acquisition or release is a single atomic operation on one cache line, which keeps synchronization overhead low for read-heavy workloads while still giving writers exclusive access.
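To make the packing idea concrete, here is a minimal sketch of a reader/writer lock stored in one AtomicU64. The type name PackedRwLock, the bit layout, and the method names are assumptions made for illustration; they are not taken from core/storage/wal.rs.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Hypothetical bit layout, chosen for illustration only:
//   bits 32..=63  32-bit payload value (for example a read-mark)
//   bit  31       writer flag
//   bits 0..=30   active reader count
const WRITER_BIT: u64 = 1 << 31;
const READER_MASK: u64 = WRITER_BIT - 1;
const VALUE_SHIFT: u32 = 32;

pub struct PackedRwLock {
    state: AtomicU64,
}

impl PackedRwLock {
    pub fn new(value: u32) -> Self {
        PackedRwLock { state: AtomicU64::new((value as u64) << VALUE_SHIFT) }
    }

    /// Current 32-bit payload.
    pub fn value(&self) -> u32 {
        (self.state.load(Ordering::Acquire) >> VALUE_SHIFT) as u32
    }

    /// Number of readers currently holding the lock.
    pub fn readers(&self) -> u64 {
        self.state.load(Ordering::Acquire) & READER_MASK
    }

    /// Try to take a shared lock; fails while a writer holds the lock.
    pub fn try_read(&self) -> bool {
        let mut cur = self.state.load(Ordering::Acquire);
        loop {
            if (cur & WRITER_BIT) != 0 || (cur & READER_MASK) == READER_MASK {
                return false;
            }
            match self.state.compare_exchange_weak(
                cur, cur + 1, Ordering::AcqRel, Ordering::Acquire,
            ) {
                Ok(_) => return true,
                Err(actual) => cur = actual, // state changed, re-check and retry
            }
        }
    }

    /// Try to take the exclusive lock; fails if any reader or writer is present.
    pub fn try_write(&self) -> bool {
        let mut cur = self.state.load(Ordering::Acquire);
        loop {
            if (cur & (WRITER_BIT | READER_MASK)) != 0 {
                return false;
            }
            match self.state.compare_exchange_weak(
                cur, cur | WRITER_BIT, Ordering::AcqRel, Ordering::Acquire,
            ) {
                Ok(_) => return true,
                Err(actual) => cur = actual,
            }
        }
    }

    /// Release a shared lock. The caller must hold one.
    pub fn unlock_read(&self) {
        self.state.fetch_sub(1, Ordering::Release);
    }

    /// Release the exclusive lock, optionally publishing a new payload.
    /// While the writer bit is set no other thread modifies the word,
    /// so a plain store is sufficient here.
    pub fn unlock_write(&self, new_value: Option<u32>) {
        let value = new_value.unwrap_or_else(|| self.value());
        self.state.store((value as u64) << VALUE_SHIFT, Ordering::Release);
    }
}
```

In this sketch a writer can publish a new payload at the moment it releases the lock, which mirrors how a read-mark value and its lock state can change together in one atomic step.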
Write Transaction Lifecycle
Beginning a write transaction involves acquiring exclusive locks and potentially resetting the WAL header when the log has been fully checkpointed.
Acquiring the Writer Lock
The begin_write_tx method (core/storage/wal.rs:1080-L1095) initiates the transaction by calling try_begin_write_tx on the coordination backend. This acquires the exclusive writer lock and ensures no other process is currently checkpointing or truncating the log.
Automatic WAL Restart
If the WAL has been completely back-filled (checkpoint has moved every frame to the database file), the implementation calls try_restart_log_before_write. This operation, handled by WalCoordination::try_restart_log_for_write (core/storage/wal.rs:1442-L1455), upgrades the read-mark-0 lock, rewrites the header with new salts, and clears the initialized flag. The WalAutoActions bitflags (core/storage/wal.rs:22-L33) control whether automatic checkpointing and restart occur during transaction begin.
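The restart decision described above combines a flag check with a "fully back-filled" check. The following sketch shows that shape with a hand-rolled flag type. AutoActions, its flag names and bit values, and should_restart_log are hypothetical stand-ins; the real WalAutoActions definition may differ.

```rust
use std::ops::BitOr;

/// Hypothetical flag set; names and bit values are assumptions.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct AutoActions(u8);

impl AutoActions {
    pub const NONE: AutoActions = AutoActions(0);
    pub const CHECKPOINT: AutoActions = AutoActions(0b01);
    pub const RESTART_LOG: AutoActions = AutoActions(0b10);

    /// Every automatic action switched on.
    pub fn all_enabled() -> AutoActions {
        AutoActions(0b11)
    }

    /// True when every bit of `other` is set in `self`.
    pub fn contains(self, other: AutoActions) -> bool {
        (self.0 & other.0) == other.0
    }
}

impl BitOr for AutoActions {
    type Output = AutoActions;
    fn bitor(self, rhs: AutoActions) -> AutoActions {
        AutoActions(self.0 | rhs.0)
    }
}

/// Restart the log only if the caller allows it, the WAL is non-empty,
/// and the checkpoint has already copied every frame to the database file.
pub fn should_restart_log(actions: AutoActions, max_frame: u64, nbackfills: u64) -> bool {
    actions.contains(AutoActions::RESTART_LOG) && max_frame > 0 && nbackfills == max_frame
}
```

Keeping the policy in a flag set lets a caller begin a write transaction without triggering a restart, for example while it is driving a checkpoint itself.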
Frame Preparation and Durable Writes
During an active transaction, database pages are serialized into WAL frames, checksummed, and written using vectored I/O for efficiency.
Preparing Frame Metadata
The prepare_frames method (core/storage/wal.rs:334-L354) serializes each page into a WAL frame, computes the rolling checksum using the SQLite checksum algorithm, and records per-frame metadata including page references, frame IDs, and cumulative checksums in a PreparedFrames struct.
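The rolling checksum follows the algorithm documented in the SQLite WAL file format: the input is read as pairs of 32-bit words and folded into two running sums, and each frame's checksum continues from the previous frame's result. The sketch below implements that documented algorithm on its own; the function names wal_checksum and frame_checksum are illustrative and are not the names used in Turso.

```rust
/// SQLite WAL checksum over `data`, continuing from `seed`.
/// `data.len()` must be a multiple of 8. `big_endian` selects how the
/// 32-bit words are decoded (it is determined by the WAL header magic).
pub fn wal_checksum(big_endian: bool, data: &[u8], seed: (u32, u32)) -> (u32, u32) {
    assert!(data.len() % 8 == 0, "checksum input must be a multiple of 8 bytes");
    let (mut s0, mut s1) = seed;
    for chunk in data.chunks_exact(8) {
        let a = [chunk[0], chunk[1], chunk[2], chunk[3]];
        let b = [chunk[4], chunk[5], chunk[6], chunk[7]];
        let (x0, x1) = if big_endian {
            (u32::from_be_bytes(a), u32::from_be_bytes(b))
        } else {
            (u32::from_le_bytes(a), u32::from_le_bytes(b))
        };
        // s0 += x[i] + s1; s1 += x[i+1] + s0 (with 32-bit wraparound)
        s0 = s0.wrapping_add(x0).wrapping_add(s1);
        s1 = s1.wrapping_add(x1).wrapping_add(s0);
    }
    (s0, s1)
}

/// Cumulative checksum after one frame: it covers the first 8 bytes of the
/// frame header (page number and database size, both stored big-endian)
/// followed by the page image, continuing from the previous checksum.
pub fn frame_checksum(
    big_endian: bool,
    prev: (u32, u32),
    page_no: u32,
    db_size_after_commit: u32,
    page: &[u8],
) -> (u32, u32) {
    let mut head = [0u8; 8];
    head[..4].copy_from_slice(&page_no.to_be_bytes());
    head[4..].copy_from_slice(&db_size_after_commit.to_be_bytes());
    let after_head = wal_checksum(big_endian, &head, prev);
    wal_checksum(big_endian, page, after_head)
}
```

Because the sums chain, checksumming two buffers in sequence gives the same result as checksumming their concatenation, which is what allows per-frame cumulative checksums to be recorded while frames are prepared.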
Vectored I/O and Commit
Actual disk I/O occurs in append_frames_vectored, which performs a pwritev system call to write all buffers in a single operation, followed by an optional fsync via prepare_wal_finish. After the durable write completes, commit_prepared_frames updates the in-memory metadata (max frame, checksum, transaction count) and publishes the new commit via WalCoordination::publish_commit (core/storage/wal.rs:892-L904).
// NOTE: PageRef, PageSize, FileSyncType and Pager are assumed to be in scope;
// their import paths are not shown in this walkthrough.
use turso::storage::wal::{Wal, WalAutoActions, CheckpointMode};

fn write_page(wal: &impl Wal, pager: &Pager, page: &[u8], page_no: u64, db_sz: u64) -> anyhow::Result<()> {
    // 1) start a write transaction
    wal.begin_write_tx(WalAutoActions::all_enabled())?;
    // 2) prepare a single frame
    let prepared = wal.prepare_frames(
        &[PageRef::new(page_no)], // pages to write
        PageSize::from_bytes(page.len() as u32),
        Some(db_sz as u32), // size after commit
        None,
    )?;
    // 3) write the frame to the WAL
    wal.append_frames_vectored(vec![PageRef::new(page_no)], PageSize::from_bytes(page.len() as u32))?;
    // 4) optionally sync the WAL (fsync) so the frame is durable before it is published
    wal.sync(FileSyncType::Normal)?;
    // 5) make the write visible to readers
    wal.commit_prepared_frames(&[prepared]);
    // 6) end the transaction
    wal.end_write_tx();
    // 7) checkpoint automatically if enabled
    if wal.should_checkpoint() {
        wal.checkpoint(pager, CheckpointMode::Full)?;
    }
    Ok(())
}
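The vectored write step can be illustrated with the standard library alone. This is a sketch of the general technique, not Turso's I/O layer: FrameBuf and write_frames_vectored are hypothetical names, and std::io::Write::write_vectored stands in for the pwritev call mentioned above. Header and page buffers stay separate in memory and are handed to the writer as one list of slices, with a retry loop for short writes.

```rust
use std::io::{self, IoSlice, Write};

/// A frame ready for writing: header and page image live in separate buffers.
pub struct FrameBuf {
    pub header: Vec<u8>,
    pub page: Vec<u8>,
}

/// Write all buffers in on-disk order using vectored writes.
/// Returns the total number of bytes written.
pub fn write_frames_vectored<W: Write>(out: &mut W, frames: &[FrameBuf]) -> io::Result<usize> {
    // Gather the buffers: header, page, header, page, ...
    let mut pending: Vec<&[u8]> = Vec::with_capacity(frames.len() * 2);
    for f in frames {
        pending.push(f.header.as_slice());
        pending.push(f.page.as_slice());
    }
    pending.retain(|b| !b.is_empty());

    let mut total = 0;
    let mut start = 0; // index of the first buffer not yet fully written
    while start < pending.len() {
        let slices: Vec<IoSlice> = pending[start..].iter().map(|b| IoSlice::new(b)).collect();
        let mut n = out.write_vectored(&slices)?;
        if n == 0 {
            return Err(io::Error::new(io::ErrorKind::WriteZero, "vectored write made no progress"));
        }
        total += n;
        // Skip buffers that were written completely and trim a partial one.
        while n > 0 {
            let rest: &[u8] = pending[start];
            if n >= rest.len() {
                n -= rest.len();
                start += 1;
            } else {
                pending[start] = &rest[n..];
                n = 0;
            }
        }
    }
    Ok(total)
}
```

Passing the buffers as a slice list avoids copying every page into one contiguous staging buffer before the write.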
Checkpointing and Log Truncation
Checkpointing copies committed WAL frames into the main database file, advances the backfill counter, and optionally truncates the WAL to reclaim disk space.
Determining Safe Checkpoint Boundaries
The checkpoint routine acquires the checkpoint lock via acquire_checkpoint_guard and a read-mark-0 lock to guarantee exclusive access. It calculates the highest safe frame to checkpoint using determine_max_safe_checkpoint_frame, which examines all active read-marks to ensure no reader needs the frames being moved.
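The boundary calculation amounts to taking a minimum over the read-marks that are currently held. A simplified sketch follows, assuming each slot exposes its pinned frame and an in-use flag; ReadMarkSlot and max_safe_checkpoint_frame are hypothetical names, and the real routine also has to take locks while it inspects the marks.

```rust
/// One read-mark slot: the frame it pins and whether a reader holds it.
#[derive(Clone, Copy)]
pub struct ReadMarkSlot {
    pub frame: u64,
    pub in_use: bool,
}

/// Highest frame that may be copied into the database file without
/// overwriting pages an active reader still expects to find in their
/// older state. Unused slots do not constrain the checkpoint.
pub fn max_safe_checkpoint_frame(max_frame: u64, slots: &[ReadMarkSlot]) -> u64 {
    slots
        .iter()
        .filter(|s| s.in_use)
        .map(|s| s.frame)
        .fold(max_frame, |safe, frame| safe.min(frame))
}
```

For example, with a WAL of 100 frames and readers pinned at frames 40 and 25, only frames up to 25 are safe to back-fill.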
Back-Filling and Truncation
Frames are read in batches via read_frames_batch, written to the database file, and the shared nbackfills counter is advanced through publish_backfill. When CheckpointMode::Truncate is specified (core/storage/wal.rs:1150-L1175), the implementation calls truncate_wal (core/storage/wal.rs:1441-L1450) to zero-length the file after all frames are back-filled, then reinitializes the header via prepare_wal_header.
fn truncate_wal(wal: &impl Wal, pager: &Pager) -> anyhow::Result<()> {
    // Run a TRUNCATE checkpoint: copies all frames and then zero-lengths the WAL.
    let mut result = wal.checkpoint(pager, CheckpointMode::Truncate { upper_bound_inclusive: None })?;
    // The returned `CheckpointResult` already contains the WAL-truncate state.
    // If needed, we can explicitly wait for the truncate I/O
    // (pass the result itself, not a clone, so state updates are kept):
    wal.truncate_wal(&mut result, FileSyncType::Normal)?;
    Ok(())
}
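The batched back-fill described in this section can be pictured as splitting the range of not-yet-copied frames into fixed-size chunks. The helper below is a hypothetical illustration of that bookkeeping only; backfill_batches is not a function from the repository, and the real loop also performs the reads, writes, and the publish_backfill updates.

```rust
/// Split the frames still to be back-filled, `nbackfills + 1 ..= max_safe_frame`,
/// into inclusive `(first, last)` ranges of at most `batch_size` frames.
pub fn backfill_batches(nbackfills: u64, max_safe_frame: u64, batch_size: u64) -> Vec<(u64, u64)> {
    assert!(batch_size > 0, "batch size must be positive");
    let mut batches = Vec::new();
    let mut first = nbackfills + 1;
    while first <= max_safe_frame {
        let last = std::cmp::min(first + batch_size - 1, max_safe_frame);
        batches.push((first, last));
        first = last + 1;
    }
    batches
}
```

After each batch is written to the database file, the shared back-fill counter would advance to that batch's last frame.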
Concurrency Control and Locking
Turso's WAL implements a multi-version concurrency control scheme using read-marks and hierarchical locking to allow concurrent readers during writes.
Read-Mark Acquisition
Read transactions acquire one of five available read-marks that record the frame number visible to that reader. The lock-free read-mark value is stored within a TursoRwLock. The try_begin_read_tx method selects the best available read-mark or falls back to a database-file lock for legacy compatibility.
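As a rough model of "selecting the best available read-mark", the sketch below picks the published mark closest to, but not beyond, the current end of the WAL, and reads the database file directly when every frame has already been back-filled. ReadChoice and choose_read_mark are hypothetical names, the slice stands for the non-zero read-mark slots, and the logic is deliberately simplified compared with try_begin_read_tx, which must also take locks and re-validate the header.

```rust
#[derive(Debug, PartialEq)]
pub enum ReadChoice {
    /// Every frame is already in the database file; read it directly.
    DatabaseFile,
    /// Reuse the slot at `index`, which pins `frame`.
    Slot { index: usize, frame: u64 },
    /// No published mark is usable; the caller would have to claim a slot
    /// and set it to the current max frame.
    NeedsNewMark,
}

/// `marks[i]` is `Some(frame)` if slot `i` has a published read-mark.
pub fn choose_read_mark(marks: &[Option<u64>], max_frame: u64, nbackfills: u64) -> ReadChoice {
    if max_frame == nbackfills {
        return ReadChoice::DatabaseFile;
    }
    let mut best: Option<(usize, u64)> = None;
    for (index, mark) in marks.iter().enumerate() {
        if let Some(frame) = *mark {
            // Keep the largest mark that does not exceed the WAL's max frame.
            let better = match best {
                Some((_, current)) => frame > current,
                None => true,
            };
            if frame <= max_frame && better {
                best = Some((index, frame));
            }
        }
    }
    match best {
        Some((index, frame)) => ReadChoice::Slot { index: index, frame: frame },
        None => ReadChoice::NeedsNewMark,
    }
}
```

Choosing the largest eligible mark keeps the reader's pinned frame as recent as possible, which in turn lets checkpoints back-fill more of the log.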
Writer and Checkpoint Coordination
- Write transactions hold the exclusive writer lock via try_begin_write_tx and maintain read-mark-0 to block checkpoints during header restarts.
- Checkpoint operations obtain an additional checkpoint lock (try_checkpoint_lock), ensuring only one checkpoint runs globally and preventing race conditions during database file updates.
All locking primitives are implemented in InProcessWalCoordination and wired into the Wal trait implementation.
Summary
- Turso's WAL implementation resides primarily in core/storage/wal.rs and provides SQLite-compatible durability through the WalFile struct.
- Core coordination uses the WalCoordination trait with InProcessWalCoordination as the default single-process backend, utilizing the 64-bit TursoRwLock for fast synchronization.
- Write transactions proceed through begin_write_tx, prepare_frames, append_frames_vectored, and commit_prepared_frames, with automatic WAL restart when fully checkpointed.
- Checkpointing safely copies frames to the database file using determine_max_safe_checkpoint_frame to respect active readers, supporting both Full and Truncate modes.
- Concurrency control employs five read-marks, exclusive writer locks, and a global checkpoint lock to allow parallel reads during active writes.
Frequently Asked Questions
How does Turso's WAL differ from standard SQLite WAL mode?
Turso's implementation maintains full protocol compatibility with SQLite's WAL format, allowing interchangeability of database files. However, Turso replaces SQLite's default POSIX advisory locking with the TursoRwLock primitive and the WalCoordination trait abstraction, enabling future distributed coordination while maintaining single-process performance. The core logic for frame checksums and header formats remains compatible as defined in core/storage/sqlite3_ondisk.rs.
What happens if a crash occurs during a write transaction?
If a crash occurs after append_frames_vectored but before commit_prepared_frames, the frames remain in the WAL file but are not visible to readers because the shared metadata (max frame counter) was never updated. On reopening, Turso scans the WAL during recovery, validates checksums, and only considers frames up to the last committed transaction. The WalFileShared structure maintains the authoritative commit state separate from the disk buffers.
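The recovery rule can be sketched in a few lines. In the SQLite WAL format, a commit frame is one whose header records a non-zero database size, and scanning stops at the first frame that fails validation. The types and function below are hypothetical simplifications: ScannedFrame stands in for a parsed frame header, and the salt comparison that accompanies checksum validation is folded into the checksum_ok flag.

```rust
/// Result of validating one frame during the recovery scan.
pub struct ScannedFrame {
    /// True if the frame's salts and cumulative checksum matched.
    pub checksum_ok: bool,
    /// Database size in pages from the frame header; non-zero only on
    /// the last frame of a committed transaction.
    pub db_size_after_commit: u32,
}

/// Number of leading frames that belong to committed transactions.
/// Frames after the last valid commit frame are ignored.
pub fn recover_max_frame(frames: &[ScannedFrame]) -> usize {
    let mut committed = 0;
    for (i, frame) in frames.iter().enumerate() {
        if !frame.checksum_ok {
            break; // torn or stale frame: nothing after it can be trusted
        }
        if frame.db_size_after_commit != 0 {
            committed = i + 1;
        }
    }
    committed
}
```

Frames written by a transaction that never reached its commit frame are simply left out of the recovered max frame, so later writes overwrite them.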
When should I use CheckpointMode::Truncate versus CheckpointMode::Full?
Use CheckpointMode::Full when you want to back-fill frames into the database file while preserving the WAL file for subsequent writes, minimizing filesystem metadata operations. Use CheckpointMode::Truncate (core/storage/wal.rs:1150-L1175) when disk space is constrained or before closing the database connection, as it zero-lengths the WAL file after back-filling. Truncation requires exclusive access to read-mark-0 and prevents new write transactions until completion.
How does the TursoRwLock achieve better performance than standard mutexes?
The TursoRwLock (core/storage/wal.rs:742-L774) packs all synchronization state (32-bit value, reader count, writer bit) into a single 64-bit atomic variable. A reader acquires or releases the lock with one atomic operation on that word instead of locking a mutex and then updating separate counters, so each operation touches a single cache line and the critical path stays short. Concurrent readers still update the same cache line, so contention is reduced rather than eliminated. The design favors reader scalability while maintaining writer exclusivity through atomic bit manipulation.