How Turso’s Page Cache and Pager System Optimize Memory Usage
Turso optimizes memory usage through a bounded PageCache implementing the SIEVE eviction algorithm, coupled with a Pager that coordinates dirty-page tracking, asynchronous WAL spills, and buffer pooling to keep the working set small and predictable.
Turso, the open-source edge database maintained at tursodatabase/turso, manages SQLite-compatible database pages through a specialized two-layer architecture. The PageCache serves as a fixed-capacity, self-evicting in-memory store, while the Pager acts as the coordination layer between the cache, disk I/O, and the Write-Ahead Log (WAL). This design ensures that only frequently accessed data occupies RAM, while stale pages are evicted and dirty data is spilled to persistent storage before memory pressure becomes critical.
The PageCache: Bounded Storage with SIEVE Eviction
The PageCache in core/storage/page_cache.rs provides a bounded, thread-safe storage layer that automatically manages its own size. Unlike unbounded caches that risk uncontrolled memory growth, Turso’s implementation uses the SIEVE (clock-hand) eviction algorithm to maintain a strict page-count limit while maximizing hit rates for hot data.
Fixed Capacity and Configuration
The cache initializes with a hard page limit defined by DEFAULT_PAGE_CACHE_SIZE_IN_PAGES. In core/storage/page_cache.rs (lines 14-18), the default constructor creates a cache with this fixed capacity:
pub fn default() -> Self {
PageCache::new(DEFAULT_PAGE_CACHE_SIZE_IN_PAGES)
}
For WebAssembly builds, this defaults to a smaller value appropriate for browser environments, while native builds use a larger default (typically a few thousand pages, or roughly 8 MiB assuming 4 KiB pages). This bounded approach ensures the database never consumes more memory than explicitly configured, regardless of workload size.
SIEVE Clock-Hand Eviction Algorithm
Turso implements the SIEVE algorithm rather than traditional LRU to achieve better concurrency characteristics and simpler eviction logic. Each cache entry maintains a 3-bit ref_bit used by the clock hand.
The eviction logic resides in evict_one (lines 1028-1060 in core/storage/page_cache.rs). The algorithm maintains a circular intrusive linked list (intrusive_collections::LinkedList) where a "clock hand" sweeps continuously. When the hand visits an entry, it decrements the ref_bit. Only entries whose ref_bit equals CLEAR and are marked evictable are removed. On insertion, new entries are placed after the clock hand, making them Most-Recently-Used (MRU), which protects them from immediate eviction.
Evictable Page Tracking
To avoid scanning the entire map when checking if eviction is possible, the cache maintains an evictable_count counter. This lightweight atomic counter updates during critical state transitions:
- Insertion: Incremented in
_insert(lines 274-277) - Deletion: Decrement in
_delete(lines 330-334) - Dirty transitions: Modified in
notify_page_dirtyandnotify_page_spilled(lines 510-525)
When evictable_count reaches zero, the cache knows it cannot free space without spilling dirty pages first, triggering coordination with the Pager.
Spill Threshold Management
The cache supports a configurable spill threshold (DEFAULT_SPILL_THRESHOLD_PERCENT). When the ratio of dirty pages to total capacity exceeds this percentage, needs_spill() (lines 2124-2145) returns true, signaling the Pager to flush data. The method collect_spillable_pages (lines 4445-4470) then gathers dirty pages for asynchronous writing to the WAL, allowing the cache to reclaim memory without blocking foreground operations.
The Pager: Orchestrating Cache, I/O, and Persistence
The Pager in core/storage/pager.rs serves as the primary interface between the database engine and storage. It manages the lifecycle of Page objects, coordinates disk I/O, and ensures that dirty data eventually reaches the WAL while optimizing for concurrent access.
Non-Blocking Page Retrieval and Read Deduplication
When the engine requests a page, Pager::read_page_nonblock first queries the PageCache. On a cache miss, rather than immediately blocking, the pager checks pending_reads (line 636), a RwLock-protected map that tracks in-flight asynchronous read operations.
As noted in the source comments (lines 635-640), this mechanism allows multiple statements requesting the same uncached page to share a single I/O operation. The first caller initiates the read; subsequent callers await the same future, eliminating redundant disk seeks under concurrent load.
Dirty-Page Tracking via Bitmap
Durability requires precise tracking of modified pages. When application code calls Page::set_dirty (lines 998-1003 in pager.rs), the pager increments a RoaringBitmap named dirty_pages (line 1444). This compressed bitmap stores page numbers in sorted order, matching SQLite’s WAL expectations and enabling efficient sequential writes during commit operations.
Asynchronous Spill Coordination
The Pager owns a SpillState (line 1269) that runs an asynchronous state machine for cache flushing. When PageCache::needs_spill returns true, the pager schedules a spill operation:
- Collect candidate pages via
collect_spillable_pages - Write them to the WAL asynchronously
- Mark each spilled page via
Page::set_spilled, transitioning them to evictable status
This pipeline ensures that dirty data is persisted while freeing cache capacity for new pages, all without blocking SQL execution threads.
Buffer Pool Reuse
To minimize heap allocations during I/O operations, the Pager maintains a BufferPool (line 632 in pager.rs). Temporary buffers for reading large B-tree cells or other transient data are sourced from this pool rather than allocated per-operation. The pool reuses fixed-size buffers, keeping garbage collection pressure low during high-throughput workloads.
Page Pinning for Cursor Stability
Cursor operations require stable page references even under cache pressure. The Pager implements pinning to prevent eviction of actively used pages. When a cursor accesses a page, it calls Page::pin (lines 666-669), incrementing a reference counter. The evictable function (line 2020) checks this counter before allowing eviction. Only when unpin is called does the page become eligible for the SIEVE algorithm again, guaranteeing that long-running scans do not encounter unexpected page faults.
The Complete Memory Optimization Flow
The interaction between PageCache and Pager follows a strict lifecycle to maintain memory bounds:
-
Access: The Pager requests a page via
PageCache::get. On hit, the cache bumps the reference bit (entry.bump_refon line 889) and returns aPageRef. On miss, the Pager initiates an async read tracked inpending_reads. -
Eviction: When insertion requires space,
make_room_fortriggersevict_one. The SIEVE clock hand sweeps, giving second chances to recently-accessed pages before evicting clean or spilled entries. -
Dirty Page Handling: Modified pages call
set_dirty, which clears the spilled flag and notifies the cache vianotify_page_dirty, decrementingevictable_count. These pages cannot be evicted until persisted. -
Spilling: When dirty pages exceed the spill threshold, the Pager’s
CacheFlushStatewrites them to the WAL, then marks them spilled viaset_spilled. This incrementsevictable_count, making them eligible for eviction. -
Checkpoint: After sufficient WAL data accumulates, the Pager triggers a checkpoint to move data back to the main database file, freeing WAL space and allowing cache entries to be reclaimed or re-used.
Practical Implementation Examples
Creating a Pager with a Custom-Sized Cache
Configure the cache size during Pager initialization to control memory footprints for specific workloads:
use std::sync::Arc;
use turso::storage::pager::Pager;
use turso::storage::buffer_pool::BufferPool;
// 1 MiB cache → 256 pages (4 KiB each)
let cache = Arc::new(turso::storage::page_cache::PageCache::new(256));
let pager = Pager {
db_file: Arc::new(db_storage),
wal: Some(Arc::new(wal)),
page_cache: Arc::new(turso::sync::RwLock::new(cache)),
buffer_pool: Arc::new(BufferPool::new()),
// ... remaining fields
};
Handling Cache Hits and Misses
Detect cache residency to optimize application logic:
let key = turso::storage::page_cache::PageCacheKey::new(42);
let mut cache_guard = pager.page_cache.write();
match cache_guard.get(&key) {
Ok(Some(page)) => {
// Reference bit bumped automatically in get()
process_cached_page(page);
}
_ => {
// Schedules async read, deduplicates concurrent requests
let page = pager.read_page_nonblock(42).await?;
process_fresh_page(page);
}
}
Managing Dirty Pages and Spills
Explicitly mark modifications to trigger the spill pipeline:
let page = cache_guard.get(&key)?.unwrap();
page.set_dirty(); // Marks PAGE_DIRTY
pager.notify_page_dirty(key); // Updates evictable_count
// When threshold exceeded, pager automatically:
// 1. Calls cache.collect_spillable_pages(max_pages)
// 2. Writes to WAL
// 3. Calls page.set_spilled() to mark evictable
Pinning Pages During Iteration
Prevent eviction during cursor operations:
let cursor = pager.open_cursor(root_page)?;
let page = cursor.current_page()?;
page.pin(); // Increment pin counter
// Safe to use page across await points or long computations
traverse_tree(&page)?;
page.unpin(); // Now eligible for eviction
Summary
- Bounded Cache: The
PageCacheenforces strict memory limits viaDEFAULT_PAGE_CACHE_SIZE_IN_PAGES, preventing unbounded growth. - SIEVE Eviction: Uses a clock-hand algorithm with 3-bit reference counters for efficient O(1) eviction decisions.
- Async Spilling: The Pager coordinates dirty-page writes to the WAL through
SpillState, converting dirty pages to evictable status without blocking queries. - Read Deduplication:
pending_readseliminates duplicate I/O for concurrent page requests. - Resource Reuse:
BufferPooland pinning mechanisms minimize allocations and ensure cursor stability under memory pressure.
Frequently Asked Questions
How does Turso’s page cache differ from a standard LRU implementation?
Turso uses the SIEVE algorithm rather than traditional LRU. According to the source code in core/storage/page_cache.rs (lines 1028-1060), SIEVE maintains a circular clock hand that sweeps entries, decrementing 3-bit reference counters. This approach requires less metadata per entry than LRU timestamps and handles concurrent access patterns more efficiently, while the intrusive linked list design minimizes pointer chasing during eviction.
What happens when the page cache reaches its configured capacity?
When full, the cache invokes make_room_for, which triggers evict_one. The SIEVE algorithm scans the clock hand until finding an entry with a cleared reference bit that is not dirty and not pinned. If evictable_count is zero (meaning all pages are dirty), the Pager’s needs_spill logic initiates an asynchronous flush to the WAL, converting dirty pages to spilled status so they can be evicted.
How does the Pager prevent duplicate disk I/O when multiple queries request the same uncached page?
The Pager maintains a RwLock-protected FxHashMap called pending_reads (lines 635-640 in pager.rs). When read_page_nonblock encounters a cache miss, it checks this map. If an async read is already in-flight for that page number, the function returns the existing future rather than issuing a new I/O request. This ensures that under concurrent load, the disk sees only one physical read per page.
Why must dirty pages be spilled to the WAL before they can be evicted from the cache?
Durability guarantees require that committed data survive process crashes. The PageCache cannot evict dirty pages (those modified since the last flush) because they contain uncommitted changes. The Pager must first write them to the WAL via the asynchronous spill pipeline, then call set_spilled to mark them clean. This transition updates the evictable_count (lines 510-525 in page_cache.rs), allowing the SIEVE algorithm to reclaim that memory safely.
Have a question about this repo?
These articles cover the highlights, but your codebase questions are specific. Give your agent direct access to the source. Share this with your agent to get started:
curl -s "https://instagit.com/install.md" Maintain an open-source project? Get it listed too →