How zvec's Write-Ahead Log (WAL) Ensures Data Durability

zvec makes writes durable by appending every mutation to a crash-recoverable, CRC-protected WAL before applying the change to in-memory structures, so data that has reached the log can be recovered after a system crash.

The Alibaba zvec vector database engine relies on a robust write-ahead log (WAL) mechanism to provide strong durability guarantees for every document mutation. Before any INSERT, UPDATE, UPSERT, or DELETE operation becomes visible to readers, zvec serializes the change to an append-only WAL file protected by CRC32C checksums. This design ensures that even if the system crashes immediately after a write, the committed data can be fully recovered during the next startup.

Core WAL Architecture and File Structure

WAL File Creation and Path Management

Every segment in zvec maintains its own dedicated WAL file. When a segment initializes its write-ahead log, SegmentImpl::open_wal_file() invokes WalFile::Create to instantiate a LocalWalFile. The file path is constructed via FileHelper::MakeWalPath, generating a file named *.wal in the segment's directory.

Because the WAL is append-only, existing records are never overwritten in place; new records are only added at the end. The result is a linear history of mutations that can be replayed sequentially.

Record Format and CRC Protection

Each mutation is wrapped in a WalRecord structure before hitting the disk. In LocalWalFile::append, the system constructs a record containing three fields: length (payload size), CRC32C checksum, and payload (the serialized document bytes).

The CRC is computed using ailego::Crc32c::Hash over the payload data. This checksum serves as a corruption detector during recovery: if bit rot or a partial write occurs, CRC validation fails and the recovery process stops before applying the damaged record, so corrupted data never reaches the segment.
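The record layout described above can be sketched in a few lines. This is a minimal illustration, not zvec's code: the 4-byte little-endian field widths, the helper names put_u32 and encode_record, and the bitwise crc32c routine (a stand-in for ailego::Crc32c::Hash, whose signature is not shown in this article) are all assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78.
// Stand-in for ailego::Crc32c::Hash; slow but easy to read.
uint32_t crc32c(const std::string &data) {
    uint32_t crc = 0xFFFFFFFFu;
    for (unsigned char byte : data) {
        crc ^= byte;
        for (int bit = 0; bit < 8; ++bit) {
            // Shift right; XOR in the polynomial when the low bit was set.
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

// Appends a 32-bit value in little-endian byte order (assumed encoding).
void put_u32(std::string &out, uint32_t value) {
    for (int i = 0; i < 4; ++i) {
        out.push_back(static_cast<char>((value >> (8 * i)) & 0xFFu));
    }
}

// Hypothetical record layout: [length:4][crc32c:4][payload:length].
std::string encode_record(const std::string &payload) {
    assert(payload.size() <= UINT32_MAX);
    std::string out;
    out.reserve(8 + payload.size());
    put_u32(out, static_cast<uint32_t>(payload.size()));
    put_u32(out, crc32c(payload));
    out += payload;
    return out;
}
```

Calling encode_record("doc-0") yields 13 bytes under these assumptions: an 8-byte prefix followed by the 5-byte payload. Any single flipped payload bit changes the checksum, which is what lets recovery reject a damaged record.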

Atomic Append and Thread Safety

Concurrent mutations to the same segment are serialized through a mutex guard. Inside LocalWalFile::append, the actual disk write occurs within a std::lock_guard<std::mutex> scope that protects the write_record method.

This locking ensures that multiple threads appending to the same WAL file cannot interleave their records, maintaining the sequential integrity required for deterministic crash recovery. From the perspective of other writers, each append is atomic: the whole record (length + CRC + payload) is written as one contiguous unit. A crash in the middle of a write can still leave a partial record at the tail of the file, and the CRC check rejects that record during recovery.
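The effect of the lock can be demonstrated with a small in-memory stand-in. SketchWal, run_concurrent_appends, and records_are_contiguous are hypothetical names invented for this sketch; only the std::lock_guard<std::mutex> pattern comes from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical in-memory stand-in for a WAL file.
class SketchWal {
public:
    void append(const std::string &record) {
        std::lock_guard<std::mutex> guard(mutex_);
        // Byte-by-byte on purpose: without the lock, records would interleave.
        for (char c : record) log_.push_back(c);
        log_.push_back('\n');
    }
    std::string contents() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return log_;
    }
private:
    mutable std::mutex mutex_;
    std::string log_;
};

// Starts `writers` threads; each appends `per_writer` 16-byte records made of
// a single letter that identifies the writer. Returns the resulting log.
std::string run_concurrent_appends(int writers, int per_writer) {
    assert(writers > 0 && writers <= 26);
    SketchWal wal;
    std::vector<std::thread> threads;
    for (int w = 0; w < writers; ++w) {
        threads.emplace_back([&wal, w, per_writer] {
            const std::string record(16, static_cast<char>('A' + w));
            for (int i = 0; i < per_writer; ++i) wal.append(record);
        });
    }
    for (auto &t : threads) t.join();
    return wal.contents();
}

// True when every newline-terminated record holds one repeated character,
// i.e. no bytes from another writer slipped in.
bool records_are_contiguous(const std::string &log) {
    std::size_t start = 0;
    while (start < log.size()) {
        const std::size_t end = log.find('\n', start);
        if (end == std::string::npos) return false;  // unterminated record
        for (std::size_t i = start; i < end; ++i) {
            if (log[i] != log[start]) return false;
        }
        start = end + 1;
    }
    return true;
}
```

With four writers and 200 records each, records_are_contiguous(run_concurrent_appends(4, 200)) holds because every record is written while the lock is held. The order of records across writers remains nondeterministic; only their internal contiguity is guaranteed.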

Flush Policies and OS-Level Persistence

Durability is configurable through WalOptions.max_docs_wal_flush. This parameter controls how many documents can be appended before the system forces an fsync operation via file_.flush().

When the append count reaches the configured threshold, LocalWalFile::append automatically triggers a flush, forcing the operating system to push buffered data to the storage device. Users can also explicitly call wal_file_->flush() for synchronous durability on critical writes. A flush moves buffered records to durable storage, so every write flushed before a power loss survives it; records appended since the last flush may not.
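A counter-based policy of this kind might look like the following sketch. FlushPolicySketch and flushes_after are hypothetical names, the flush itself is reduced to a counter, and treating a threshold of 0 as "never flush automatically" is an assumption rather than documented zvec behavior.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flush policy; `flush_every` plays the role of max_docs_wal_flush.
class FlushPolicySketch {
public:
    explicit FlushPolicySketch(uint32_t flush_every) : flush_every_(flush_every) {}

    // Records one append. Returns true if this append triggered a flush.
    bool on_append() {
        ++pending_;
        // Assumption: 0 disables automatic flushing.
        if (flush_every_ != 0 && pending_ >= flush_every_) {
            flush();
            return true;
        }
        return false;
    }

    // A real WAL would write and sync buffered bytes here; the sketch only counts.
    void flush() {
        ++flush_count_;
        pending_ = 0;
    }

    uint32_t flush_count() const { return flush_count_; }
    uint32_t pending() const { return pending_; }  // appends not yet flushed

private:
    uint32_t flush_every_;
    uint32_t pending_ = 0;
    uint32_t flush_count_ = 0;
};

// Number of automatic flushes triggered by `appends` consecutive appends.
uint32_t flushes_after(uint32_t flush_every, uint32_t appends) {
    FlushPolicySketch policy(flush_every);
    for (uint32_t i = 0; i < appends; ++i) policy.on_append();
    return policy.flush_count();
}
```

With a threshold of 100, 250 appends trigger two flushes and leave 50 records pending. Those pending records are the ones exposed to loss on a crash, which is the trade-off the threshold controls: a value of 1 flushes on every append, while larger values spread the cost of each flush over more writes.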

Crash Recovery and Replay Mechanism

Recovery Initialization

When a segment restarts after a crash, SegmentImpl::recover re-opens the existing WAL file via SegmentImpl::open_wal_file. If the file exists, the system prepares it for sequential reading using prepare_for_read, which validates the WalHeader (containing version information) to ensure format compatibility.
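A version check of the kind described can be sketched as follows. The single-field SketchWalHeader layout, the constant kSupportedVersion, and both helper functions are hypothetical; the article only states that WalHeader carries version information.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical header layout: a single 32-bit version field.
struct SketchWalHeader {
    uint32_t version;
};

constexpr uint32_t kSupportedVersion = 1;  // assumed value, for illustration only

// Returns true when the file starts with a complete header whose version
// matches the one this reader understands.
bool header_is_compatible(const std::string &file_bytes) {
    SketchWalHeader header;
    if (file_bytes.size() < sizeof(header)) return false;  // truncated header
    std::memcpy(&header, file_bytes.data(), sizeof(header));
    return header.version == kSupportedVersion;
}

// Builds the header bytes a writer would place at the start of the file.
std::string make_header_bytes(uint32_t version) {
    const SketchWalHeader header{version};
    return std::string(reinterpret_cast<const char *>(&header), sizeof(header));
}
```

Rejecting an unknown version before reading any record keeps a reader from misinterpreting bytes written in a layout it does not understand.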

Sequential Replay

The recovery process iterates through the WAL using wal_file_->next(), which reads each WalRecord, verifies its CRC, and returns the payload. For each valid record, the system deserializes the document using Doc::deserialize and re-applies the operation (INSERT, UPDATE, UPSERT, or DELETE) to the in-memory structures exactly as during normal processing.

If a corrupted record is encountered, indicated by a CRC mismatch or an incomplete read, the next() method returns an empty string and the replay loop breaks. All preceding valid records have already been applied, so only complete, verified mutations are recovered.
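The stop-at-first-bad-record behavior can be sketched over an in-memory byte string. The layout ([length:4][crc32c:4][payload], little-endian) and every function name here are assumptions made for illustration; crc32c is a bitwise stand-in for ailego::Crc32c::Hash, and replay stands in for the loop around wal_file_->next().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78.
uint32_t crc32c(const std::string &data) {
    uint32_t crc = 0xFFFFFFFFu;
    for (unsigned char byte : data) {
        crc ^= byte;
        for (int bit = 0; bit < 8; ++bit) {
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

void put_u32(std::string &out, uint32_t value) {
    for (int i = 0; i < 4; ++i) {
        out.push_back(static_cast<char>((value >> (8 * i)) & 0xFFu));
    }
}

uint32_t get_u32(const std::string &buf, std::size_t pos) {
    assert(pos + 4 <= buf.size());
    uint32_t value = 0;
    for (int i = 0; i < 4; ++i) {
        value |= static_cast<uint32_t>(static_cast<unsigned char>(buf[pos + i])) << (8 * i);
    }
    return value;
}

// Hypothetical record layout: [length:4][crc32c:4][payload:length].
std::string encode_record(const std::string &payload) {
    std::string out;
    put_u32(out, static_cast<uint32_t>(payload.size()));
    put_u32(out, crc32c(payload));
    return out + payload;
}

// Returns the payloads of all records up to the first torn or corrupted one.
std::vector<std::string> replay(const std::string &wal_bytes) {
    std::vector<std::string> applied;
    std::size_t pos = 0;
    while (wal_bytes.size() - pos >= 8) {
        const uint32_t length = get_u32(wal_bytes, pos);
        const uint32_t stored_crc = get_u32(wal_bytes, pos + 4);
        if (wal_bytes.size() - pos - 8 < length) break;  // incomplete tail record
        std::string payload = wal_bytes.substr(pos + 8, length);
        if (crc32c(payload) != stored_crc) break;        // checksum mismatch
        applied.push_back(std::move(payload));
        pos += 8 + static_cast<std::size_t>(length);
    }
    return applied;
}
```

Given three encoded records, truncating the final byte makes replay return the first two, and flipping a bit in the second record's payload makes it return only the first. Everything after the first bad record is ignored even if later bytes happen to be intact, which matches the break-on-failure loop described above.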

WAL Cleanup

Once a segment has been fully persisted to immutable block files (vector and scalar data flushed to disk), the WAL is no longer needed. LocalWalFile::remove deletes the *.wal file, freeing storage space while maintaining the durable state in the compacted segment files.

Code Examples

Low-Level WAL Creation and Usage

#include <string>

#include "db/index/storage/wal/wal_file.h"
#include "db/common/file_helper.h"

using namespace zvec;

void low_level_wal_demo() {
    // Build a WAL file path: <collection_dir>/<seg_id>/<block_id>.wal
    std::string wal_path = FileHelper::MakeWalPath("./data", 0, 0);
    WalFilePtr wal = WalFile::Create(wal_path);

    WalOptions opt;
    opt.create_new = true;          // fresh WAL
    opt.max_docs_wal_flush = 100;   // flush after every 100 docs
    wal->open(opt);

    // Append a few dummy payloads
    for (int i = 0; i < 5; ++i) {
        std::string payload = "doc-" + std::to_string(i);
        wal->append(payload);      // serialized by the WAL's mutex, CRC-protected
    }
    wal->flush();   // push buffered records to storage now
    wal->close();   // release the handle; the *.wal file stays on disk until removed
}

High-Level Document Insertion

#include <cstdint>
#include <vector>

#include "zvec/db/collection.h"

void insert_demo(zvec::Collection &coll) {
    // Build a document – the same object that will be serialized
    zvec::Doc doc;
    doc.set_pk("user_123");
    doc.set("age", int32_t(27));
    doc.set("vector", std::vector<float>{0.1f, 0.2f, 0.3f});

    // The Insert call will:
    //   1️⃣ Append the serialized Doc to the WAL
    //   2️⃣ Persist it in the forward store and scalar/vector indexes
    //   3️⃣ Flush according to the WAL's `max_docs_wal_flush` policy
    auto status = coll.Insert(doc);
    if (!status.ok()) {
        LOG_ERROR("Insert failed: %s", status.message().c_str());
    }
}

Simulating Crash Recovery

// 1️⃣ Write some docs
insert_demo(collection);   // WAL is appended and optionally flushed

// 2️⃣ Simulate a crash → process terminates, no explicit close
// (records already flushed are on disk; records still sitting in
//  unflushed buffers may be lost)

// 3️⃣ On next start, open the collection again:
//    – SegmentImpl::open_wal_file() re-opens the same *.wal file
//    – SegmentImpl::recover() reads the WAL and re-applies all valid records
// The collection will contain the documents whose records were completely
// written and flushed before the crash. A partially written record is never
// applied, because its CRC validation fails and `next()` returns an empty string.

Key Source Files

  • src/db/index/storage/wal/wal_file.h: Abstract WAL interface (WalFile, WalOptions)
  • src/db/index/storage/wal/wal_file.cc: Factory creating a LocalWalFile and high-level helpers
  • src/db/index/storage/wal/local_wal_file.h: Concrete on-disk WAL implementation (LocalWalFile)
  • src/db/index/storage/wal/local_wal_file.cc: Record formatting, CRC, thread-safe append, read-back, flush, delete
  • src/db/index/segment/segment.cc: Segment core; calls append_wal, opens the WAL, recovers from it
  • src/db/common/file_helper.h: Helper for constructing WAL file paths (MakeWalPath) and other storage files
  • tests/db/index/storage/wal_file_test.cc: Unit tests that verify durability under various flush policies and multithreaded writes

Summary

  • Append-only design: zvec's WAL in LocalWalFile never overwrites existing data, ensuring a linear history of mutations that can be replayed deterministically after crashes.
  • CRC32C protection: Every record carries a checksum computed via ailego::Crc32c::Hash, allowing the recovery process to detect corruption and halt replay before applying damaged data.
  • Mutex-serialized writes: Concurrent appends are protected by std::lock_guard<std::mutex> in LocalWalFile::append, preventing record interleaving so that each record lands in the file as one contiguous unit.
  • Configurable flush policies: The max_docs_wal_flush option in WalOptions controls how often file_.flush() forces OS buffers to disk, balancing durability guarantees against write throughput.
  • Automatic recovery: On segment startup, SegmentImpl::recover replays the WAL sequentially, rebuilding the state captured in the log by re-applying every valid record to in-memory structures.

Frequently Asked Questions

How does zvec prevent data loss if the system crashes during a write?

zvec prevents data loss by following a strict write-ahead protocol where every mutation is serialized to the WAL in LocalWalFile::append before modifying in-memory indexes. If a crash occurs, the recovery process in SegmentImpl::recover replays all valid WAL records on the next startup, restoring the state recorded in the log up to the last valid record. Corrupted or partial records are detected via CRC32C checksums and excluded from replay.

What is the role of CRC32C checksums in zvec's WAL?

CRC32C checksums serve as integrity guards against bit-rot and partial writes. When LocalWalFile::append writes a record, it computes a checksum using ailego::Crc32c::Hash and stores it alongside the payload length. During recovery, wal_file_->next() validates this CRC before returning the record; if validation fails, the replay loop terminates, ensuring that only verified data is applied to the segment.

How does zvec balance durability and performance in WAL flushing?

zvec exposes a tunable max_docs_wal_flush parameter in WalOptions that controls the flush frequency. After every N documents appended (where N equals max_docs_wal_flush), LocalWalFile::append automatically invokes file_.flush(), forcing the OS to persist buffered data to the storage device. Users requiring synchronous durability can set this to 1 for immediate flushing, while high-throughput scenarios can increase the threshold to amortize fsync costs.

Can multiple threads write to the same WAL concurrently?

Yes, multiple threads can safely append to the same WAL file concurrently. LocalWalFile::append protects the critical section with a std::lock_guard<std::mutex>, ensuring that record writes are serialized and that the length-prefix + CRC + payload structure remains contiguous and uncorrupted by concurrent access. This mutex-based serialization guarantees that the WAL remains append-only and thread-safe without requiring external synchronization from callers.
