# How zvec's Write-Ahead Log (WAL) Ensures Data Durability

> Discover how zvec's write-ahead log (WAL) guarantees data durability. Learn how appending mutations with CRC checksums prevents data loss even after system crashes.

- Repository: [Alibaba/zvec](https://github.com/alibaba/zvec)
- Tags: internals
- Published: 2026-02-16

---

**zvec guarantees durability by appending every mutation to a crash-recoverable WAL with CRC checksums before applying changes to in-memory structures, ensuring no committed data is lost even after system crashes.**

The Alibaba zvec vector database engine relies on a robust write-ahead log (WAL) mechanism to provide strong durability guarantees for every document mutation. Before any INSERT, UPDATE, UPSERT, or DELETE operation becomes visible to readers, zvec serializes the change to an append-only WAL file protected by CRC32C checksums. This design ensures that even if the system crashes immediately after a write, the committed data can be fully recovered during the next startup.

## Core WAL Architecture and File Structure

### WAL File Creation and Path Management

Every segment in zvec maintains its own dedicated WAL file. When a segment initializes its write-ahead log, `SegmentImpl::open_wal_file()` invokes `WalFile::Create` to instantiate a `LocalWalFile`. The file path is constructed via `FileHelper::MakeWalPath`, generating a file named `*.wal` in the segment's directory.

Because records are only ever appended and existing bytes are never overwritten in place, each WAL file holds a linear history of mutations that can be replayed sequentially.

### Record Format and CRC Protection

Each mutation is wrapped in a `WalRecord` structure before hitting the disk. In `LocalWalFile::append`, the system constructs a record containing three fields: **length** (payload size), **CRC32C checksum**, and **payload** (the serialized document bytes).

The CRC is computed using `ailego::Crc32c::Hash` over the payload data. This checksum serves as a corruption detector during recovery: if bit rot or a partial write damages a record, CRC validation fails and recovery stops before applying it, so corrupt data never reaches the segment.
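To make the record layout concrete, here is a minimal sketch of a length + CRC32C + payload encoder. It is not zvec's code: the function names, the 4-byte field widths, the field order, and the host byte order are assumptions for illustration, and the bitwise `crc32c` below is a stand-in for `ailego::Crc32c::Hash`, whose exact signature is not shown in this article.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>
#include <string>

// Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78.
// Stand-in for ailego::Crc32c::Hash; slow but easy to read.
inline uint32_t crc32c(const std::string &data) {
    uint32_t crc = 0xFFFFFFFFu;
    for (unsigned char byte : data) {
        crc ^= byte;
        for (int bit = 0; bit < 8; ++bit) {
            // Shift right; XOR in the polynomial when the dropped bit was 1.
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

// Hypothetical on-disk layout: [uint32 length][uint32 crc][payload bytes].
// Fields are written in host byte order to keep the sketch short.
inline std::string encode_record(const std::string &payload) {
    assert(payload.size() <= std::numeric_limits<uint32_t>::max());
    const uint32_t length = static_cast<uint32_t>(payload.size());
    const uint32_t crc = crc32c(payload);

    std::string out(sizeof(length) + sizeof(crc), '\0');
    std::memcpy(&out[0], &length, sizeof(length));
    std::memcpy(&out[sizeof(length)], &crc, sizeof(crc));
    out += payload;
    return out;
}
```

Calling `encode_record("doc-0")` yields 13 bytes under these assumptions: an 8-byte prefix followed by the 5-byte payload. A production implementation would typically use a table-driven or hardware-accelerated CRC32C instead of the bit loop.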

## Atomic Append and Thread Safety

Concurrent mutations to the same segment are serialized through a mutex guard. Inside `LocalWalFile::append`, the actual disk write occurs within a `std::lock_guard<std::mutex>` scope that protects the `write_record` method.

This locking ensures that multiple threads appending to the same WAL file cannot interleave their records, maintaining the sequential integrity required for deterministic crash recovery. From the perspective of other writers, each append is atomic: the whole record (length + CRC + payload) is written as one contiguous unit. The mutex does not protect against a crash in the middle of a write; a torn record of that kind is caught by the CRC check during recovery.
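The locking pattern can be sketched as follows. This is an illustration only: `SketchLog` and `concurrent_append` are hypothetical names, the log lives in memory rather than on disk, and the CRC field is omitted so that the example stays focused on the `std::lock_guard` scope.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical in-memory log; only the locking pattern mirrors the article.
class SketchLog {
public:
    void append(const std::string &payload) {
        assert(payload.size() <= UINT32_MAX);
        const uint32_t length = static_cast<uint32_t>(payload.size());
        std::lock_guard<std::mutex> guard(mutex_);
        // Both writes happen under one lock, so no other thread's bytes
        // can land between the length prefix and the payload.
        buffer_.append(reinterpret_cast<const char *>(&length), sizeof(length));
        buffer_.append(payload);
    }

    // Walks the buffer and returns the payloads in the order they were written.
    std::vector<std::string> records() const {
        std::lock_guard<std::mutex> guard(mutex_);
        std::vector<std::string> out;
        std::size_t pos = 0;
        while (pos + sizeof(uint32_t) <= buffer_.size()) {
            uint32_t length = 0;
            std::memcpy(&length, buffer_.data() + pos, sizeof(length));
            pos += sizeof(length);
            if (pos + length > buffer_.size()) break;  // truncated tail
            out.emplace_back(buffer_, pos, length);
            pos += length;
        }
        return out;
    }

private:
    mutable std::mutex mutex_;
    std::string buffer_;
};

// Spawns `threads` writers that each append `per_thread` records.
inline void concurrent_append(SketchLog &log, int threads, int per_thread) {
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&log, t, per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                log.append("t" + std::to_string(t) + "-" + std::to_string(i));
            }
        });
    }
    for (auto &worker : workers) worker.join();
}
```

After `concurrent_append(log, 4, 250)`, `log.records()` should parse back exactly 1000 well-formed payloads. The order across threads is not deterministic, but no record is split by another thread's bytes.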

## Flush Policies and OS-Level Persistence

Durability is configurable through `WalOptions.max_docs_wal_flush`. This parameter controls how many documents can be appended before the system forces an `fsync` operation via `file_.flush()`.

When the append count reaches the configured threshold, `LocalWalFile::append` automatically triggers a flush, forcing the operating system to push buffered data to the storage device. Users can also explicitly call `wal_file_->flush()` for synchronous durability on critical writes. Once a flush completes, the records written so far are on stable storage and survive a power loss; records appended since the last flush may still sit in buffers and can be lost.
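The counting behaviour behind `max_docs_wal_flush` can be sketched as a small counter. `FlushPolicySketch` and its members are hypothetical names, and the sketch only counts flushes where a real implementation would sync the file; how zvec tracks the count internally is not shown in this article.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical counter mirroring the role of WalOptions.max_docs_wal_flush.
class FlushPolicySketch {
public:
    explicit FlushPolicySketch(std::size_t max_docs) : max_docs_(max_docs) {
        assert(max_docs_ > 0);  // a zero threshold is not modelled here
    }

    // Call once per appended document. Returns true when this append
    // reached the threshold and triggered an automatic flush.
    bool on_append() {
        if (++pending_ < max_docs_) return false;
        flush();
        return true;
    }

    // Explicit flush; a real WAL would sync buffered bytes to disk here.
    void flush() {
        pending_ = 0;
        ++flushes_;
    }

    std::size_t flushes() const { return flushes_; }
    std::size_t pending() const { return pending_; }

private:
    std::size_t max_docs_;
    std::size_t pending_ = 0;  // documents appended since the last flush
    std::size_t flushes_ = 0;  // number of flushes performed so far
};
```

With a threshold of 3, seven appends trigger two automatic flushes and leave one document pending. That pending document is the window a crash could lose unless the caller flushes explicitly.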

## Crash Recovery and Replay Mechanism

### Recovery Initialization

When a segment restarts after a crash, `SegmentImpl::recover` re-opens the existing WAL file via `SegmentImpl::open_wal_file`. If the file exists, the system prepares it for sequential reading using `prepare_for_read`, which validates the `WalHeader` (containing version information) to ensure format compatibility.

### Sequential Replay

The recovery process iterates through the WAL using `wal_file_->next()`, which reads each `WalRecord`, verifies its CRC, and returns the payload. For each valid record, the system deserializes the document using `Doc::deserialize` and re-applies the operation (INSERT, UPDATE, UPSERT, or DELETE) to the in-memory structures exactly as during normal processing.

If a corrupted record is encountered, indicated by a CRC mismatch or an incomplete read, the `next()` method returns an empty string, causing the replay loop to break. All preceding valid records have already been applied, so only complete, verified mutations are recovered.
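The replay loop can be sketched over an in-memory byte string. Everything here is an assumption made for illustration: the names `checksum`, `make_record`, `next_record`, and `replay`, the `[uint32 length][uint32 crc][payload]` layout in host byte order, and the bitwise CRC32C that stands in for `ailego::Crc32c::Hash`. Only the stop-at-first-bad-record behaviour is taken from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78.
inline uint32_t checksum(const std::string &data) {
    uint32_t crc = 0xFFFFFFFFu;
    for (unsigned char byte : data) {
        crc ^= byte;
        for (int bit = 0; bit < 8; ++bit) {
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

// Hypothetical layout: [uint32 length][uint32 crc][payload].
inline std::string make_record(const std::string &payload) {
    const uint32_t length = static_cast<uint32_t>(payload.size());
    const uint32_t crc = checksum(payload);
    std::string out(reinterpret_cast<const char *>(&length), sizeof(length));
    out.append(reinterpret_cast<const char *>(&crc), sizeof(crc));
    out += payload;
    return out;
}

// Reads one record at `pos`. Returns "" on truncation or CRC mismatch,
// mirroring the empty-string convention described for next().
inline std::string next_record(const std::string &log, std::size_t &pos) {
    assert(pos <= log.size());
    constexpr std::size_t kHeader = 2 * sizeof(uint32_t);
    if (log.size() - pos < kHeader) return "";            // torn header
    uint32_t length = 0, crc = 0;
    std::memcpy(&length, log.data() + pos, sizeof(length));
    std::memcpy(&crc, log.data() + pos + sizeof(length), sizeof(crc));
    if (log.size() - pos - kHeader < length) return "";   // torn payload
    std::string payload(log, pos + kHeader, length);
    if (checksum(payload) != crc) return "";              // corrupted
    pos += kHeader + length;
    return payload;
}

// Collects payloads until the first invalid record, then stops.
inline std::vector<std::string> replay(const std::string &log) {
    std::vector<std::string> applied;
    std::size_t pos = 0;
    for (;;) {
        std::string payload = next_record(log, pos);
        if (payload.empty()) break;
        applied.push_back(std::move(payload));
    }
    return applied;
}
```

Flipping a bit in the last of three records makes `replay` return only the first two, and a log cut off mid-record yields only the records that precede the cut. One caveat of the empty-string convention in this sketch: a legitimately empty payload is indistinguishable from end-of-log, so it assumes payloads are never empty.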

### WAL Cleanup

Once a segment has been fully persisted to immutable block files (vector and scalar data flushed to disk), the WAL is no longer needed. `LocalWalFile::remove` deletes the `*.wal` file, freeing storage space while maintaining the durable state in the compacted segment files.

## Code Examples

### Low-Level WAL Creation and Usage

```cpp
#include <string>

#include "db/index/storage/wal/wal_file.h"
#include "db/common/file_helper.h"

using namespace zvec;

void low_level_wal_demo() {
    // Build a WAL file path: <collection_dir>/<seg_id>/<block_id>.wal
    std::string wal_path = FileHelper::MakeWalPath("./data", 0, 0);
    WalFilePtr wal = WalFile::Create(wal_path);

    WalOptions opt;
    opt.create_new = true;          // start a fresh WAL
    opt.max_docs_wal_flush = 100;   // flush after every 100 docs
    wal->open(opt);

    // Append a few dummy payloads
    for (int i = 0; i < 5; ++i) {
        std::string payload = "doc-" + std::to_string(i);
        wal->append(payload);       // serialized by a mutex, CRC-protected
    }
    wal->flush();   // only 5 docs were appended, so flush explicitly
    wal->close();   // the file stays on disk until it is removed
}
```

### High-Level Document Insertion

```cpp
#include <cstdint>
#include <vector>

#include "zvec/db/collection.h"

void insert_demo(zvec::Collection &coll) {
    // Build a document; this is the object that gets serialized into the WAL
    zvec::Doc doc;
    doc.set_pk("user_123");
    doc.set("age", int32_t(27));
    doc.set("vector", std::vector<float>{0.1f, 0.2f, 0.3f});

    // The Insert call will:
    //   1️⃣ Append the serialized Doc to the WAL
    //   2️⃣ Flush the WAL according to its `max_docs_wal_flush` policy
    //   3️⃣ Apply the Doc to the forward store and scalar/vector indexes
    auto status = coll.Insert(doc);
    if (!status.ok()) {
        // LOG_ERROR is assumed to come from the project's logging header
        LOG_ERROR("Insert failed: %s", status.message().c_str());
    }
}
```

### Simulating Crash Recovery

```cpp
// 1️⃣ Write some docs
insert_demo(collection);   // WAL is appended and flushed per max_docs_wal_flush

// 2️⃣ Simulate a crash: the process terminates with no explicit close.
// Records that were already flushed are on disk; records still sitting in
// buffers may be lost.

// 3️⃣ On next start, open the collection again:
//    - SegmentImpl::recover() reopens the same *.wal file via
//      SegmentImpl::open_wal_file()
//    - it then reads the WAL and re-applies every valid record
// The collection will contain the documents whose records reached the WAL
// file intact before the crash. A partially written record is never applied,
// because its CRC validation fails and `next()` returns an empty string.
```

## Key Source Files

| File | Role | GitHub Link |
|------|------|-------------|
| `src/db/index/storage/wal/wal_file.h` | Abstract WAL interface (`WalFile`, `WalOptions`) | [wal_file.h](https://github.com/alibaba/zvec/blob/main/src/db/index/storage/wal/wal_file.h) |
| `src/db/index/storage/wal/wal_file.cc` | Factory creating a `LocalWalFile` and high-level helpers | [wal_file.cc](https://github.com/alibaba/zvec/blob/main/src/db/index/storage/wal/wal_file.cc) |
| `src/db/index/storage/wal/local_wal_file.h` | Concrete on-disk WAL implementation (`LocalWalFile`) | [local_wal_file.h](https://github.com/alibaba/zvec/blob/main/src/db/index/storage/wal/local_wal_file.h) |
| `src/db/index/storage/wal/local_wal_file.cc` | Record formatting, CRC, thread-safe append, read-back, flush, delete | [local_wal_file.cc](https://github.com/alibaba/zvec/blob/main/src/db/index/storage/wal/local_wal_file.cc) |
| `src/db/index/segment/segment.cc` | Segment core: calls `append_wal`, opens WAL, recovers from it | [segment.cc](https://github.com/alibaba/zvec/blob/main/src/db/index/segment/segment.cc) |
| `src/db/common/file_helper.h` | Helper for constructing WAL file paths (`MakeWalPath`) and other storage files | [file_helper.h](https://github.com/alibaba/zvec/blob/main/src/db/common/file_helper.h) |
| `tests/db/index/storage/wal_file_test.cc` | Unit tests that verify durability under various flush policies and multithreaded writes | [wal_file_test.cc](https://github.com/alibaba/zvec/blob/main/tests/db/index/storage/wal_file_test.cc) |

## Summary

- **Append-only design**: zvec's WAL in `LocalWalFile` never overwrites existing data, ensuring a linear history of mutations that can be replayed deterministically after crashes.
- **CRC32C protection**: Every record carries a checksum computed via `ailego::Crc32c::Hash`, allowing the recovery process to detect corruption and halt replay before applying damaged data.
- **Mutex-serialized writes**: Concurrent appends are protected by `std::lock_guard<std::mutex>` in `LocalWalFile::append`, preventing record interleaving so that each record lands in the file as one contiguous unit.
- **Configurable flush policies**: The `max_docs_wal_flush` option in `WalOptions` controls how often `file_.flush()` forces OS buffers to disk, balancing durability guarantees against write throughput.
- **Automatic recovery**: On segment startup, `SegmentImpl::recover` replays the WAL sequentially, reconstructing the state up to the last intact record by re-applying every valid record to in-memory structures.

## Frequently Asked Questions

### How does zvec prevent data loss if the system crashes during a write?

zvec prevents data loss by following a strict write-ahead protocol where every mutation is serialized to the WAL in `LocalWalFile::append` before modifying in-memory indexes. If a crash occurs, the recovery process in `SegmentImpl::recover` replays all valid WAL records on the next startup, restoring the state up to the last record that reached the WAL file intact. Corrupted or partial records are detected via CRC32C checksums and excluded from replay.

### What is the role of CRC32C checksums in zvec's WAL?

CRC32C checksums serve as integrity guards against bit-rot and partial writes. When `LocalWalFile::append` writes a record, it computes a checksum using `ailego::Crc32c::Hash` and stores it alongside the payload length. During recovery, `wal_file_->next()` validates this CRC before returning the record; if validation fails, the replay loop terminates, ensuring that only verified data is applied to the segment.

### How does zvec balance durability and performance in WAL flushing?

zvec exposes a tunable `max_docs_wal_flush` parameter in `WalOptions` that controls the flush frequency. After every N documents appended (where N equals `max_docs_wal_flush`), `LocalWalFile::append` automatically invokes `file_.flush()`, forcing the OS to persist buffered data to the storage device. Users requiring synchronous durability can set this to 1 for immediate flushing, while high-throughput scenarios can increase the threshold to amortize fsync costs.

### Can multiple threads write to the same WAL concurrently?

Yes, multiple threads can safely append to the same WAL file concurrently. `LocalWalFile::append` protects the critical section with a `std::lock_guard<std::mutex>`, ensuring that record writes are serialized and that the length-prefix + CRC + payload structure remains contiguous and uncorrupted by concurrent access. This mutex-based serialization guarantees that the WAL remains append-only and thread-safe without requiring external synchronization from callers.