Zvec Memory Footprint and Resource Management Strategy: A Complete Guide

Zvec controls memory usage through a three-tier strategy: global cgroup-aware soft limits (defaulting to 80% of available RAM), per-component quotas for index builders, and memory-mapped I/O with optional page locking, all implemented in the C++ core and exposed via Python's zvec.init().

Zvec is designed to run in-process while keeping memory consumption predictable. This matters most when deploying high-throughput vector search in containerized environments, where cgroup limits apply and exceeding them can get the process killed. The system combines system-level caps, component-level quotas, and zero-copy I/O to keep RAM usage bounded.

Global Memory Limits and Cgroup Awareness

Zvec derives its memory ceiling from the host environment rather than assuming unlimited RAM.

Soft Cap Calculation in GlobalConfig

At startup, GlobalConfig::Initialize() queries the cgroup memory limit via CgroupUtil::getMemoryLimit() and sets a soft cap defaulting to approximately 80% of that value. All internal components query GlobalConfig::memory_limit_bytes() to stay within this bound.

  • Source: src/db/common/config.cc (lines 33-36)
  • Source: src/db/common/cgroup_util.cc
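The soft-cap arithmetic described above can be sketched in a few lines of Python. This is an illustration only, not the code in config.cc: the helper name soft_cap_bytes and the exact 0.8 ratio are assumptions based on the "approximately 80%" figure.

```python
def soft_cap_bytes(limit_bytes: int, ratio: float = 0.8) -> int:
    """Return a soft memory cap as a fraction of the detected limit.

    Hypothetical helper: the 0.8 default mirrors the "approximately 80%"
    figure from the text, not a constant read from the zvec source.
    """
    if limit_bytes <= 0:
        raise ValueError("limit_bytes must be positive")
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    return int(limit_bytes * ratio)


# A container limited to 4 GiB would get a soft cap of roughly 3.2 GiB.
cap = soft_cap_bytes(4 * 1024**3)
print(f"{cap / 1024**3:.1f} GiB")
```

Leaving about a fifth of the limit unused gives headroom for allocations the soft cap does not track, such as thread stacks or the Python interpreter itself.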

Cgroup Detection Implementation

The CgroupUtil class reads /sys/fs/cgroup/memory/memory.limit_in_bytes (or the cgroup v2 equivalent, typically /sys/fs/cgroup/memory.max) to detect Docker or Kubernetes constraints. Respecting this limit keeps zvec's allocations inside the container's budget, which lowers the risk of the kernel OOM killer terminating the process.
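To make the detection step concrete, here is a minimal Python sketch of reading a cgroup memory limit. It is not a port of CgroupUtil: the function names are hypothetical, and the lookup order (v2 first, then v1) is an assumption. The two file paths are the standard Linux locations.

```python
from pathlib import Path
from typing import Optional

# Standard Linux cgroup memory-limit files (v2 first, then v1).
CGROUP_V2_LIMIT = Path("/sys/fs/cgroup/memory.max")
CGROUP_V1_LIMIT = Path("/sys/fs/cgroup/memory/memory.limit_in_bytes")


def parse_limit(text: str) -> Optional[int]:
    """Parse the contents of a cgroup limit file; None means unlimited."""
    value = text.strip()
    if value == "max":  # cgroup v2 spelling of "no limit"
        return None
    return int(value)


def detect_memory_limit() -> Optional[int]:
    """Return the cgroup memory limit in bytes, or None if none is found."""
    for path in (CGROUP_V2_LIMIT, CGROUP_V1_LIMIT):
        try:
            return parse_limit(path.read_text())
        except (OSError, ValueError):
            continue  # file missing or unreadable: try the next candidate
    return None
```

On cgroup v1 an unlimited container reports a very large number rather than "max", so a real implementation would likely also compare the result against physical RAM before using it.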

Per-Component Memory Quotas

Beyond global limits, individual algorithms enforce their own ceilings to prevent runaway allocation during index construction.

HNSW Builder Memory Constraints

The HNSW builder accepts a memory_quota parameter (key: proxima.hnsw.builder.memory_quota). During construction, it continuously checks projected usage via node_size() * docs + neighbors_size_ * docs. If the quota would be exceeded, the builder aborts early with an error.
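The quota check can be illustrated with a small Python sketch. Only the formula node_size() * docs + neighbors_size_ * docs comes from the description above; the function names, the exception type, and the example sizes are hypothetical.

```python
class QuotaExceededError(RuntimeError):
    """Raised when a projected allocation would exceed the quota."""


def projected_bytes(node_size: int, neighbors_size: int, docs: int) -> int:
    """Projected usage: node_size * docs + neighbors_size * docs."""
    return node_size * docs + neighbors_size * docs


def check_quota(node_size: int, neighbors_size: int, docs: int,
                memory_quota: int) -> int:
    """Return the projected usage, or raise if it exceeds memory_quota."""
    needed = projected_bytes(node_size, neighbors_size, docs)
    if needed > memory_quota:
        raise QuotaExceededError(
            f"projected {needed} bytes exceeds quota of {memory_quota} bytes"
        )
    return needed


# Illustrative sizes: 512-byte nodes, 128 bytes of neighbor links,
# one million documents, checked against a 1 GiB quota.
print(check_quota(512, 128, 1_000_000, 1024**3))  # 640000000
```

Checking the projection before allocating lets the build fail with a clear error instead of running into the global limit partway through.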

IVF and Sparse Index Quotas

IVF and sparse index builders follow the same pattern, exposing a memory_quota parameter in their respective entity headers (src/core/algorithm/ivf/ivf_builder_entity.h). This allows fine-grained control over RAM usage during the computationally intensive clustering phase.

Zero-Copy I/O with Memory Mapping

Zvec minimizes its live heap by leveraging the operating system's virtual memory manager.

MMapFileStorage Configuration

Vector data, meta-segments, and forward stores can be mapped directly into the process address space via MMapFileStorage. This avoids extra heap buffers for large read-only datasets. The implementation parses configuration parameters to control mapping behavior.

  • Source: src/core/utility/mmap_file_storage.cc

Warmup and Page Locking Options

Two optional parameters control mmap behavior:

  • memory_warmup: When true, the mapping uses MAP_POPULATE to pre-fault pages into RAM, eliminating latency spikes on first access.
  • memory_locked: When true, pages are locked with mlock to prevent swapping.

These are defined in src/core/utility/utility_params.h and parsed in mmap_file_storage.cc (lines 60-64).
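The effect of memory_warmup can be approximated with Python's standard mmap module. This is a sketch of the general technique, not of MMapFileStorage: the helper name map_readonly is hypothetical, and MAP_POPULATE is only available on Linux with Python 3.10 or later, so the code falls back to a plain mapping elsewhere. Page locking is not shown, because as far as I know the standard mmap module offers no mlock wrapper.

```python
import mmap
import os


def map_readonly(path: str, warmup: bool = False) -> mmap.mmap:
    """Map a non-empty file read-only; optionally pre-fault its pages."""
    flags = mmap.MAP_SHARED
    if warmup:
        # MAP_POPULATE asks the kernel to fault pages in at map time.
        # Where the constant is missing, this degrades to a lazy mapping.
        flags |= getattr(mmap, "MAP_POPULATE", 0)
    fd = os.open(path, os.O_RDONLY)
    try:
        # Length 0 maps the whole file.
        return mmap.mmap(fd, 0, flags=flags, prot=mmap.PROT_READ)
    finally:
        os.close(fd)  # the mapping keeps its own duplicate of the descriptor
```

Pre-faulting trades a slower open for steadier first-query latency, which is usually worthwhile only when the mapped data fits comfortably in RAM.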

Thread Pool and Document Accounting

Resource management extends beyond RAM to execution contexts and per-document overhead.

GlobalResource Thread Management

A singleton GlobalResource lazily creates two thread pools, one for query execution and one for background index optimization. Both are sized according to the cgroup CPU limit detected at startup, which prevents oversubscription in containerized environments.
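One way to turn a cgroup CPU limit into a worker count is sketched below in Python. The text above only says the pools are sized from the cgroup CPU limit, so the rounding and clamping rules here are assumptions, and cpus_from_cpu_max is a hypothetical name. The "<quota> <period>" format is the standard content of the cgroup v2 cpu.max file.

```python
import math


def cpus_from_cpu_max(text: str, host_cpus: int) -> int:
    """Derive a worker count from the contents of a cgroup v2 cpu.max file.

    The file holds "<quota> <period>" in microseconds, or "max <period>"
    when no CPU limit is set.
    """
    quota, period = text.split()
    if quota == "max":
        return host_cpus  # no limit: use every CPU the host reports
    # Assumed policy: round fractional CPUs up, then clamp to [1, host_cpus].
    limit = math.ceil(int(quota) / int(period))
    return max(1, min(limit, host_cpus))


# A container limited to two CPUs on an 8-core host.
print(cpus_from_cpu_max("200000 100000", host_cpus=8))  # 2
```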

Per-Document Memory Tracking

Every Doc object implements memory_usage(), which walks the variant-based field map and sums the capacity of every stored string, vector, and sub-structure. Because it counts capacity rather than length, the result reflects allocated memory rather than just payload size. The buffer-store and memory-forward store use this per-document estimate to enforce quotas.

  • Source: src/db/index/common/doc.cc (lines 12-22)
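The field-walking idea can be shown with a simplified Python sketch. It is not the logic in doc.cc: the field types, the 4-byte float32 assumption, and the choice to count field names are all illustrative, and Python's len() measures length where the C++ code is described as summing capacity.

```python
from collections.abc import Mapping

FLOAT_BYTES = 4  # assumption: vector elements are stored as float32


def value_bytes(value) -> int:
    """Estimate the payload size of a single field value."""
    if isinstance(value, str):
        return len(value.encode("utf-8"))
    if isinstance(value, bytes):
        return len(value)
    if isinstance(value, Mapping):
        return memory_usage(value)  # recurse into nested structures
    return FLOAT_BYTES * len(value)  # treat anything else as a float vector


def memory_usage(fields: Mapping) -> int:
    """Sum the estimated size of every field name and value."""
    return sum(
        len(name.encode("utf-8")) + value_bytes(value)
        for name, value in fields.items()
    )


doc = {
    "title": "hello",
    "embedding": [0.1, 0.2, 0.3],
    "meta": {"lang": "en"},
}
print(memory_usage(doc))  # 41
```

A store could add this figure to a running total on every insert and flush or reject once the total crosses its quota.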

Configuring Zvec Memory in Python

The Python API exposes all native resource controls through the zvec.init() function.

import zvec

# Initialize with explicit memory constraints
zvec.init(
    path="./my_collection",
    memory_limit_mb=2048,          # Soft cap: 2 GiB
    query_thread_count=4,          # Override cgroup detection
    enable_mmap=True,              # Use memory-mapped files
    mmap_params={
        "memory_warmup": True,     # Pre-fault pages
        "memory_locked": False,    # Do not mlock
    },
)

The memory_limit_mb parameter maps to GlobalConfig::memory_limit_bytes(), while enable_mmap and mmap_params configure the MMapFileStorage behavior described earlier.

Summary

  • Global soft limits: Zvec defaults to 80% of cgroup memory, queryable via GlobalConfig::memory_limit_bytes().
  • Component quotas: Index builders enforce hard memory_quota caps to prevent runaway allocation during training.
  • Zero-copy I/O: Memory-mapped storage with optional memory_warmup and memory_locked minimizes heap usage and latency.
  • Thread control: GlobalResource sizes thread pools based on cgroup CPU limits to prevent oversubscription.
  • Fine-grained accounting: Doc::memory_usage() provides accurate per-document memory estimates for store management.

Frequently Asked Questions

How does zvec handle memory limits in Docker containers?

Zvec reads the cgroup memory limit from /sys/fs/cgroup/memory/memory.limit_in_bytes (or the cgroup v2 equivalent) via CgroupUtil::getMemoryLimit() and sets a soft cap at approximately 80% of that value. Staying under the container's limit helps keep the process from being terminated by the OOM killer when running in Docker or Kubernetes environments.

What happens if an index builder exceeds its memory quota?

If an HNSW, IVF, or sparse index builder projects that its allocation will exceed the configured memory_quota parameter, it aborts early and returns an error. The builder continuously checks usage via formulas like node_size() * docs + neighbors_size_ * docs during construction.

Can I lock vector data in RAM to prevent swapping?

Yes. When configuring memory-mapped storage via zvec.init(), set mmap_params={"memory_locked": True}. This invokes mlock on the mapped pages via MMapFileStorage, keeping the data resident in physical memory and preventing swap-out.

How does zvec calculate memory usage for individual documents?

Each Doc object implements memory_usage(), which traverses the variant-based field map and sums the capacity of every stored string, vector, and nested structure. This per-document accounting is used by the buffer-store and memory-forward store to enforce segment-level quotas and prevent unbounded growth.
