Zvec Memory Footprint and Resource Management Strategy: A Complete Guide
Zvec controls memory usage through a three-tier strategy: global cgroup-aware soft limits (defaulting to 80% of available RAM), per-component quotas for index builders, and memory-mapped I/O with optional page locking, all implemented in the C++ core and exposed via Python's zvec.init().
Zvec is designed to run in-process while maintaining predictable memory consumption. Understanding its memory footprint and resource management strategy is essential for deploying high-throughput vector search in containerized environments where cgroup limits apply. The system combines system-level caps, component-level quotas, and zero-copy I/O to keep RAM usage predictable.
Global Memory Limits and Cgroup Awareness
Zvec derives its memory ceiling from the host environment rather than assuming unlimited RAM.
Soft Cap Calculation in GlobalConfig
At startup, GlobalConfig::Initialize() queries the cgroup memory limit via CgroupUtil::getMemoryLimit() and sets a soft cap defaulting to approximately 80% of that value. All internal components query GlobalConfig::memory_limit_bytes() to stay within this bound.
- Source: src/db/common/config.cc (lines 33-36)
- Source: src/db/common/cgroup_util.cc
Cgroup Detection Implementation
The CgroupUtil class reads /sys/fs/cgroup/memory/memory.limit_in_bytes (or the cgroup v2 equivalent, /sys/fs/cgroup/memory.max) to detect Docker or Kubernetes constraints. Sizing the soft cap from this value lets zvec stay inside the container boundary and reduces the risk of the process being terminated by the OOM killer.
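The detection and soft-cap steps can be sketched in Python as follows. This is a minimal illustration, not zvec's C++ implementation: the function names, the 2**60 threshold used to recognise the cgroup v1 "unlimited" sentinel, and the integer percentage argument are all assumptions made for the example.

```python
from typing import Optional

# Standard cgroup file locations; the helpers below are hypothetical
# and do not mirror zvec's CgroupUtil API.
CGROUP_V2_LIMIT = "/sys/fs/cgroup/memory.max"
CGROUP_V1_LIMIT = "/sys/fs/cgroup/memory/memory.limit_in_bytes"


def parse_memory_limit(raw: str) -> Optional[int]:
    """Parse the contents of a cgroup limit file; None means no limit."""
    value = raw.strip()
    if value == "max":  # cgroup v2 spelling of "unlimited"
        return None
    limit = int(value)
    # cgroup v1 reports a very large number when no limit is set.
    # Assumption: anything at or above 2**60 bytes counts as unlimited.
    if limit >= 1 << 60:
        return None
    return limit


def read_cgroup_memory_limit() -> Optional[int]:
    """Try the v2 file first, then v1; None if neither is readable."""
    for path in (CGROUP_V2_LIMIT, CGROUP_V1_LIMIT):
        try:
            with open(path) as f:
                return parse_memory_limit(f.read())
        except (OSError, ValueError):
            continue
    return None


def soft_cap_bytes(limit_bytes: int, percent: int = 80) -> int:
    """Soft cap as an integer percentage of the detected limit."""
    return limit_bytes * percent // 100
```

With a 2 GiB container limit, soft_cap_bytes(2 * 1024**3) yields a cap of roughly 1.6 GiB, leaving headroom for allocations the cap does not track.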
Per-Component Memory Quotas
Beyond global limits, individual algorithms enforce their own ceilings to prevent runaway allocation during index construction.
HNSW Builder Memory Constraints
The HNSW builder accepts a memory_quota parameter (key: proxima.hnsw.builder.memory_quota). During construction, it continuously checks projected usage via node_size() * docs + neighbors_size_ * docs. If the quota would be exceeded, the builder aborts early with an error.
- Source: src/core/algorithm/hnsw/hnsw_builder_entity.h (lines 94-95)
- Source: src/core/algorithm/hnsw/hnsw_builder_entity.cc (lines 67-71)
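The projected-usage check described for the HNSW builder can be illustrated with a short Python sketch. The formula comes from the text above; the function names, the exception type, and the convention that a quota of 0 disables the check are assumptions for illustration only.

```python
class QuotaExceededError(MemoryError):
    """Hypothetical error type for a build that would exceed its quota."""


def projected_bytes(node_size: int, neighbors_size: int, docs: int) -> int:
    """Projected usage: node_size * docs + neighbors_size * docs."""
    return node_size * docs + neighbors_size * docs


def check_quota(node_size: int, neighbors_size: int, docs: int,
                memory_quota: int) -> int:
    """Return the projected usage, or raise if it exceeds the quota.

    Assumption: memory_quota == 0 means "no quota configured".
    """
    projected = projected_bytes(node_size, neighbors_size, docs)
    if memory_quota > 0 and projected > memory_quota:
        raise QuotaExceededError(
            f"projected {projected} bytes exceeds quota {memory_quota}"
        )
    return projected
```

Because the projection grows linearly with the document count, a builder that re-evaluates it as documents arrive can abort before the allocation is made rather than after memory is exhausted.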
IVF and Sparse Index Quotas
IVF and sparse index builders follow the same pattern, exposing a memory_quota parameter in their respective entity headers (src/core/algorithm/ivf/ivf_builder_entity.h). This allows fine-grained control over RAM usage during the computationally intensive clustering phase.
Zero-Copy I/O with Memory Mapping
Zvec minimizes its live heap by leveraging the operating system's virtual memory manager.
MMapFileStorage Configuration
Vector data, meta-segments, and forward stores can be mapped directly into the process address space via MMapFileStorage. This avoids extra heap buffers for large read-only datasets. The implementation parses configuration parameters to control mapping behavior.
- Source: src/core/utility/mmap_file_storage.cc
Warmup and Page Locking Options
Two optional parameters control mmap behavior:
- memory_warmup: When true, the mapping uses MAP_POPULATE to pre-fault pages into RAM, eliminating latency spikes on first access.
- memory_locked: When true, pages are locked with mlock to prevent swapping.
These are defined in src/core/utility/utility_params.h and parsed in mmap_file_storage.cc (lines 60-64).
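To make the warmup option concrete, here is a minimal Python sketch of a read-only mapping that optionally requests pre-faulting. It uses the standard mmap module on a Unix-like system and is not zvec's MMapFileStorage; the function name and warmup argument are hypothetical. mmap.MAP_POPULATE is only exposed on Linux with Python 3.10 or later, so the sketch falls back to a plain mapping when the flag is missing.

```python
import mmap
import os


def map_read_only(path: str, warmup: bool = False) -> mmap.mmap:
    """Map a non-empty file read-only; optionally pre-fault its pages."""
    flags = mmap.MAP_SHARED
    if warmup:
        # Falls back to 0 (no pre-faulting) where MAP_POPULATE is absent.
        flags |= getattr(mmap, "MAP_POPULATE", 0)
    fd = os.open(path, os.O_RDONLY)
    try:
        # Length 0 maps the whole file. mmap duplicates the descriptor,
        # so closing fd afterwards does not invalidate the mapping.
        return mmap.mmap(fd, 0, flags=flags, prot=mmap.PROT_READ)
    finally:
        os.close(fd)
```

The standard mmap module does not expose page locking, so an equivalent of memory_locked would need a call to the C library's mlock (for example through ctypes) and is left out of this sketch.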
Thread Pool and Document Accounting
Resource management extends beyond RAM to execution contexts and per-document overhead.
GlobalResource Thread Management
A singleton GlobalResource lazily creates two thread pools, one for query execution and one for background index optimization. Both are sized according to the cgroup CPU limit detected at startup, which prevents oversubscription in containerized environments.
- Source: src/db/common/global_resource.h and src/db/common/global_resource.cc
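One way to derive a pool size from a cgroup CPU limit is sketched below. It parses the cgroup v2 cpu.max format ("<quota> <period>", or "max <period>" when unlimited); the function names and the choice to round up and clamp to the host core count are assumptions, not a description of what GlobalResource does internally.

```python
import math
from typing import Optional


def parse_cpu_max(raw: str) -> Optional[float]:
    """Parse cgroup v2 cpu.max contents into a CPU count; None if unlimited."""
    quota, period = raw.split()
    if quota == "max":
        return None
    return int(quota) / int(period)


def pool_size(cpu_limit: Optional[float], host_cpus: int) -> int:
    """Pick a worker count: at least 1, never more than the host cores."""
    if cpu_limit is None:
        return max(1, host_cpus)
    return max(1, min(host_cpus, math.ceil(cpu_limit)))
```

For a container limited to two CPUs on a 16-core host, cpu.max reads "200000 100000", and pool_size(parse_cpu_max("200000 100000"), 16) gives 2 workers instead of 16.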
Per-Document Memory Tracking
Every Doc object implements memory_usage(), which walks the variant-based field map and sums the capacity of every stored string, vector, and sub-structure. This accurate per-document estimate is used by the buffer-store and memory-forward store to enforce quotas.
- Source: src/db/index/common/doc.cc (lines 12-22)
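The recursive accounting idea can be sketched in Python as a walk over a nested field map. This is only an analogy to Doc::memory_usage(): it sums lengths rather than allocated capacity, counts every number as 8 bytes, and uses hypothetical function names, so its totals will not match the C++ figures.

```python
def estimate_bytes(value) -> int:
    """Rough payload size of one field value (illustrative sizes only)."""
    if isinstance(value, str):
        return len(value.encode("utf-8"))
    if isinstance(value, bytes):
        return len(value)
    if isinstance(value, (int, float)):
        return 8  # assumption: every scalar costs 8 bytes
    if isinstance(value, list):
        return sum(estimate_bytes(item) for item in value)
    if isinstance(value, dict):
        # Count each key plus the recursively estimated value.
        return sum(
            len(key.encode("utf-8")) + estimate_bytes(item)
            for key, item in value.items()
        )
    raise TypeError(f"unsupported field type: {type(value).__name__}")


def doc_memory_usage(fields: dict) -> int:
    """Sum the estimated size of every field in a document."""
    return estimate_bytes(fields)
```

A store that tracks the running total of doc_memory_usage() over buffered documents can flush or reject writes once that total reaches its quota.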
Configuring Zvec Memory in Python
The Python API exposes all native resource controls through the zvec.init() function.
import zvec

# Initialize with explicit memory constraints
zvec.init(
    path="./my_collection",
    memory_limit_mb=2048,        # Soft cap: 2 GiB
    query_thread_count=4,        # Override cgroup detection
    enable_mmap=True,            # Use memory-mapped files
    mmap_params={
        "memory_warmup": True,   # Pre-fault pages
        "memory_locked": False,  # Do not mlock
    },
)
The memory_limit_mb parameter maps to GlobalConfig::memory_limit_bytes(), while enable_mmap and mmap_params configure the MMapFileStorage behavior described earlier.
Summary
- Global soft limits: Zvec defaults to 80% of cgroup memory, queryable via GlobalConfig::memory_limit_bytes().
- Component quotas: Index builders enforce hard memory_quota caps to prevent runaway allocation during training.
- Zero-copy I/O: Memory-mapped storage with optional memory_warmup and memory_locked minimizes heap usage and latency.
- Thread control: GlobalResource sizes thread pools based on cgroup CPU limits to prevent oversubscription.
- Fine-grained accounting: Doc::memory_usage() provides accurate per-document memory estimates for store management.
Frequently Asked Questions
How does zvec handle memory limits in Docker containers?
Zvec reads the cgroup memory limit from /sys/fs/cgroup/memory/memory.limit_in_bytes (or the cgroup v2 equivalent) via CgroupUtil::getMemoryLimit() and sets a soft cap at approximately 80% of that value. Staying under the container limit reduces the risk of the process being killed by the OOM killer when running in Docker or Kubernetes environments.
What happens if an index builder exceeds its memory quota?
If an HNSW, IVF, or sparse index builder projects that its allocation will exceed the configured memory_quota parameter, it aborts early and returns an error. The builder continuously checks usage via formulas like node_size() * docs + neighbors_size_ * docs during construction.
Can I lock vector data in RAM to prevent swapping?
Yes. When configuring memory-mapped storage via zvec.init(), set mmap_params={"memory_locked": True}. This invokes mlock on the mapped pages via MMapFileStorage, keeping the data resident in physical memory and preventing swap-out.
How does zvec calculate memory usage for individual documents?
Each Doc object implements memory_usage(), which traverses the variant-based field map and sums the capacity of every stored string, vector, and nested structure. This per-document accounting is used by the buffer-store and memory-forward store to enforce segment-level quotas and prevent unbounded growth.