# Zvec Memory Footprint and Resource Management Strategy: A Complete Guide

> Understand zvec's memory footprint and resource management. Learn about its three-tier strategy including cgroup limits, component quotas, and memory-mapped I/O for efficient resource control.

- Repository: [Alibaba/zvec](https://github.com/alibaba/zvec)
- Tags: deep-dive
- Published: 2026-02-16

---

**Zvec controls memory usage through a three-tier strategy: global cgroup-aware soft limits (defaulting to approximately 80% of the detected memory limit), per-component quotas for index builders, and memory-mapped I/O with optional page locking, all implemented in the C++ core and exposed via Python's `zvec.init()`.**

Zvec is designed to run in-process while keeping memory consumption predictable. That matters most for high-throughput vector search in containerized environments, where cgroup limits apply. To stay within those limits, the system combines system-level caps, component-level quotas, and zero-copy I/O.

## Global Memory Limits and Cgroup Awareness

Zvec derives its memory ceiling from the host environment rather than assuming unlimited RAM.

### Soft Cap Calculation in GlobalConfig

At startup, `GlobalConfig::Initialize()` queries the cgroup memory limit via `CgroupUtil::getMemoryLimit()` and sets a soft cap defaulting to approximately 80% of that value. All internal components query `GlobalConfig::memory_limit_bytes()` to stay within this bound.

- Source: `src/db/common/config.cc` (lines 33-36)
- Source: `src/db/common/cgroup_util.cc`
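
The soft-cap arithmetic described above can be sketched in a few lines. This is an illustrative Python sketch, not zvec's C++ code: `soft_cap_bytes` and its `percent` argument are hypothetical names, and the value 80 only mirrors the approximate default mentioned in the text.

```python
GIB = 1024 ** 3

def soft_cap_bytes(limit_bytes: int, percent: int = 80) -> int:
    """Return a soft memory cap as a percentage of a detected limit.

    Integer arithmetic keeps the result exact for large byte counts.
    """
    if limit_bytes <= 0:
        raise ValueError("limit_bytes must be positive")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    return limit_bytes * percent // 100

# A container limited to 4 GiB gets a soft cap of roughly 3.2 GiB.
cap = soft_cap_bytes(4 * GIB)
print(f"{cap / GIB:.2f} GiB")  # prints "3.20 GiB"
```

Leaving headroom below the hard limit gives the allocator, the host application, and the page cache room to grow before the container limit is reached.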

### Cgroup Detection Implementation

The `CgroupUtil` class reads `/sys/fs/cgroup/memory/memory.limit_in_bytes` (or the cgroup v2 equivalent) to detect Docker or Kubernetes constraints. Sizing the soft cap from this value keeps zvec inside the container's boundary and lowers the risk of the kernel OOM killer terminating the process.
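
As a rough illustration of this kind of detection, the sketch below probes the standard cgroup v2 file (`memory.max`) first and then the v1 file named above. It is not a port of `CgroupUtil`: the function name, the probing order, and the threshold used to recognise an "unlimited" v1 value are all assumptions of this sketch.

```python
from pathlib import Path
from typing import Optional

# Assumption of this sketch: cgroup v1 reports a very large number when no
# limit is set, so anything at or above this threshold counts as "unlimited".
UNLIMITED_THRESHOLD = 1 << 60

def read_cgroup_memory_limit(root: str = "/sys/fs/cgroup") -> Optional[int]:
    """Return the cgroup memory limit in bytes, or None if none is found."""
    candidates = [
        Path(root) / "memory.max",                        # cgroup v2
        Path(root) / "memory" / "memory.limit_in_bytes",  # cgroup v1
    ]
    for path in candidates:
        try:
            raw = path.read_text().strip()
        except OSError:
            continue  # file missing or unreadable: try the next layout
        if raw == "max":
            return None  # cgroup v2 spelling of "no limit"
        try:
            value = int(raw)
        except ValueError:
            continue
        return None if value >= UNLIMITED_THRESHOLD else value
    return None

# Prints a byte count inside a constrained container, otherwise None.
print(read_cgroup_memory_limit())
```

The `root` parameter exists only so the function can be pointed at a temporary directory in tests.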

## Per-Component Memory Quotas

Beyond global limits, individual algorithms enforce their own ceilings to prevent runaway allocation during index construction.

### HNSW Builder Memory Constraints

The HNSW builder accepts a `memory_quota` parameter (key: `proxima.hnsw.builder.memory_quota`). During construction, it continuously checks projected usage via `node_size() * docs + neighbors_size_ * docs`. If the quota would be exceeded, the builder aborts early with an error.

- Source: [`src/core/algorithm/hnsw/hnsw_builder_entity.h`](https://github.com/alibaba/zvec/blob/main/src/core/algorithm/hnsw/hnsw_builder_entity.h) (lines 94-95)
- Source: `src/core/algorithm/hnsw/hnsw_builder_entity.cc` (lines 67-71)
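
The projected-usage check can be illustrated with the formula quoted above. The sketch below is a Python approximation under stated assumptions: `QuotaExceeded`, `projected_bytes`, and `check_quota` are hypothetical names, and the byte sizes in the example are invented for illustration rather than taken from zvec.

```python
class QuotaExceeded(RuntimeError):
    """Raised when projected index memory would exceed the quota."""

def projected_bytes(node_size: int, neighbors_size: int, docs: int) -> int:
    # Mirrors the formula in the text: node_size * docs + neighbors_size * docs
    return node_size * docs + neighbors_size * docs

def check_quota(node_size: int, neighbors_size: int, docs: int,
                memory_quota: int) -> int:
    """Return the projected usage, or raise if it exceeds memory_quota."""
    needed = projected_bytes(node_size, neighbors_size, docs)
    if needed > memory_quota:
        raise QuotaExceeded(
            f"projected {needed} bytes exceeds quota of {memory_quota} bytes"
        )
    return needed

# Hypothetical sizes: 528 bytes per node, 256 bytes of neighbor links.
quota = 512 * 1024 * 1024  # 512 MiB
print(check_quota(528, 256, 100_000, quota))  # prints 78400000
try:
    check_quota(528, 256, 1_000_000, quota)
except QuotaExceeded as exc:
    print("aborting build:", exc)
```

Checking the projection before allocating lets a build fail fast with a clear error instead of failing partway through construction.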

### IVF and Sparse Index Quotas

IVF and sparse index builders follow the same pattern, exposing a `memory_quota` parameter in their respective entity headers ([`src/core/algorithm/ivf/ivf_builder_entity.h`](https://github.com/alibaba/zvec/blob/main/src/core/algorithm/ivf/ivf_builder_entity.h)). This allows fine-grained control over RAM usage during the computationally intensive clustering phase.

## Zero-Copy I/O with Memory Mapping

Zvec minimizes its live heap by leveraging the operating system's virtual memory manager.

### MMapFileStorage Configuration

Vector data, meta-segments, and forward stores can be mapped directly into the process address space via `MMapFileStorage`. This avoids extra heap buffers for large read-only datasets. The implementation parses configuration parameters to control mapping behavior.

- Source: `src/core/utility/mmap_file_storage.cc`
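
To make the idea concrete, here is a small Python sketch of reading one vector out of a memory-mapped file. It uses only the standard `mmap` and `struct` modules and says nothing about zvec's on-disk format: the flat float32 layout, `DIM`, and both helper functions are assumptions made for the example.

```python
import mmap
import os
import struct
import tempfile

DIM = 4                              # hypothetical vector dimensionality
RECORD = struct.Struct(f"<{DIM}f")   # one little-endian float32 vector

def write_vectors(path, vectors):
    """Write vectors back to back as raw float32 records."""
    with open(path, "wb") as f:
        for vec in vectors:
            f.write(RECORD.pack(*vec))

def read_vector(path, index):
    """Decode one vector directly from the mapping.

    Only the pages backing this record are touched; the file is never
    read into a heap buffer as a whole.
    """
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        return RECORD.unpack_from(mm, index * RECORD.size)

fd, demo_path = tempfile.mkstemp(suffix=".vec")
os.close(fd)
try:
    write_vectors(demo_path, [(1.0, 2.0, 3.0, 4.0), (0.5, 0.25, 0.0, -1.0)])
    print(read_vector(demo_path, 1))  # prints (0.5, 0.25, 0.0, -1.0)
finally:
    os.remove(demo_path)
```

Because the kernel's page cache backs the mapping, the operating system can evict those pages under memory pressure and reload them from disk on the next access.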

### Warmup and Page Locking Options

Two optional parameters control mmap behavior:

- `memory_warmup`: When true, the mapping uses `MAP_POPULATE` to pre-fault pages into RAM, eliminating latency spikes on first access.
- `memory_locked`: When true, pages are locked with `mlock` to prevent swapping.

These are defined in [`src/core/utility/utility_params.h`](https://github.com/alibaba/zvec/blob/main/src/core/utility/utility_params.h) and parsed in `mmap_file_storage.cc` (lines 60-64).
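
The warmup option can be approximated from Python as shown below. This is a sketch for Unix-like systems, not zvec's implementation: `map_readonly` is a hypothetical helper, `mmap.MAP_POPULATE` is only available on Linux with Python 3.10 or later (the code falls back to a plain mapping elsewhere), and page locking is left out because the standard library has no `mlock` wrapper.

```python
import mmap
import os
import tempfile

def map_readonly(fileno: int, memory_warmup: bool = False) -> mmap.mmap:
    """Map a whole file read-only, optionally pre-faulting its pages."""
    flags = mmap.MAP_SHARED
    if memory_warmup:
        # Linux-only flag, exposed by Python 3.10+; 0 is a no-op fallback.
        flags |= getattr(mmap, "MAP_POPULATE", 0)
    return mmap.mmap(fileno, 0, flags=flags, prot=mmap.PROT_READ)

fd, demo_path = tempfile.mkstemp()
try:
    os.write(fd, b"\x00" * 8192)
    with map_readonly(fd, memory_warmup=True) as mm:
        print(len(mm))  # prints 8192
finally:
    os.close(fd)
    os.remove(demo_path)
```

Pre-faulting trades a slower open for steadier first-query latency. Locking goes further by pinning pages in RAM, which on Linux is subject to the process's locked-memory limit (`RLIMIT_MEMLOCK`).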

## Thread Pool and Document Accounting

Resource management extends beyond RAM to execution contexts and per-document overhead.

### GlobalResource Thread Management

A singleton `GlobalResource` lazily creates two thread pools, one for query execution and one for background index optimization, and sizes them according to the cgroup CPU limit detected at startup. This prevents oversubscription in containerized environments.

- Source: [`src/db/common/global_resource.h`](https://github.com/alibaba/zvec/blob/main/src/db/common/global_resource.h) and `src/db/common/global_resource.cc`
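
Sizing a pool from a CPU quota can be sketched as follows. The example parses the cgroup v2 `cpu.max` format (`"<quota> <period>"`, or `"max <period>"` when unconstrained). The function name, the choice to round up, and the clamping to the host CPU count are assumptions of this sketch, and it does not show how zvec divides threads between its two pools.

```python
import math
import os
from concurrent.futures import ThreadPoolExecutor

def cpus_from_cpu_max(text: str, host_cpus: int) -> int:
    """Convert the contents of a cgroup v2 cpu.max file into a worker count."""
    quota, period = text.split()
    if quota == "max":
        return host_cpus  # no CPU quota configured
    limit = math.ceil(int(quota) / int(period))
    return max(1, min(limit, host_cpus))

# A quota of 200000us per 100000us period corresponds to two CPUs.
workers = cpus_from_cpu_max("200000 100000", os.cpu_count() or 1)
with ThreadPoolExecutor(max_workers=workers) as pool:
    print(list(pool.map(lambda x: x * x, range(4))))  # prints [0, 1, 4, 9]
```

Without this adjustment, a process in a two-CPU container on a 64-core host might start 64 workers that then compete for two CPUs' worth of time.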

### Per-Document Memory Tracking

Every `Doc` object implements `memory_usage()`, which walks the variant-based field map and sums the capacity of every stored string, vector, and sub-structure. This accurate per-document estimate is used by the buffer-store and memory-forward store to enforce quotas.

- Source: `src/db/index/common/doc.cc` (lines 12-22)
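
The traversal can be pictured with a simplified Python analogue. It is only a sketch: the real `Doc::memory_usage()` is C++ and sums container capacities, whereas this version sums payload lengths, and the fixed 8-byte cost per number and the field names are assumptions chosen to keep the arithmetic easy to follow.

```python
def memory_usage(value) -> int:
    """Estimate the payload size of a field value in bytes (simplified)."""
    if isinstance(value, bool):
        return 1
    if isinstance(value, (int, float)):
        return 8  # assumption: every number is stored in 8 bytes
    if isinstance(value, str):
        return len(value.encode("utf-8"))
    if isinstance(value, bytes):
        return len(value)
    if isinstance(value, list):
        return sum(memory_usage(item) for item in value)
    if isinstance(value, dict):
        # Field names count toward the total, then recurse into the values.
        return sum(len(key.encode("utf-8")) + memory_usage(item)
                   for key, item in value.items())
    raise TypeError(f"unsupported field type: {type(value).__name__}")

doc = {
    "title": "hello",
    "embedding": [0.1, 0.2, 0.3, 0.4],
    "meta": {"lang": "en"},
}
print(memory_usage(doc))  # prints 61
```

A store that sums this estimate over the documents it is buffering can decide when to flush or reject writes before it reaches its quota.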

## Configuring Zvec Memory in Python

The Python API exposes these native resource controls through the `zvec.init()` function.

```python
import zvec

# Initialize with explicit memory constraints
zvec.init(
    path="./my_collection",
    memory_limit_mb=2048,          # Soft cap: 2 GiB
    query_thread_count=4,          # Override cgroup detection
    enable_mmap=True,              # Use memory-mapped files
    mmap_params={
        "memory_warmup": True,     # Pre-fault pages
        "memory_locked": False     # Do not mlock
    }
)
```

The `memory_limit_mb` parameter maps to `GlobalConfig::memory_limit_bytes()`, while `enable_mmap` and `mmap_params` configure the `MMapFileStorage` behavior described earlier.

## Summary

- **Global soft limits**: Zvec defaults to approximately 80% of the cgroup memory limit, queryable via `GlobalConfig::memory_limit_bytes()`.
- **Component quotas**: Index builders enforce hard `memory_quota` caps to prevent runaway allocation during index construction.
- **Zero-copy I/O**: Memory-mapped storage with optional `memory_warmup` and `memory_locked` minimizes heap usage and latency.
- **Thread control**: `GlobalResource` sizes thread pools based on cgroup CPU limits to prevent oversubscription.
- **Fine-grained accounting**: `Doc::memory_usage()` provides accurate per-document memory estimates for store management.

## Frequently Asked Questions

### How does zvec handle memory limits in Docker containers?

Zvec reads the cgroup memory limit from `/sys/fs/cgroup/memory/memory.limit_in_bytes` (or the cgroup v2 equivalent) via `CgroupUtil::getMemoryLimit()` and sets a soft cap at approximately 80% of that value. Staying under the container limit lowers the risk of the process being killed by the OOM killer in Docker or Kubernetes environments.

### What happens if an index builder exceeds its memory quota?

If an HNSW, IVF, or sparse index builder projects that its allocation will exceed the configured `memory_quota` parameter, it aborts early and returns an error. The builder continuously checks usage via formulas like `node_size() * docs + neighbors_size_ * docs` during construction.

### Can I lock vector data in RAM to prevent swapping?

Yes. When configuring memory-mapped storage via `zvec.init()`, set `mmap_params={"memory_locked": True}`. This invokes `mlock` on the mapped pages via `MMapFileStorage`, keeping the data resident in physical memory and preventing swap-out.

### How does zvec calculate memory usage for individual documents?

Each `Doc` object implements `memory_usage()`, which traverses the variant-based field map and sums the capacity of every stored string, vector, and nested structure. This per-document accounting is used by the buffer-store and memory-forward store to enforce segment-level quotas and prevent unbounded growth.