architecture

How Z-Vec Handles Sparse Vector Indexing and Search: Architecture and Implementation

February 16, 2026 alibaba/zvec ↗

Z-Vec stores sparse vectors as compact (count, indices, values) triplets and indexes them using a specialized HNSW-Sparse algorithm that enables efficient approximate nearest neighbor search without materializing dense dimensions.

Sparse vector indexing and search in the alibaba/zvec repository relies on a unified storage layout that spans from disk files to in-memory graph traversal. Every component—from the memory-mapped file reader to the HNSW builder—operates on the same count-indices-values contract, eliminating serialization overhead and enabling zero-copy access to high-dimensional sparse embeddings.

The Count-Indices-Values Storage Layout

Z-Vec treats a sparse vector as a triplet: an integer count of non-zero entries, an array of dimension indices, and an array of corresponding values. This layout is immutable across the storage hierarchy.

On-Disk Format in .vecs Files

Vectors are persisted in .vecs files with a structured header followed by three distinct blocks:

Dense vector block (if present)
Sparse meta block – contains the file offset for each vector’s data
Sparse data block – the actual count-indices-values payload

This separation allows the reader to locate a specific vector via the meta block without scanning the entire data section.

Memory-Mapped Access via VecsReader

The VecsReader class (and its SparseVecsReader variant) in tools/core/vecs_reader.h maps the .vecs file into process memory using mmap. It exposes three critical accessors:

get_sparse_count(idx) – returns the number of non-zero dimensions
get_sparse_indices(idx) – returns a pointer to the packed uint32_t index array
get_sparse_data(idx) – returns a pointer to the packed values (float16 or float32)

These methods operate at lines 124-132, 134-142, and 145-153 respectively, providing zero-copy views directly into the memory-mapped buffer.

In-Memory Sparse Vector Representation

Once loaded from disk, sparse vectors transition into C++ structures that maintain the same layout contract.

The SparseVector Struct

The public API defines SparseVector in src/include/zvec/core/interface/index.h (lines 47-60). This struct encapsulates:

count – the number of non-zero entries
indices – pointer to the dimension indices array
data – pointer to the values array
data_type – enum indicating float32 or float16

This structure is the bridge between the Python SDK and the C++ core.

IndexSparseHolder and Iterators

The IndexSparseHolder interface in src/include/zvec/core/framework/index_holder.h (lines 54-62) defines the storage contract for in-memory sparse vectors. It exposes an iterator API that returns:

sparse_count() – entry count
sparse_indices() – index array pointer
sparse_data() – value array pointer

Concrete implementations like OnePassIndexSparseHolder<float> store vectors as a list of <key, SparseVector<T>> pairs, enabling the HNSW builder to iterate over the dataset without type casting or format conversion.

Building the HNSW-Sparse Index

Z-Vec constructs approximate nearest neighbor graphs using a specialized HNSW implementation optimized for sparse data.

HnswSparseBuilder Implementation

The HnswSparseBuilder class in src/core/algorithm/hnsw_sparse/hnsw_sparse_builder.h (lines 45-55) receives an IndexSparseHolder and constructs the hierarchical navigable small-world graph. During construction:

It iterates over the holder using the sparse iterator API
Feeds each vector's count-indices-values triplet to HnswSparseAlgorithm
Builds the multi-layer graph structure without densifying the vectors

This approach maintains O(nnz) complexity per distance calculation, where nnz is the number of non-zero entries, rather than O(dim) for dense vectors.

Searching Sparse Vectors

Query execution follows the same sparse contract, ensuring that search inputs never require format conversion.

HnswSparseSearcher Query Flow

The HnswSparseSearcher in src/core/algorithm/hnsw_sparse/hnsw_sparse_searcher.h (lines 48-60) implements search_impl with a sparse-specific overload:

void search_impl(
    int sparse_count,
    const uint32_t* sparse_indices,
    const void* sparse_data,
    // ... result parameters
);

When a query arrives:

The searcher creates a searcher context initialized with the query's count-indices-values
Forwards the request to HnswSparseAlgorithm for graph traversal
Computes distances using sparse dot-product on the overlapping indices only
Converts internal doc IDs back to user-visible keys via the holder

This flow ensures that sparse queries execute with the same efficiency as the indexed vectors, using only the non-zero dimensions for distance calculations.

Python SDK Usage Examples

The Python SDK abstracts the C++ internals, allowing sparse vector operations through simple dictionaries.

Creating a Sparse Vector Collection

import zvec

# Define a schema with a sparse vector field (FP32)

schema = zvec.CollectionSchema(
    name="sparse_demo",
    vectors=zvec.VectorSchema(
        "sparse_vec",            # field name

        zvec.DataType.SPARSE_VECTOR_FP32,
        dim=0                     # dim is unused for sparse vectors

    )
)

coll = zvec.create_and_open(path="./sparse_demo", schema=schema)

Behind the scenes, VectorSchema triggers creation of an IndexSparseHolder, and the system selects HnswSparseBuilder as the default index builder for sparse data.

Inserting Sparse Documents

def make_sparse(vec):
    # vec is a dict {index: value}

    return {"sparse_vec": vec}

docs = [
    zvec.Doc(id="doc_A", vectors=make_sparse({1: 0.5, 3: 1.2, 7: 0.9})),
    zvec.Doc(id="doc_B", vectors=make_sparse({2: 1.0, 4: 0.7})),
]

coll.insert(docs)

The Python binding in python/zvec/extension/embedding_function.py (lines 115-119) converts the dictionary into a C++ SparseVector struct with count=3, indices=[1,3,7], and values=[0.5,1.2,0.9]. Index::_sparse_add then stores this in a OnePassIndexSparseHolder<float> and updates the HNSW graph via HnswSparseBuilder.

Querying with Sparse Vectors

query_vec = {"sparse_vec": {1: 0.4, 5: 0.8}}
results = coll.query(
    zvec.VectorQuery("sparse_vec", vector=query_vec["sparse_vec"]),
    topk=5
)

for r in results:
    print(r["id"], r["score"])

VectorQuery constructs a SparseVector from the input dict and invokes Index::_sparse_search. HnswSparseSearcher::search_impl receives the sparse triplet, traverses the graph using sparse dot-product calculations on overlapping indices only, and returns the top-k results.

Summary

Unified Layout: Z-Vec uses a count-indices-values triplet format consistently across disk storage (tools/core/vecs_reader.h), in-memory holders (src/include/zvec/core/framework/index_holder.h), and search algorithms (src/core/algorithm/hnsw_sparse/).
Zero-Copy Access: The VecsReader class memory-maps .vecs files and exposes get_sparse_count, get_sparse_indices, and get_sparse_data methods (lines 124-153) for direct pointer access to packed data without deserialization overhead.
HNSW-Sparse Algorithm: Dedicated builder (HnswSparseBuilder) and searcher (HnswSparseSearcher) classes implement approximate nearest neighbor search optimized for sparse vectors, computing distances in O(nnz) time using only non-zero overlapping dimensions.
Python SDK Abstraction: The public API hides C++ internals, accepting Python dict[int, float] objects that the binding layer (python/zvec/extension/embedding_function.py) converts to the internal SparseVector struct defined in src/include/zvec/core/interface/index.h.

Frequently Asked Questions

What file format does Z-Vec use to store sparse vectors on disk?

Z-Vec stores sparse vectors in .vecs files using a three-part layout: a header followed by a dense vector block (if applicable), a sparse meta block containing offsets per vector, and a sparse data block holding the actual count-indices-values payload. The VecsReader class in tools/core/vecs_reader.h memory-maps this format to provide zero-copy access.

How does Z-Vec calculate similarity between sparse vectors during search?

The HnswSparseSearcher computes similarity using sparse dot-product operations on the intersection of non-zero dimensions only. When traversing the HNSW graph, the algorithm receives the query vector's count, indices, and values pointers, then calculates distances in O(nnz) time by iterating only over overlapping indices, avoiding the O(dim) cost of dense vector comparison.

Can I use Z-Vec's sparse vector support from Python without managing C++ structs?

Yes. The Python SDK abstracts all C++ internals. You define sparse vectors as Python dictionaries where keys are integer dimensions and values are floats (e.g., {1: 0.5, 3: 1.2}). The binding layer in python/zvec/extension/embedding_function.py automatically converts these dictionaries into the internal SparseVector struct before passing them to the C++ core for indexing or search.

Have a question about this repo?

These articles cover the highlights, but your codebase questions are specific. Give your agent direct access to the source. Share this with your agent to get started:

Share the following with your agent to get started:

curl -s "https://instagit.com/install.md"

Add to your MCP client configuration:

{
  "mcpServers": {
    "instagit": {
      "command": "npx",
      "args": ["-y", "instagit@latest"]
    }
  }
}

Ask your agent:

"Use Instagit MCP to understand how alibaba/zvec works."

Works with

Claude Codex Cursor VS Code OpenClaw Any MCP Client

Maintain an open-source project? Get it listed too →