# Creating and Optimizing Vector Indexes in Alibaba ZVec: A Complete Guide

> Learn to create and optimize vector indexes in Alibaba ZVec. This guide covers configuration, building, and querying for Flat, HNSW, and IVF structures with quantization and multi-threading.

- Repository: [Alibaba/zvec](https://github.com/alibaba/zvec)
- Tags: how-to-guide
- Published: 2026-02-16

---

**Alibaba ZVec creates optimized vector indexes through a three-stage pipeline: configuration via `IndexParam`, building via `local_builder.cc` and `IndexStreamer`, and querying via `IndexStreamer` with parameters serialized by `IndexFactory`. It supports Flat, HNSW, and IVF structures with quantization and multi-threading optimizations.**

ZVec is Alibaba's high-performance vector database designed for billion-scale approximate nearest neighbor (ANN) search. Whether you are building a recommendation engine or a semantic search pipeline, creating and optimizing vector indexes in Alibaba ZVec requires understanding its C++ core architecture and the flexible YAML-based configuration system.

## Understanding the ZVec Index Architecture

ZVec implements a modular index architecture defined in [`src/include/zvec/core/interface/index_param.h`](https://github.com/alibaba/zvec/blob/main/src/include/zvec/core/interface/index_param.h). The system supports three primary index types enumerated in the `IndexType` enum (lines 56–64): `kFlat` for exact brute-force search, `kHNSW` for high-recall approximate search, and `kIVF` for inverted file indexes suitable for billion-scale datasets.
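
To make the difference between exact and approximate search concrete, the sketch below shows what a brute-force search (the behavior described for `kFlat`) does conceptually. It is plain Python rather than ZVec code, and `l2sq` and `flat_search` are hypothetical names chosen for illustration.

```python
import heapq


def l2sq(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def flat_search(query, vectors, topk):
    """Exact search: score every stored vector and keep the topk closest.

    `vectors` maps an integer ID to its vector. The cost is O(N * dim) per
    query, which is the motivation for approximate structures such as
    HNSW and IVF on large datasets.
    """
    scored = ((l2sq(query, vec), vec_id) for vec_id, vec in vectors.items())
    return [vec_id for _, vec_id in heapq.nsmallest(topk, scored)]


vectors = {1: [0.0, 0.0], 2: [1.0, 1.0], 3: [0.2, 0.1], 4: [5.0, 5.0]}
nearest = flat_search([0.1, 0.1], vectors, topk=2)
print(nearest)
```

Because every vector is scored, the result is always the true nearest neighbors, which also makes a flat scan a convenient source of ground truth when measuring the recall of an approximate index.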

### Core Components and File Paths

The index lifecycle is managed by three core components:

- **`IndexParam`** ([`src/include/zvec/core/interface/index_param.h`](https://github.com/alibaba/zvec/blob/main/src/include/zvec/core/interface/index_param.h)): Defines index type, metric, quantizer, and runtime options.
- **`IndexFactory`** (`src/core/interface/index_factory.cc`): Instantiates concrete index classes via `IndexFactory::CreateAndInitIndex` (lines 41–47).
- **`IndexStreamer`** ([`src/include/zvec/core/framework/index_streamer.h`](https://github.com/alibaba/zvec/blob/main/src/include/zvec/core/framework/index_streamer.h)): Abstract runner that streams vectors into the index and handles dumping to storage via `open`, `flush`, and `close` methods.

## Creating Vector Indexes in Alibaba ZVec

Index creation follows a three-stage pipeline: configuration, building, and querying.

### Stage 1: Configuration with IndexParam

Index creation begins with a YAML or JSON configuration file that populates the `IndexParam` structure. Key parameters include:

- **`BuilderClass`**: Specifies the streamer implementation (`HnswStreamer`, `FlatStreamer`, or `IvfStreamer`).
- **`MetricName`**: Distance metric (`L2sq`, `Cosine`, `InnerProduct`).
- **`ConverterName`**: Quantization method (`Int8`, `PQ`, `OPQ`).
- **`ThreadCount`**: Builder parallelism (defaults to hardware concurrency).
- **`NeedTrain`**: Boolean flag indicating whether a quantizer requires a separate training phase using `TrainerIndexPath`.
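
The parameters above can be checked before a build is launched. The following is a minimal sketch of such a pre-flight check in plain Python: the `BuilderCommon` dataclass, `parse_builder_common`, and the validation rules are assumptions made for illustration and do not mirror ZVec's own parsing code.

```python
import os
from dataclasses import dataclass
from typing import Optional

# Allowed values taken from the parameter list above.
ALLOWED_BUILDERS = {"HnswStreamer", "FlatStreamer", "IvfStreamer"}
ALLOWED_METRICS = {"L2sq", "Cosine", "InnerProduct"}
ALLOWED_CONVERTERS = {None, "Int8", "PQ", "OPQ"}


@dataclass
class BuilderCommon:
    builder_class: str
    metric_name: str
    converter_name: Optional[str]
    thread_count: int
    need_train: bool
    trainer_index_path: Optional[str]


def parse_builder_common(raw):
    """Validate a dict shaped like the BuilderCommon section of the config."""
    cfg = BuilderCommon(
        builder_class=raw["BuilderClass"],
        metric_name=raw["MetricName"],
        converter_name=raw.get("ConverterName"),
        # Fall back to the number of CPUs when ThreadCount is absent.
        thread_count=int(raw.get("ThreadCount") or os.cpu_count() or 1),
        need_train=bool(raw.get("NeedTrain", False)),
        trainer_index_path=raw.get("TrainerIndexPath"),
    )
    if cfg.builder_class not in ALLOWED_BUILDERS:
        raise ValueError(f"unknown BuilderClass: {cfg.builder_class}")
    if cfg.metric_name not in ALLOWED_METRICS:
        raise ValueError(f"unknown MetricName: {cfg.metric_name}")
    if cfg.converter_name not in ALLOWED_CONVERTERS:
        raise ValueError(f"unknown ConverterName: {cfg.converter_name}")
    # Assumption of this sketch: training needs somewhere to store its output.
    if cfg.need_train and not cfg.trainer_index_path:
        raise ValueError("NeedTrain is set but TrainerIndexPath is missing")
    return cfg


cfg = parse_builder_common({
    "BuilderClass": "HnswStreamer",
    "MetricName": "Cosine",
    "ConverterName": "Int8",
    "ThreadCount": 8,
    "NeedTrain": True,
    "TrainerIndexPath": "./data/vecs/train.trainer.index",
})
print(cfg)
```

Failing fast on a misspelled metric or a missing trainer path is cheaper than discovering the problem partway through a long build.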

### Stage 2: Building the Index with Local Builder

The C++ entry point `tools/core/local_builder.cc` orchestrates the build process. The core logic creates the index via the factory, initializes storage, and streams vectors through a multi-threaded pipeline. The snippet is a condensed outline of that flow rather than a compilable excerpt:

```cpp
// Parse YAML into ailego::Params
bool prepare_params(YAML::Node &&config_params, ailego::Params &params);

// Create and initialize index via factory
Index::Pointer index = core_interface::IndexFactory::CreateAndInitIndex(param);

// Initialize MMapFileStorage for zero-copy reads
auto storage = IndexFactory::CreateStorage("MMapFileStorage");
storage->open(dump_path, true);
auto dumper = IndexFactory::CreateDumper(storage);

// Create and configure streamer
IndexStreamer::Pointer streamer = IndexFactory::CreateStreamer("HnswStreamer");
streamer->init(meta, builder_params);
streamer->open(storage);

// Execute multi-threaded build loop
do_build_sparse_by_streamer(streamer, thread_count);
streamer->flush(check_point);
streamer->close();
```

The `do_build_sparse_by_streamer` function (lines 252–260) distributes vector IDs across a thread pool, optionally applies a reformer, and feeds each vector to `streamer->add_impl`.
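
The pattern described here (split vector IDs across workers, optionally transform each vector, then add it to the streamer) can be sketched in Python. This is an illustration of the structure only: `ToyStreamer`, `build_by_streamer`, and the strided partitioning scheme are assumptions, not a port of the C++ function.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ToyStreamer:
    """Hypothetical stand-in for a streamer: stores vectors under a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self.vectors = {}

    def add(self, vec_id, vector):
        with self._lock:
            self.vectors[vec_id] = vector


def build_by_streamer(streamer, dataset, thread_count, reformer=None):
    """Give each worker a strided slice of the IDs and add its vectors."""
    ids = sorted(dataset)

    def worker(worker_index):
        for vec_id in ids[worker_index::thread_count]:
            vector = dataset[vec_id]
            if reformer is not None:
                vector = reformer(vector)  # optional preprocessing step
            streamer.add(vec_id, vector)

    with ThreadPoolExecutor(max_workers=thread_count) as pool:
        # list() waits for all workers and re-raises any worker exception.
        list(pool.map(worker, range(thread_count)))


dataset = {i: [float(i), float(i) * 2] for i in range(100)}
streamer = ToyStreamer()
build_by_streamer(streamer, dataset, thread_count=4)
print(len(streamer.vectors))
```

Python threads will not show the CPU speed-up a native thread pool gives, so the sketch is useful for the control flow, not for benchmarking.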

### Stage 3: Querying with IndexStreamer

Once built, the index is loaded via `IndexStreamer::open` and queried using parameters serialized through `IndexFactory::QueryParamSerializeToJson` (lines 141–150). The Python wrapper in [`python/zvec/zvec.py`](https://github.com/alibaba/zvec/blob/main/python/zvec/zvec.py) provides high-level access to this functionality.

## Optimizing Vector Indexes in Alibaba ZVec

Optimization in ZVec targets memory efficiency, build speed, and query latency through five primary mechanisms.

### Quantization Strategies

Set `ConverterName` in the builder config to enable quantization. The `QuantizerParam` struct in [`src/include/zvec/core/interface/index_param.h`](https://github.com/alibaba/zvec/blob/main/src/include/zvec/core/interface/index_param.h) (lines 86–124) supports:

- **`Int8`**: 8-bit integer quantization, which cuts vector memory by roughly 75% relative to 32-bit floats (4 bytes down to 1 byte per dimension).
- **`PQ`**: Product Quantization for high compression ratios.
- **`OPQ`**: Optimized Product Quantization with rotation preprocessing.

When `NeedTrain` is `true`, the builder executes a training phase using `TrainerIndexPath` to learn quantization codebooks before the main build.
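
To show where the 4-to-1 storage reduction of 8-bit quantization comes from, here is a minimal sketch of symmetric scalar quantization in plain Python. It demonstrates the general technique under the assumption of a single scale per vector; ZVec's `Int8` converter may use a different scheme, and `quantize_int8` and `dequantize_int8` are hypothetical names.

```python
from array import array


def quantize_int8(vector):
    """Map floats into [-127, 127] using one scale factor for the vector."""
    max_abs = max(abs(x) for x in vector) or 1.0  # avoid dividing by zero
    scale = max_abs / 127.0
    codes = array("b", (round(x / scale) for x in vector))
    return codes, scale


def dequantize_int8(codes, scale):
    """Approximate the original floats from the stored codes."""
    return [c * scale for c in codes]


vector = [0.5, -1.0, 0.25, 0.0]
codes, scale = quantize_int8(vector)
restored = dequantize_int8(codes, scale)

float_bytes = array("f", vector).itemsize * len(vector)  # float32 storage
int8_bytes = codes.itemsize * len(codes)                 # int8 storage
saving = 1 - int8_bytes / float_bytes
print(list(codes), saving)
```

The reconstruction error per component is bounded by half the scale, which is the precision traded away for the smaller footprint.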

### Parallelism and Threading

ZVec leverages `ailego::ThreadPool` ([`src/include/zvec/ailego/parallel/thread_pool.h`](https://github.com/alibaba/zvec/blob/main/src/include/zvec/ailego/parallel/thread_pool.h)) to parallelize vector ingestion. The `ThreadCount` parameter in YAML controls parallelism, with near-linear speed-up typically observed until memory bandwidth saturates. The `do_build_sparse_by_streamer` function automatically partitions work across the pool.

### Reformers and Transformations

A **reformer** preprocesses vectors before indexing. Configure via `BuilderCommon.ReformerName` in YAML. The reformer is instantiated via `IndexFactory::CreateReformer(meta.reformer_name())` (`local_builder.cc` lines 38–44). Common options include PCA for dimensionality reduction and OPQ for rotation optimization.
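
As a rough illustration of what a dimensionality-reducing reformer does, the sketch below applies a fixed linear projection to each vector. It assumes the projection matrix has already been learned (for example by PCA); `make_linear_reformer` is a hypothetical helper and not part of ZVec.

```python
def make_linear_reformer(projection):
    """Return a function computing `projection @ vector`.

    `projection` is a list of rows. Each row has the input dimension, and
    the number of rows is the output dimension.
    """
    def reform(vector):
        if any(len(row) != len(vector) for row in projection):
            raise ValueError("vector dimension does not match projection")
        return [sum(w * x for w, x in zip(row, vector)) for row in projection]

    return reform


# Hypothetical 4-D to 2-D projection that keeps dimensions 0 and 2.
projection = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
reform = make_linear_reformer(projection)
reduced = reform([0.1, 0.2, 0.3, 0.4])
print(reduced)
```

Whatever transformation is applied at build time must also be applied to query vectors, otherwise queries and indexed vectors live in different spaces.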

### Storage Layout Optimization

ZVec uses `MMapFileStorage` by default for zero-copy reads during querying. For ultra-low latency scenarios, enable `kBufferPool` in `StorageOptions` ([`src/include/zvec/core/interface/index_param.h`](https://github.com/alibaba/zvec/blob/main/src/include/zvec/core/interface/index_param.h), lines 37–44) to keep hot index pages in memory instead of relying on memory-mapped files.
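
The general idea behind memory-mapped storage can be sketched with Python's standard `mmap` module: the file is mapped into the address space and only the pages that are actually touched get loaded. The file layout here (contiguous little-endian float32 rows) and the helper names are assumptions for illustration and say nothing about how `MMapFileStorage` organizes its files.

```python
import mmap
import os
import struct
import tempfile

DIM = 4
ROW_FORMAT = f"<{DIM}f"              # little-endian float32 row
ROW_SIZE = struct.calcsize(ROW_FORMAT)


def write_vectors(path, vectors):
    """Write vectors as contiguous fixed-size rows."""
    with open(path, "wb") as f:
        for vec in vectors:
            f.write(struct.pack(ROW_FORMAT, *vec))


def read_vector(path, row):
    """Read one row through a read-only memory map of the file."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Only the bytes of the requested row are decoded.
            return list(struct.unpack_from(ROW_FORMAT, mm, row * ROW_SIZE))


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "vectors.bin")
    write_vectors(path, [[1.0, 2.0, 3.0, 4.0], [0.5, 0.25, 0.125, 0.0]])
    second = read_vector(path, 1)
print(second)
```

With a mapping, the operating system's page cache decides which pages stay resident, whereas a buffer pool gives the application explicit control over what is kept hot.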

### Search-Time Tuning

For HNSW indexes, the `ef_search` parameter in `HNSWQueryParam` controls the recall/latency trade-off. This is serialized via `IndexFactory::QueryParamSerializeToJson` and exposed in Python as the `ef_search` argument to `collection.search()`. Higher values improve recall at the cost of increased query latency.
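
One common way to tune this trade-off is to measure recall against exact results and pick the smallest value that meets a target. The sketch below shows that procedure with simulated result lists; `recall_at_k` and `pick_ef_search` are hypothetical helpers, and in practice `run_query` would wrap a real search call.

```python
def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k that the approximate result also returned."""
    if not exact_ids:
        raise ValueError("exact_ids must not be empty")
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


def pick_ef_search(run_query, exact_ids, candidates, target_recall):
    """Return the smallest candidate ef_search whose recall meets the target.

    `run_query(ef)` is supplied by the caller and returns result IDs.
    Returns None if no candidate reaches the target.
    """
    for ef in sorted(candidates):
        if recall_at_k(run_query(ef), exact_ids) >= target_recall:
            return ef
    return None


# Simulated results standing in for real queries at each ef_search value.
simulated = {
    16: [1, 2, 9, 8, 7],
    64: [1, 2, 3, 4, 9],
    128: [1, 2, 3, 4, 5],
}
exact = [1, 2, 3, 4, 5]
chosen = pick_ef_search(simulated.__getitem__, exact, simulated.keys(), target_recall=0.8)
print(chosen)
```

Averaging recall over a representative set of queries, rather than a single one as shown here, gives a more reliable setting.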

## Working with ZVec in Python

The Python binding in [`python/zvec/zvec.py`](https://github.com/alibaba/zvec/blob/main/python/zvec/zvec.py) mirrors the C++ pipeline:

```python
import zvec
from zvec.model import Collection, CollectionSchema, FieldSchema, DataType

# Initialize the engine
zvec.init(log_type=zvec.LogType.CONSOLE, log_level=zvec.LogLevel.INFO)

# Create collection with schema
schema = CollectionSchema(
    name="my_vectors",
    fields=[
        FieldSchema("id", DataType.INT64, nullable=False),
        FieldSchema("vec", DataType.FLOAT_VECTOR, dimension=128),
    ],
)
collection = zvec.create_and_open("./my_collection", schema)

# Insert vectors
ids = [1, 2, 3]
vectors = [[0.1] * 128, [0.2] * 128, [0.3] * 128]
collection.insert(ids, {"vec": vectors})

# Rebuild with custom HNSW parameters
collection.rebuild_index(
    index_type=zvec.IndexType.HNSW,
    metric=zvec.MetricType.COSINE,
    ef_construction=200,
    quantizer="Int8",
)

# Search with tuned parameters
results = collection.search(
    vectors=[[0.15] * 128],
    topk=5,
    ef_search=100,
    metric=zvec.MetricType.COSINE,
)
print(results)
```

## End-to-End Example: Building and Querying an HNSW Index

**Step 1: Prepare the build configuration**

Create `build.yaml`:

```yaml
BuilderCommon:
  BuilderClass: HnswStreamer
  BuildFile: ./data/vecs/train.vecs
  IndexPath: ./data/vecs/train.index
  DumpPath: ./data/vecs/train.dump.index
  ConverterName: Int8
  MetricName: Cosine
  ThreadCount: 8
  NeedTrain: true
  TrainFile: ./data/vecs/train.vecs
  TrainerIndexPath: ./data/vecs/train.trainer.index
BuilderParams:
  proxima.hnsw.builder.thread_count: !!int 8
  proxima.hnsw.builder.ef_construction: !!int 200
```

**Step 2: Execute the C++ builder**

```bash
./build/bin/local_build_original build.yaml
```

**Step 3: Query from Python**

```python
import zvec
zvec.init()
coll = zvec.open("./data/vecs/train.index")
hits = coll.search([[0.01]*128], topk=10, ef_search=150, metric=zvec.MetricType.COSINE)
print(hits)
```

## Summary

- ZVec supports **Flat**, **HNSW**, and **IVF** index types through the `IndexFactory::CreateAndInitIndex` method in `src/core/interface/index_factory.cc`.
- Index creation follows a three-stage pipeline: **Configuration** (`IndexParam`), **Building** (`local_builder.cc` and `IndexStreamer`), and **Querying** (`IndexStreamer::open`).
- Optimization strategies include **quantization** (`Int8`, `PQ`, `OPQ`), **multi-threading** via `ailego::ThreadPool`, **reformers** for vector transformation, and **storage layout** tuning (`MMapFileStorage` vs `kBufferPool`).
- The Python API in [`python/zvec/zvec.py`](https://github.com/alibaba/zvec/blob/main/python/zvec/zvec.py) exposes `rebuild_index` and `search` methods with parameters like `ef_construction` and `ef_search` for fine-grained control.

## Frequently Asked Questions

### What index types does Alibaba ZVec support?

ZVec supports three primary index types defined in the `IndexType` enum within [`src/include/zvec/core/interface/index_param.h`](https://github.com/alibaba/zvec/blob/main/src/include/zvec/core/interface/index_param.h) (lines 56–64): **Flat** (`kFlat`) for exact brute-force search, **HNSW** (`kHNSW`) for graph-based, low-latency approximate nearest neighbor search, and **IVF** (`kIVF`) for inverted file indexes optimized for billion-scale datasets. The `IndexFactory::CreateAndInitIndex` method in `src/core/interface/index_factory.cc` instantiates the appropriate implementation class based on the configuration.

### How do I enable quantization when creating a vector index in ZVec?

Enable quantization by setting the `ConverterName` field in your YAML configuration to `Int8`, `PQ`, or `OPQ`. This parameter maps to the `QuantizerParam` struct defined in [`src/include/zvec/core/interface/index_param.h`](https://github.com/alibaba/zvec/blob/main/src/include/zvec/core/interface/index_param.h) (lines 86–124). When `NeedTrain` is set to `true`, the builder executes a training phase using `TrainerIndexPath` to learn quantization codebooks before the main build process begins in `local_builder.cc`.

### What is the difference between ef_construction and ef_search in ZVec HNSW indexes?

`ef_construction` controls the quality of the HNSW graph during the build phase and is specified in `BuilderParams` within your YAML configuration (e.g., `proxima.hnsw.builder.ef_construction: 200`). Higher values make the builder evaluate more candidate neighbors per insertion, which yields a higher-quality graph and better recall at the cost of slower builds. Conversely, `ef_search` is a query-time parameter in `HNSWQueryParam` that determines the size of the dynamic candidate list during search. It is serialized via `IndexFactory::QueryParamSerializeToJson` (lines 141–150) and exposed in Python as the `ef_search` argument to `collection.search()`, allowing per-query tuning of the recall-latency trade-off.

### How does ZVec handle multi-threading during index construction?

ZVec leverages `ailego::ThreadPool` defined in [`src/include/zvec/ailego/parallel/thread_pool.h`](https://github.com/alibaba/zvec/blob/main/src/include/zvec/ailego/parallel/thread_pool.h) to parallelize vector ingestion. The `do_build_sparse_by_streamer` function in `tools/core/local_builder.cc` (lines 252–260) distributes vector IDs across the thread pool, where each thread optionally applies a reformer and feeds vectors to `streamer->add_impl`. The `ThreadCount` parameter in the YAML configuration controls the degree of parallelism, with linear speed-up typically observed until memory bandwidth becomes the bottleneck.