ZVec Hybrid Search Capabilities: Dense-Sparse Retrieval with Built-In Rerankers

February 16, 2026 | alibaba/zvec

ZVec provides first-class hybrid search that unifies dense semantic similarity and sparse lexical matching through native C++ data structures, dual Python embedding functions, and built-in rerankers including Reciprocal Rank Fusion (RRF) and weighted linear fusion.

Alibaba's ZVec vector database engine combines semantic and keyword retrieval in a single engine. It stores dense and sparse representations together in one HybridVector container and exposes them through a unified Python SDK, so BM25 lexical matching and transformer-based semantic similarity can be used in the same query without external orchestration.

Core Architecture of Hybrid Search in ZVec

ZVec implements hybrid search as a native storage engine feature rather than an external composition layer. The architecture rests on specialized C++ templates that treat dense and sparse components as a unified logical vector.

HybridVector Container

The foundation of ZVec's hybrid search capabilities rests on the HybridVector<T> class defined in src/include/zvec/ailego/container/vector.h. This template extends NumericalVector to store both a dense floating-point array and a sparse dictionary mapping token indices to weights.

When you ingest a document, the Python SDK automatically packs the outputs from your dense and sparse embedding functions into this dual-structure container. This co-located storage eliminates network round-trips during retrieval and ensures that both representations remain synchronized throughout the index lifecycle.
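The dual structure described above can be pictured with a small Python model. This is only an illustrative sketch: HybridVectorSketch and its methods are hypothetical names, not part of ZVec, and the real container is the C++ template.

```python
from dataclasses import dataclass, field


@dataclass
class HybridVectorSketch:
    """Illustrative stand-in for a dense + sparse container (not ZVec's class)."""

    dense: list[float]
    sparse: dict[int, float] = field(default_factory=dict)

    def dense_dot(self, other: "HybridVectorSketch") -> float:
        # Inner product over the dense arrays; dimensions must match.
        if len(self.dense) != len(other.dense):
            raise ValueError("dense dimensions differ")
        return sum(a * b for a, b in zip(self.dense, other.dense))

    def sparse_dot(self, other: "HybridVectorSketch") -> float:
        # Walk the smaller dictionary; only shared token indices contribute.
        small, large = sorted((self.sparse, other.sparse), key=len)
        return sum(w * large[i] for i, w in small.items() if i in large)


doc = HybridVectorSketch(dense=[0.5, 0.5, 0.0], sparse={3: 2.0, 17: 1.0})
query = HybridVectorSketch(dense=[1.0, 0.0, 0.0], sparse={17: 0.5, 42: 1.0})
print(doc.dense_dot(query), doc.sparse_dot(query))
```

Keeping both halves on one object is what lets a scorer read the dense and sparse components of a document together instead of looking them up in two places.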

IndexHybridHolder Framework

To manage collections of hybrid vectors, ZVec provides IndexHybridHolder in src/include/zvec/core/framework/index_holder.h. This holder extends the standard IndexHolder interface with hybrid-specific capabilities:

total_sparse_count() returns the aggregate number of non-zero sparse entries across all vectors, enabling the engine to pre-allocate traversal buffers.
create_hybrid_iterator() generates an iterator that simultaneously exposes both dense and sparse components during index scans.

This architecture allows ZVec to execute hybrid queries within a single index traversal pass, rather than requiring separate dense and sparse index lookups that must be joined externally.
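To make the single-pass idea concrete, here is a toy Python holder that mirrors the two capabilities listed above. HybridHolderSketch is a hypothetical name; the method names are borrowed from the description, but the signatures and behavior are assumptions rather than the C++ API.

```python
from typing import Iterator

Row = tuple[str, list[float], dict[int, float]]


class HybridHolderSketch:
    """Toy holder for (key, dense, sparse) rows; not the C++ IndexHybridHolder."""

    def __init__(self) -> None:
        self._rows: list[Row] = []

    def add(self, key: str, dense: list[float], sparse: dict[int, float]) -> None:
        self._rows.append((key, dense, sparse))

    def total_sparse_count(self) -> int:
        # Aggregate number of non-zero sparse entries across all rows.
        return sum(len(sparse) for _, _, sparse in self._rows)

    def create_hybrid_iterator(self) -> Iterator[Row]:
        # Each step exposes the dense and sparse parts of one row together.
        yield from self._rows


holder = HybridHolderSketch()
holder.add("a", [1.0, 0.0], {1: 0.5, 7: 1.5})
holder.add("b", [0.0, 1.0], {7: 2.0})

query_dense, query_sparse = [1.0, 0.0], {7: 1.0}
scores = {}
for key, dense, sparse in holder.create_hybrid_iterator():
    # One pass over the rows yields both partial scores for each document.
    dense_score = sum(x * y for x, y in zip(dense, query_dense))
    sparse_score = sum(w * sparse.get(i, 0.0) for i, w in query_sparse.items())
    scores[key] = (dense_score, sparse_score)
```

The loop visits each row once and produces both scores, which is the property the single-traversal design is meant to provide.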

Metric Constraints for Hybrid Vectors

ZVec enforces strict metric compatibility for hybrid search operations. The MipsEuclideanMetric implementation in src/core/metric/mips_euclidean_metric.cc explicitly validates that hybrid vectors use only inner-product or Euclidean distance metrics for the dense component.

This constraint ensures mathematical consistency when fusing dense similarity scores with sparse lexical scores, preventing metric mismatches that could distort the final ranking during hybrid retrieval.
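A validation step of this kind might look like the following Python sketch. The metric names and the check_hybrid_metric function are hypothetical and chosen for illustration; the actual check lives in the C++ metric implementation and may differ in detail.

```python
# Metric names are illustrative labels, not ZVec configuration values.
ALLOWED_HYBRID_METRICS = {"inner_product", "euclidean"}


def check_hybrid_metric(metric: str, has_sparse: bool) -> None:
    """Reject metrics that are not allowed once a sparse component is present."""
    if has_sparse and metric not in ALLOWED_HYBRID_METRICS:
        allowed = ", ".join(sorted(ALLOWED_HYBRID_METRICS))
        raise ValueError(
            f"metric {metric!r} is not supported for hybrid vectors; use one of: {allowed}"
        )


check_hybrid_metric("inner_product", has_sparse=True)  # accepted
check_hybrid_metric("cosine", has_sparse=False)  # dense-only, not restricted here
```

Failing early at configuration time is cheaper than discovering a metric mismatch after an index has been built.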

Python SDK for Hybrid Retrieval

ZVec exposes its hybrid search capabilities through a Python SDK that abstracts the C++ internals while providing flexible embedding and reranking options.

Dual Embedding Functions

The SDK ships with complementary embedding functions in python/zvec/extension/ that generate the two halves of a hybrid vector:

Dense semantic vectors: DefaultLocalDenseEmbedding in python/zvec/extension/sentence_transformer_embedding_function.py produces transformer-based vectors (default 384 dimensions). Alternatively, QwenDenseEmbedding in python/zvec/extension/qwen_embedding_function.py provides LLM-based dense representations.
Sparse lexical vectors: BM25EmbeddingFunction in python/zvec/extension/bm25_embedding_function.py generates lexical vectors using DashText BM25, outputting dict[int, float] sparse representations. For learned sparse approaches, QwenSparseEmbedding in the same Qwen module provides alternative sparse encodings.

When you pass a dictionary containing both "dense" and "sparse" keys to the SDK, ZVec automatically constructs the underlying HybridVector<T> container.
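The shape of that dictionary can be shown without any ZVec dependency. The sketch below uses a toy hashed term-frequency encoder, which is an assumption made for illustration: it is not BM25 and not one of the SDK's embedding functions, and toy_sparse_embed is a hypothetical name.

```python
import zlib
from collections import Counter


def toy_sparse_embed(text: str, buckets: int = 2**20) -> dict[int, float]:
    """Map whitespace tokens to hashed indices with raw term counts as weights."""
    counts = Counter(text.lower().split())
    sparse: dict[int, float] = {}
    for token, count in counts.items():
        # crc32 is deterministic across runs, unlike the built-in hash().
        index = zlib.crc32(token.encode("utf-8")) % buckets
        sparse[index] = sparse.get(index, 0.0) + float(count)
    return sparse


payload = {
    "dense": [0.1, 0.2, 0.3],  # placeholder for a real dense embedding
    "sparse": toy_sparse_embed("vector search beats keyword search"),
}
```

The sparse half is a dict[int, float], matching the output type described for BM25EmbeddingFunction, while the dense half is a fixed-length list of floats.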

Inserting Hybrid Documents

Adding hybrid-enabled documents to a collection follows the standard ZVec pattern, with the SDK handling the dual-vector packing:

from zvec import Collection
from zvec.extension import DefaultLocalDenseEmbedding, BM25EmbeddingFunction

dense = DefaultLocalDenseEmbedding()
sparse = BM25EmbeddingFunction(language="zh")

def hybrid_embed(text: str):
    return {
        "dense": dense.embed(text),
        "sparse": sparse.embed(text)
    }

coll = Collection(name="articles")
docs = [
    {"id": "1", "vectors": hybrid_embed("机器学习是人工智能的一个分支")},
    {"id": "2", "vectors": hybrid_embed("深度学习使用神经网络进行特征提取")},
]
coll.insert(docs)

The Doc objects created during insertion validate that both vector components are present, as demonstrated in python/tests/test_doc.py.

Querying with Built-In Rerankers

ZVec simplifies hybrid retrieval by providing built-in rerankers that fuse dense and sparse scores automatically. The Collection.query() method accepts a reranker parameter that controls the fusion strategy:

reranker="rrf": Applies Reciprocal Rank Fusion, computing score = Σ 1/(k + rank) with k=60 by default.
reranker="weighted": Performs linear interpolation, final = α * dense_score + (1 - α) * sparse_score, where α is set by the alpha parameter (default 0.5).


# Hybrid query using Reciprocal Rank Fusion
results = coll.query(
    query=hybrid_embed("机器学习算法"),
    top_k=10,
    reranker="rrf",
)

# Hybrid query using weighted fusion with custom alpha
results = coll.query(
    query=hybrid_embed("神经网络训练"),
    top_k=10,
    reranker="weighted",
    alpha=0.7,  # favor dense semantic similarity
)

These rerankers are exercised in the test suite at python/tests/test_collection.py, ensuring consistent behavior across dense-sparse fusion scenarios.

Advanced Fusion Techniques

For applications requiring custom scoring logic, ZVec exposes the individual dense and sparse result streams, enabling manual fusion outside the built-in rerankers.

Reciprocal Rank Fusion Internals

When you specify reranker="rrf", ZVec computes the fused score by summing the reciprocal ranks from both retrieval paths:


score(d) = 1/(k + rank_dense(d)) + 1/(k + rank_sparse(d))

The constant k (default 60) dampens the advantage of top-ranked items, so a single first-place result in one retrieval path cannot dominate the fused score. Because RRF uses only ranks, it needs no training data and no score normalization, and it performs robustly across diverse query types, which makes it a common default choice for zero-shot hybrid retrieval.
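The formula can be worked through in a few lines of plain Python. This is a minimal sketch of generic Reciprocal Rank Fusion, not ZVec's internal code: rrf_fuse is a hypothetical helper, ranks start at 1, and a document missing from one list simply receives no contribution from that list, which is a common convention assumed here.

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked lists of document ids with Reciprocal Rank Fusion."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for the documents it contains.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


dense_ranking = ["a", "b", "c"]   # best dense match first
sparse_ranking = ["b", "d", "a"]  # best sparse match first
fused = rrf_fuse([dense_ranking, sparse_ranking])
print([doc_id for doc_id, _ in fused])
```

In this example "b" comes out on top because it sits near the head of both lists, while "a" is first in one list but only third in the other.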

Weighted Linear Fusion Parameters

The weighted reranker treats the dense and sparse scores as independent relevance signals and combines them as a convex combination (linear interpolation):


final_score = α * dense_score + (1 - α) * sparse_score

Setting alpha > 0.5 prioritizes semantic similarity, while alpha < 0.5 emphasizes exact lexical matches. The method only works when the two scores are on comparable scales; ZVec handles the required normalization internally when you use the built-in implementation.
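The effect of alpha and of normalization can be seen in a standalone sketch. The source does not say which normalization ZVec applies, so min-max scaling is an assumption here, and min_max and weighted_fuse are hypothetical helpers rather than SDK functions.

```python
def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale scores to [0, 1]; a constant score set maps to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def weighted_fuse(
    dense: dict[str, float], sparse: dict[str, float], alpha: float = 0.5
) -> list[tuple[str, float]]:
    """Blend normalized scores as alpha * dense + (1 - alpha) * sparse."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    d, s = min_max(dense), min_max(sparse)
    fused = {
        # A document missing from one side contributes 0 for that side.
        doc_id: alpha * d.get(doc_id, 0.0) + (1 - alpha) * s.get(doc_id, 0.0)
        for doc_id in d.keys() | s.keys()
    }
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


dense_scores = {"a": 0.9, "b": 0.5, "c": 0.1}    # e.g. similarity-style scores
sparse_scores = {"b": 12.0, "c": 4.0, "d": 8.0}  # e.g. BM25-style scores
ranked = weighted_fuse(dense_scores, sparse_scores, alpha=0.7)
```

Without the rescaling step the BM25-style values, which are an order of magnitude larger, would swamp the dense scores regardless of alpha.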

Manual Score Blending

For complete control, retrieve the dense and sparse rankings separately and apply domain-specific fusion:


# Retrieve dense semantic matches
dense_hits = coll.query(
    query={"dense": dense.embed("机器学习算法")},
    top_k=10,
    reranker=None,  # pure dense ranking
)

# Retrieve sparse lexical matches
sparse_hits = coll.query(
    query={"sparse": sparse.embed("机器学习算法")},
    top_k=10,
    reranker=None,  # pure sparse (BM25) ranking
)

# Custom fusion: weighted blend of the raw scores.
# Assumes each result maps doc_id -> {"score": ...}; normalize the scores
# first if the dense and sparse scales differ.
alpha = 0.6
final_scores = {}
for doc_id in set(dense_hits) | set(sparse_hits):
    d_score = dense_hits.get(doc_id, {}).get("score", 0.0)
    s_score = sparse_hits.get(doc_id, {}).get("score", 0.0)
    final_scores[doc_id] = alpha * d_score + (1 - alpha) * s_score

# Keep the ten highest blended scores
ranked = sorted(final_scores.items(), key=lambda x: x[1], reverse=True)[:10]

This pattern leverages the fact that ZVec's Collection.query() accepts partial hybrid vectors (dense-only or sparse-only), allowing independent retrieval streams that you can fuse using application-specific logic.

Summary

ZVec delivers comprehensive hybrid search capabilities through a tightly integrated architecture that unifies dense semantic and sparse lexical retrieval:

Native C++ Integration: The HybridVector<T> container in src/include/zvec/ailego/container/vector.h and IndexHybridHolder in src/include/zvec/core/framework/index_holder.h store both vector types as a single logical unit, enabling single-pass index traversal.
Dual Embedding Ecosystem: Python functions in python/zvec/extension/ provide ready-to-use dense transformers (DefaultLocalDenseEmbedding, QwenDenseEmbedding) and sparse lexical models (BM25EmbeddingFunction, QwenSparseEmbedding).
Built-In Rerankers: The query API supports reranker="rrf" for Reciprocal Rank Fusion and reranker="weighted" for linear interpolation, eliminating the need for external reranking pipelines.
Metric Safety: The engine enforces inner-product or Euclidean metrics for hybrid vectors via MipsEuclideanMetric in src/core/metric/mips_euclidean_metric.cc, ensuring consistent similarity computation across dense and sparse components.

Frequently Asked Questions

What makes ZVec's hybrid search different from other vector databases?

Unlike solutions that orchestrate separate dense and sparse indexes externally, ZVec implements hybrid search as a native storage engine feature. The HybridVector<T> container and IndexHybridHolder framework store both representations in a single logical vector, allowing the C++ core to traverse both components in one index pass. This eliminates network round-trips and join operations required by loosely coupled hybrid systems.

Which embedding functions does ZVec support for hybrid retrieval?

ZVec provides complementary embedding functions in python/zvec/extension/ to generate hybrid vectors. For dense semantic embeddings, you can use DefaultLocalDenseEmbedding (sentence-transformer based) or QwenDenseEmbedding (LLM-based). For sparse lexical embeddings, ZVec offers BM25EmbeddingFunction (DashText BM25) and QwenSparseEmbedding for learned sparse representations. These functions output the dual-structure dictionaries that the SDK automatically packs into HybridVector<T> containers.

Can I customize the fusion weights in ZVec hybrid search?

Yes, ZVec exposes multiple strategies for combining dense and sparse scores. You can use reranker="rrf" for training-free Reciprocal Rank Fusion (its only parameter is the constant k, which defaults to 60), or reranker="weighted" with an alpha parameter to control linear interpolation between the two signals. For complete customization, retrieve dense and sparse results separately using reranker=None and apply domain-specific fusion logic in Python, relying on the fact that Collection.query() accepts partial hybrid vectors.

What metrics are supported for the dense component of hybrid vectors?

ZVec enforces strict metric constraints through the MipsEuclideanMetric class in src/core/metric/mips_euclidean_metric.cc. The dense component of a hybrid vector must use either inner-product or Euclidean distance metrics. This restriction ensures mathematical consistency when the engine fuses dense similarity scores with sparse lexical scores, preventing metric mismatches that could distort the final hybrid ranking.
