# ZVec Hybrid Search Capabilities: Dense-Sparse Retrieval with Built-In Rerankers

> Explore ZVec's hybrid search: unify dense semantic and sparse lexical matching. Discover built-in rerankers like RRF and weighted fusion for superior retrieval.

- Repository: [Alibaba/zvec](https://github.com/alibaba/zvec)
- Tags: deep-dive
- Published: 2026-02-16

---

**ZVec provides first-class hybrid search that unifies dense semantic similarity and sparse lexical matching through native C++ data structures, dual Python embedding functions, and built-in rerankers including Reciprocal Rank Fusion (RRF) and weighted linear fusion.**

Alibaba's ZVec vector database engine delivers enterprise-grade hybrid search capabilities that eliminate the complexity of combining semantic and keyword retrieval. By embedding both dense and sparse representations into a single `HybridVector` container and exposing them through a unified Python SDK, ZVec enables high-recall lexical matching with BM25 alongside high-precision transformer-based similarity without external orchestration.

## Core Architecture of Hybrid Search in ZVec

ZVec implements hybrid search as a native storage engine feature rather than an external composition layer. The architecture rests on specialized C++ templates that treat dense and sparse components as a unified logical vector.

### HybridVector Container

The foundation of ZVec's hybrid search capabilities rests on the `HybridVector` class defined in [`src/include/zvec/ailego/container/vector.h`](https://github.com/alibaba/zvec/blob/main/src/include/zvec/ailego/container/vector.h). This template extends `NumericalVector` to store both a dense floating-point array and a sparse dictionary mapping token indices to weights. When you ingest a document, the Python SDK automatically packs the outputs from your dense and sparse embedding functions into this dual-structure container.
This co-located storage eliminates network round-trips during retrieval and ensures that both representations remain synchronized throughout the index lifecycle.

### IndexHybridHolder Framework

To manage collections of hybrid vectors, ZVec provides `IndexHybridHolder` in [`src/include/zvec/core/framework/index_holder.h`](https://github.com/alibaba/zvec/blob/main/src/include/zvec/core/framework/index_holder.h). This holder extends the standard `IndexHolder` interface with hybrid-specific capabilities:

- `total_sparse_count()` returns the aggregate number of non-zero sparse entries across all vectors, enabling the engine to pre-allocate traversal buffers.
- `create_hybrid_iterator()` generates an iterator that simultaneously exposes both dense and sparse components during index scans.

This architecture allows ZVec to execute hybrid queries within a single index traversal pass, rather than requiring separate dense and sparse index lookups that must be joined externally.

### Metric Constraints for Hybrid Vectors

ZVec enforces strict metric compatibility for hybrid search operations. The `MipsEuclideanMetric` implementation in `src/core/metric/mips_euclidean_metric.cc` explicitly validates that hybrid vectors use only **inner-product** or **Euclidean** distance metrics for the dense component. This constraint ensures mathematical consistency when fusing dense similarity scores with sparse lexical scores, preventing metric mismatches that could distort the final ranking during hybrid retrieval.

## Python SDK for Hybrid Retrieval

ZVec exposes its hybrid search capabilities through a Python SDK that abstracts the C++ internals while providing flexible embedding and reranking options.
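
As a mental model for the C++ structures described in the architecture section, the sketch below mirrors the two holder behaviours in plain Python. `ToyHybridVector` and `ToyHybridHolder` are hypothetical names invented for illustration; they are not part of ZVec, and the real templates will differ in memory layout and typing.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Tuple


@dataclass
class ToyHybridVector:
    """Hypothetical stand-in: a dense array plus a sparse index-to-weight map."""
    dense: List[float]
    sparse: Dict[int, float] = field(default_factory=dict)


class ToyHybridHolder:
    """Hypothetical holder that mimics only the two behaviours described above."""

    def __init__(self, vectors: List[ToyHybridVector]) -> None:
        self._vectors = vectors

    def total_sparse_count(self) -> int:
        # Aggregate number of non-zero sparse entries across all vectors.
        return sum(
            sum(1 for weight in v.sparse.values() if weight != 0.0)
            for v in self._vectors
        )

    def create_hybrid_iterator(self) -> Iterator[Tuple[List[float], Dict[int, float]]]:
        # Yield both components together so a scan visits each vector once.
        for v in self._vectors:
            yield v.dense, v.sparse


# Made-up sample data for illustration.
holder = ToyHybridHolder([
    ToyHybridVector([0.1, 0.2], {3: 1.5, 7: 0.4}),
    ToyHybridVector([0.3, 0.4], {3: 0.9}),
])
```

Knowing the total sparse entry count up front is what lets a scan size its buffers once instead of growing them per vector.
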
### Dual Embedding Functions

The SDK ships with complementary embedding functions in `python/zvec/extension/` that generate the two halves of a hybrid vector:

- **Dense semantic vectors**: `DefaultLocalDenseEmbedding` in [`python/zvec/extension/sentence_transformer_embedding_function.py`](https://github.com/alibaba/zvec/blob/main/python/zvec/extension/sentence_transformer_embedding_function.py) produces transformer-based vectors (default 384 dimensions). Alternatively, `QwenDenseEmbedding` in [`python/zvec/extension/qwen_embedding_function.py`](https://github.com/alibaba/zvec/blob/main/python/zvec/extension/qwen_embedding_function.py) provides LLM-based dense representations.
- **Sparse lexical vectors**: `BM25EmbeddingFunction` in [`python/zvec/extension/bm25_embedding_function.py`](https://github.com/alibaba/zvec/blob/main/python/zvec/extension/bm25_embedding_function.py) generates lexical vectors using DashText BM25, outputting `dict[int, float]` sparse representations. For learned sparse approaches, `QwenSparseEmbedding` in the same Qwen module provides alternative sparse encodings.

When you pass a dictionary containing both `"dense"` and `"sparse"` keys to the SDK, ZVec automatically constructs the underlying `HybridVector` container.
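
To make the two halves concrete, here is a minimal sketch of how a dense list and a `dict[int, float]` sparse map can each be scored with an inner product. The helper names `dense_dot` and `sparse_dot` and the sample values are assumptions made for illustration; this is not how ZVec's C++ core computes scores.

```python
from typing import Dict, List


def sparse_dot(a: Dict[int, float], b: Dict[int, float]) -> float:
    """Inner product of two sparse vectors keyed by token index."""
    # Iterate over the smaller map so the cost tracks the shorter vector.
    if len(a) > len(b):
        a, b = b, a
    return sum(weight * b[idx] for idx, weight in a.items() if idx in b)


def dense_dot(a: List[float], b: List[float]) -> float:
    """Inner product of two dense vectors of equal dimension."""
    if len(a) != len(b):
        raise ValueError("dense vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))


# Made-up values; real ones would come from the embedding functions.
doc = {"dense": [0.5, 0.5, 0.0], "sparse": {10: 2.0, 42: 1.0}}
query = {"dense": [1.0, 0.0, 0.0], "sparse": {42: 3.0, 99: 1.0}}

dense_score = dense_dot(query["dense"], doc["dense"])      # 0.5
sparse_score = sparse_dot(query["sparse"], doc["sparse"])  # 3.0
```

Only token index 42 appears in both sparse maps, so it alone contributes to the lexical score. The two scores also sit on very different scales, which is why a fusion step is needed.
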
### Inserting Hybrid Documents

Adding hybrid-enabled documents to a collection follows the standard ZVec pattern, with the SDK handling the dual-vector packing:

```python
from zvec import Collection
from zvec.extension import DefaultLocalDenseEmbedding, BM25EmbeddingFunction

dense = DefaultLocalDenseEmbedding()
sparse = BM25EmbeddingFunction(language="zh")

def hybrid_embed(text: str):
    # Build the two halves of a hybrid vector from the same input text.
    return {
        "dense": dense.embed(text),
        "sparse": sparse.embed(text)
    }

coll = Collection(name="articles")
docs = [
    # "Machine learning is a branch of artificial intelligence"
    {"id": "1", "vectors": hybrid_embed("机器学习是人工智能的一个分支")},
    # "Deep learning uses neural networks for feature extraction"
    {"id": "2", "vectors": hybrid_embed("深度学习使用神经网络进行特征提取")},
]
coll.insert(docs)
```

The `Doc` objects created during insertion validate that both vector components are present, as demonstrated in [`python/tests/test_doc.py`](https://github.com/alibaba/zvec/blob/main/python/tests/test_doc.py).

### Querying with Built-In Rerankers

ZVec simplifies hybrid retrieval by providing built-in rerankers that fuse dense and sparse scores automatically. The `Collection.query()` method accepts a `reranker` parameter that controls the fusion strategy:

- **`reranker="rrf"`**: Applies **Reciprocal Rank Fusion**, computing `score = Σ 1/(k + rank)` with `k=60` by default.
- **`reranker="weighted"`**: Performs linear interpolation `final = α * dense_score + (1 - α) * sparse_score`, where `alpha` defaults to 0.5.

```python
# Hybrid query using Reciprocal Rank Fusion
results = coll.query(
    query=hybrid_embed("机器学习算法"),  # "machine learning algorithms"
    top_k=10,
    reranker="rrf"
)

# Hybrid query using weighted fusion with custom alpha
results = coll.query(
    query=hybrid_embed("神经网络训练"),  # "neural network training"
    top_k=10,
    reranker="weighted",
    alpha=0.7  # favor dense semantic similarity
)
```

These rerankers are exercised in the test suite at [`python/tests/test_collection.py`](https://github.com/alibaba/zvec/blob/main/python/tests/test_collection.py), ensuring consistent behavior across dense-sparse fusion scenarios.
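
The RRF formula above can be sketched in a few lines of plain Python. This is a generic illustration of the technique, not ZVec's implementation: the function name `rrf_fuse` is hypothetical, ranks are assumed to start at 1, and a document missing from one list is assumed to contribute nothing for that list.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def rrf_fuse(rankings: Sequence[List[str]], k: int = 60) -> Dict[str, float]:
    """Reciprocal Rank Fusion: sum 1/(k + rank) over every ranked list."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Assumption: ranks are 1-based, so the top hit contributes 1/(k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return dict(scores)


# Made-up rankings from a dense and a sparse retrieval pass.
dense_ranking = ["a", "b", "c"]
sparse_ranking = ["b", "d", "a"]

fused = rrf_fuse([dense_ranking, sparse_ranking])
top = sorted(fused, key=fused.get, reverse=True)  # ['b', 'a', 'd', 'c']
```

Document `b` wins because it ranks near the top of both lists, even though `a` is first in the dense list. Only rank positions enter the formula, so the raw dense and sparse scores never need to be on the same scale.
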
## Advanced Fusion Techniques

For applications requiring custom scoring logic, ZVec exposes the individual dense and sparse result streams, enabling manual fusion outside the built-in rerankers.

### Reciprocal Rank Fusion Internals

When you specify `reranker="rrf"`, ZVec computes the fused score by summing the reciprocal ranks from both retrieval paths:

```
score = 1/(k + rank_dense) + 1/(k + rank_sparse)
```

The constant `k` (default 60) dampens the advantage of top-ranked items, so a single first-place result in one list cannot dominate the fused ranking. Because the formula uses only ranks, it requires no training data and performs robustly across diverse query types, making it a common default choice for zero-shot hybrid retrieval.

### Weighted Linear Fusion Parameters

The weighted reranker treats the dense and sparse scores as independent relevance signals, combining them through linear interpolation:

```
final_score = α * dense_score + (1 - α) * sparse_score
```

Setting `alpha > 0.5` prioritizes semantic similarity, while `alpha < 0.5` emphasizes exact lexical matches. This method assumes comparable score scales or requires normalization, which ZVec handles internally when using the built-in implementation.
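
The source does not say which normalization ZVec applies internally, so the following sketch assumes min-max scaling purely for illustration. The names `min_max` and `weighted_fuse` are hypothetical, and a document missing from one result set is assumed to score 0 for that signal.

```python
from typing import Dict


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; a constant set maps every entry to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def weighted_fuse(
    dense: Dict[str, float], sparse: Dict[str, float], alpha: float = 0.5
) -> Dict[str, float]:
    """final = alpha * dense + (1 - alpha) * sparse, on normalized scores."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    d, s = min_max(dense), min_max(sparse)
    # Assumption: a document absent from one result set scores 0 there.
    return {
        doc_id: alpha * d.get(doc_id, 0.0) + (1 - alpha) * s.get(doc_id, 0.0)
        for doc_id in d.keys() | s.keys()
    }


# Made-up scores on deliberately different scales.
dense_scores = {"a": 0.9, "b": 0.7, "c": 0.5}
sparse_scores = {"b": 12.0, "c": 4.0, "d": 8.0}

fused = weighted_fuse(dense_scores, sparse_scores, alpha=0.7)
best = max(fused, key=fused.get)  # 'a'
```

Without the normalization step, the BM25-style scores (up to 12.0 here) would swamp the dense similarities regardless of `alpha`.
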
### Manual Score Blending

For complete control, retrieve the dense and sparse rankings separately and apply domain-specific fusion:

```python
# Retrieve dense semantic matches
dense_hits = coll.query(
    query={"dense": dense.embed("机器学习算法")},  # "machine learning algorithms"
    top_k=10,
    reranker=None  # pure dense ranking
)

# Retrieve sparse lexical matches
sparse_hits = coll.query(
    query={"sparse": sparse.embed("机器学习算法")},
    top_k=10,
    reranker=None  # pure sparse (BM25) ranking
)

# Custom fusion: weighted blend of the raw scores.
# Assumes each result set behaves like a dict of doc_id -> {"score": ...};
# normalize the two score ranges first if their scales differ.
alpha = 0.6
final_scores = {}
for doc_id in set(dense_hits) | set(sparse_hits):
    d_score = dense_hits.get(doc_id, {}).get('score', 0)
    s_score = sparse_hits.get(doc_id, {}).get('score', 0)
    final_scores[doc_id] = alpha * d_score + (1 - alpha) * s_score

ranked = sorted(final_scores.items(), key=lambda x: x[1], reverse=True)[:10]
```

This pattern leverages the fact that ZVec's `Collection.query()` accepts partial hybrid vectors (dense-only or sparse-only), allowing independent retrieval streams that you can fuse using application-specific logic.

## Summary

ZVec delivers comprehensive hybrid search capabilities through a tightly integrated architecture that unifies dense semantic and sparse lexical retrieval:

- **Native C++ Integration**: The `HybridVector` container in [`src/include/zvec/ailego/container/vector.h`](https://github.com/alibaba/zvec/blob/main/src/include/zvec/ailego/container/vector.h) and `IndexHybridHolder` in [`src/include/zvec/core/framework/index_holder.h`](https://github.com/alibaba/zvec/blob/main/src/include/zvec/core/framework/index_holder.h) store both vector types as a single logical unit, enabling single-pass index traversal.
- **Dual Embedding Ecosystem**: Python functions in `python/zvec/extension/` provide ready-to-use dense transformers (`DefaultLocalDenseEmbedding`, `QwenDenseEmbedding`) and sparse lexical models (`BM25EmbeddingFunction`, `QwenSparseEmbedding`).
- **Built-In Rerankers**: The query API supports `reranker="rrf"` for Reciprocal Rank Fusion and `reranker="weighted"` for linear interpolation, eliminating the need for external reranking pipelines.
- **Metric Safety**: The engine enforces inner-product or Euclidean metrics for hybrid vectors via `MipsEuclideanMetric` in `src/core/metric/mips_euclidean_metric.cc`, ensuring consistent similarity computation across dense and sparse components.

## Frequently Asked Questions

### What makes ZVec's hybrid search different from other vector databases?

Unlike solutions that orchestrate separate dense and sparse indexes externally, ZVec implements hybrid search as a native storage engine feature. The `HybridVector` container and `IndexHybridHolder` framework store both representations in a single logical vector, allowing the C++ core to traverse both components in one index pass. This eliminates network round-trips and join operations required by loosely coupled hybrid systems.

### Which embedding functions does ZVec support for hybrid retrieval?

ZVec provides complementary embedding functions in `python/zvec/extension/` to generate hybrid vectors. For dense semantic embeddings, you can use `DefaultLocalDenseEmbedding` (sentence-transformer based) or `QwenDenseEmbedding` (LLM-based). For sparse lexical embeddings, ZVec offers `BM25EmbeddingFunction` (DashText BM25) and `QwenSparseEmbedding` for learned sparse representations. These functions output the dual-structure dictionaries that the SDK automatically packs into `HybridVector` containers.

### Can I customize the fusion weights in ZVec hybrid search?

Yes, ZVec exposes multiple strategies for combining dense and sparse scores. You can use `reranker="rrf"` for Reciprocal Rank Fusion, which needs no weight tuning beyond its default `k=60`, or `reranker="weighted"` with an `alpha` parameter to control linear interpolation between the two signals.
For complete customization, retrieve dense and sparse results separately using `reranker=None` and apply domain-specific fusion logic in Python, leveraging the fact that `Collection.query()` accepts partial hybrid vectors.

### What metrics are supported for the dense component of hybrid vectors?

ZVec enforces strict metric constraints through the `MipsEuclideanMetric` class in `src/core/metric/mips_euclidean_metric.cc`. The dense component of a hybrid vector must use either **inner-product** or **Euclidean** distance metrics. This restriction ensures mathematical consistency when the engine fuses dense similarity scores with sparse lexical scores, preventing metric mismatches that could distort the final hybrid ranking.