Implementing Hybrid Search with Semantic Similarity and Filters in Alibaba zvec
Hybrid search in Alibaba zvec combines dense semantic vectors with sparse lexical vectors using multiple VectorQuery objects, pre-filters candidates via Boolean expressions, and fuses results through RrfReRanker or WeightedReRanker to return the most relevant documents.
Hybrid search merges the contextual understanding of dense embeddings with the precision of sparse term matching. In Alibaba's zvec vector database, this capability is exposed through the Collection.query interface in python/zvec/model/collection.py, which accepts multiple VectorQuery objects—each representing a different embedding space—while supporting scalar field filters to narrow the candidate pool before similarity scoring.
How Hybrid Search Works in zvec
The architecture follows a multi-stage pipeline. First, embedding functions convert the query text into dense and sparse vector representations. These vectors are packaged into VectorQuery objects that target specific fields in the collection. The Collection.query method dispatches these queries to the underlying engine, which retrieves candidates for each vector type. Finally, a reranker such as RrfReRanker or WeightedReRanker fuses the per-field result lists into a single ranked output.
Core Components for Hybrid Search
Several classes in the zvec Python SDK work together to enable hybrid search:
- Collection – The primary interface in python/zvec/model/collection.py, providing the query() method that orchestrates multi-vector searches and filter application.
- VectorQuery – Defined in python/zvec/model/param/vector_query.py; encapsulates a single vector search request, including the field name, vector data, and index-specific parameters like HnswQueryParam or IVFQueryParam.
- Embedding Functions – DefaultLocalDenseEmbedding and DefaultLocalSparseEmbedding in python/zvec/extension/sentence_transformer_embedding_function.py handle dense and SPLADE-based sparse encoding, while BM25EmbeddingFunction in python/zvec/extension/bm25_embedding_function.py provides lexical sparse vectors.
- Rerankers – RrfReRanker and WeightedReRanker in python/zvec/extension/multi_vector_reranker.py implement Reciprocal Rank Fusion and weighted score aggregation for combining results from multiple vector fields.
Step-by-Step Implementation
Creating Dense and Sparse Embeddings
First, instantiate the embedding functions. The dense embedder uses sentence-transformers (MiniLM-L6-v2 by default). Sparse vectors can come either from the SPLADE-based DefaultLocalSparseEmbedding shown here or from BM25EmbeddingFunction.
from zvec.extension import (
    DefaultLocalDenseEmbedding,
    DefaultLocalSparseEmbedding,
)
# Dense semantic embedding
dense_emb = DefaultLocalDenseEmbedding()
# Sparse lexical embedding (SPLADE-based)
sparse_emb = DefaultLocalSparseEmbedding()
query_text = "machine learning algorithms for recommendation"
dense_vec = dense_emb.embed(query_text)  # Returns list[float]
sparse_vec = sparse_emb.embed(query_text)  # Returns dict[int, float]
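The two return types behave quite differently when compared. As a minimal sketch that does not call zvec at all, the following pure-Python helpers show how a dense list[float] and a sparse dict[int, float] are typically scored. The function names and the toy values are illustrative assumptions, not model output or zvec internals.

```python
import math


def dense_cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def sparse_dot(a: dict[int, float], b: dict[int, float]) -> float:
    """Inner product of two sparse vectors stored as {dimension index: weight}."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller mapping
    return sum(w * b[i] for i, w in a.items() if i in b)


# Toy values for illustration only (hypothetical, not produced by a model).
dense_query = [0.6, 0.8, 0.0]
dense_doc = [0.6, 0.0, 0.8]
sparse_query = {101: 1.5, 2047: 0.5}
sparse_doc = {101: 2.0, 999: 1.0}

dense_score = dense_cosine(dense_query, dense_doc)
sparse_score = sparse_dot(sparse_query, sparse_doc)
print(dense_score, sparse_score)
```

Only dimensions present in both sparse vectors contribute to the sparse score, which is why sparse matching rewards exact term overlap while the dense score reflects overall direction in the embedding space.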
Building VectorQuery Objects
Wrap each vector in a VectorQuery targeting its respective field in the collection schema.
from zvec.model.param.vector_query import VectorQuery
dense_q = VectorQuery(field_name="dense", vector=dense_vec)
sparse_q = VectorQuery(field_name="sparse", vector=sparse_vec)
Applying Scalar Filters
Use the filter parameter in Collection.query to pre-filter candidates using Boolean expressions on scalar fields before vector similarity is computed.
# Filter expression syntax supports ==, !=, <, >, <=, >=, AND, OR, NOT
filter_expr = "category == 'technology' AND publish_year >= 2023"
results = collection.query(
    vectors=[dense_q, sparse_q],
    topk=20,
    filter=filter_expr,
)
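To make the order of operations concrete, here is a small pure-Python sketch of pre-filtering followed by nearest-neighbor ranking. It does not use zvec: Doc, matches, and prefiltered_search are hypothetical names, and the predicate simply hand-translates the filter expression above into Python.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    id: str
    category: str
    publish_year: int
    vector: list[float]


def matches(doc: Doc) -> bool:
    """Hand-written equivalent of: category == 'technology' AND publish_year >= 2023."""
    return doc.category == "technology" and doc.publish_year >= 2023


def l2_squared(a: list[float], b: list[float]) -> float:
    """Squared Euclidean distance (smaller means more similar)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def prefiltered_search(docs: list[Doc], query_vec: list[float], topk: int) -> list[str]:
    # Step 1: drop documents that fail the scalar filter.
    candidates = [d for d in docs if matches(d)]
    # Step 2: rank only the surviving candidates by distance.
    candidates.sort(key=lambda d: l2_squared(d.vector, query_vec))
    return [d.id for d in candidates[:topk]]


docs = [
    Doc("a", "technology", 2024, [0.9, 0.1]),
    Doc("b", "technology", 2021, [1.0, 0.0]),  # nearest vector, but too old
    Doc("c", "sports", 2024, [1.0, 0.0]),      # nearest vector, wrong category
    Doc("d", "technology", 2023, [0.0, 1.0]),
]
hits = prefiltered_search(docs, [1.0, 0.0], topk=2)
print(hits)
```

Documents "b" and "c" sit exactly on the query vector yet never reach the ranking step, which is the behavior a pre-filter is meant to guarantee.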
Fusing Results with Rerankers
When multiple VectorQuery objects are provided, the engine returns per-field candidate lists. Use WeightedReRanker for explicit weighting or RrfReRanker for rank-based fusion without manual tuning.
from zvec.extension import WeightedReRanker, RrfReRanker
# Option A: Weighted fusion (normalize scores and apply weights)
weighted_reranker = WeightedReRanker(
    topn=20,
    metric="L2",  # Must match the distance metric used by the index
    weights={"dense": 0.7, "sparse": 0.3},
)
# Option B: Reciprocal Rank Fusion (no weights needed)
rrf_reranker = RrfReRanker(topn=20, rank_constant=60)
results = collection.query(
    vectors=[dense_q, sparse_q],
    topk=20,
    filter=filter_expr,
    reranker=weighted_reranker,  # or rrf_reranker
)
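For intuition about what rank-based fusion does with the per-field lists, the following is a minimal sketch of textbook Reciprocal Rank Fusion in plain Python. It is not zvec's implementation: rrf_fuse is a hypothetical helper, and starting ranks at 1 is an assumption that the library may or may not share.

```python
from collections import defaultdict


def rrf_fuse(ranked_lists, rank_constant=60, topn=10):
    """Fuse several ranked lists of document IDs with Reciprocal Rank Fusion.

    Each appearance of a document adds 1 / (rank_constant + rank) to its score,
    with ranks starting at 1 (an assumption of this sketch).
    """
    scores = defaultdict(float)
    for ranked in ranked_lists:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] += 1.0 / (rank_constant + rank)
    # Highest fused score first; break ties by ID for deterministic output.
    ordered = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ordered[:topn]


dense_hits = ["d1", "d2", "d3"]   # best-first results for the dense field
sparse_hits = ["d3", "d1", "d4"]  # best-first results for the sparse field
fused = rrf_fuse([dense_hits, sparse_hits], rank_constant=60, topn=3)
for doc_id, score in fused:
    print(doc_id, round(score, 5))
```

Because only rank positions are used, the raw scores of the two fields never need to be on the same scale, which is why this approach needs no weights or metric setting.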
Complete Code Example
Here is a complete implementation combining dense semantic search, sparse lexical search, scalar filtering, and weighted reranking:
from zvec import Collection
from zvec.model.param.vector_query import VectorQuery
from zvec.extension import (
    DefaultLocalDenseEmbedding,
    DefaultLocalSparseEmbedding,
    WeightedReRanker,
)
# Initialize collection and embedding functions
collection = Collection.load("product_catalog")
dense_emb = DefaultLocalDenseEmbedding()
sparse_emb = DefaultLocalSparseEmbedding()
# Encode query
query = "wireless noise cancelling headphones"
dense_vec = dense_emb.embed(query)
sparse_vec = sparse_emb.embed(query)
# Build vector queries targeting different fields
dense_q = VectorQuery(field_name="dense_desc", vector=dense_vec)
sparse_q = VectorQuery(field_name="sparse_desc", vector=sparse_vec)
# Define scalar filter for pre-filtering
filter_expr = "category == 'electronics' AND price < 300"
# Configure weighted reranker (70% semantic, 30% lexical)
reranker = WeightedReRanker(
    topn=10,
    metric="L2",
    weights={"dense_desc": 0.7, "sparse_desc": 0.3},
)
# Execute hybrid search
results = collection.query(
    vectors=[dense_q, sparse_q],
    topk=10,
    filter=filter_expr,
    reranker=reranker,
    include_vector=False,
)
# Display results
for doc in results:
    print(f"ID: {doc.id}, Score: {doc.score:.4f}")
Key Source Files
The hybrid search implementation relies on these specific source files in the Alibaba zvec repository:
- python/zvec/model/collection.py – Contains the Collection class and its query() method that orchestrates multi-vector searches and filter application.
- python/zvec/model/param/vector_query.py – Defines the VectorQuery dataclass used to package individual vector search requests with optional index-specific parameters.
- python/zvec/extension/sentence_transformer_embedding_function.py – Implements DefaultLocalDenseEmbedding and DefaultLocalSparseEmbedding for sentence-transformer based encoding.
- python/zvec/extension/bm25_embedding_function.py – Provides BM25EmbeddingFunction for lexical sparse embeddings.
- python/zvec/extension/multi_vector_reranker.py – Contains RrfReRanker and WeightedReRanker for fusing results from multiple vector fields.
- python/tests/test_collection.py and python/tests/test_doc.py – Unit tests demonstrating hybrid vector usage, filters, and rerankers.
Summary
- Hybrid search in zvec combines dense semantic vectors with sparse lexical vectors to balance contextual understanding with exact term matching.
- Use VectorQuery objects to target different fields in a single Collection.query() call, as implemented in python/zvec/model/collection.py.
- Apply scalar filters via the filter parameter using Boolean expressions to pre-filter candidates before vector similarity scoring.
- Fuse multi-field results using RrfReRanker (Reciprocal Rank Fusion) or WeightedReRanker (score-based weighting) from python/zvec/extension/multi_vector_reranker.py.
- The modular architecture supports swapping embedding backends, including DefaultLocalDenseEmbedding, DefaultLocalSparseEmbedding, and BM25EmbeddingFunction, without changing the core search logic.
Frequently Asked Questions
What is the difference between RrfReRanker and WeightedReRanker?
RrfReRanker uses Reciprocal Rank Fusion, which aggregates documents based on their rank positions across multiple fields without requiring score normalization or manual weights. Each time a document appears in a field's result list it contributes 1 / (k + rank), where k is the rank constant, and these contributions are summed across fields to give the fused score. WeightedReRanker normalizes raw similarity scores (such as L2 distances) to a common scale and applies user-defined weights to each field, giving explicit control over the influence of semantic versus lexical signals.
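The score-based alternative can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not zvec's code: normalize_l2 and weighted_fuse are hypothetical helpers, and min-max scaling is just one common way to put L2 distances on a shared scale; the library's actual normalization may differ.

```python
def normalize_l2(distances: dict[str, float]) -> dict[str, float]:
    """Map L2 distances to [0, 1] similarities (smaller distance -> larger score).

    Min-max scaling is an assumption of this sketch.
    """
    lo, hi = min(distances.values()), max(distances.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in distances}
    return {doc_id: (hi - d) / (hi - lo) for doc_id, d in distances.items()}


def weighted_fuse(per_field, weights, topn=10):
    """Combine per-field distances into one weighted similarity per document."""
    fused = {}
    for field, distances in per_field.items():
        for doc_id, sim in normalize_l2(distances).items():
            # A document missing from a field simply gets no contribution from it.
            fused[doc_id] = fused.get(doc_id, 0.0) + weights[field] * sim
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))[:topn]


# Toy per-field L2 distances keyed by document ID (hypothetical values).
per_field = {
    "dense": {"d1": 0.2, "d2": 0.5, "d3": 0.8},
    "sparse": {"d3": 1.0, "d1": 3.0, "d4": 5.0},
}
ranking = weighted_fuse(per_field, {"dense": 0.7, "sparse": 0.3}, topn=3)
for doc_id, score in ranking:
    print(doc_id, round(score, 3))
```

Unlike rank fusion, the outcome here depends on the magnitude of each distance, so changing the weights or the normalization scheme can reorder the results.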
How do I filter results before applying vector similarity?
Pass a Boolean expression string to the filter parameter in Collection.query. The expression syntax supports comparison operators (==, !=, <, >, <=, >=) and logical operators (AND, OR, NOT) on scalar fields. The underlying C++ engine evaluates this filter to pre-select candidates before computing vector similarities, so similarity scoring runs only over documents that satisfy the filter and every returned result meets the stated conditions.
Can I use more than two vector fields in a hybrid search?
Yes. The vectors parameter in Collection.query accepts a list of VectorQuery objects, allowing you to query any number of fields simultaneously. You can combine multiple dense fields (such as title and description embeddings) with multiple sparse fields (such as BM25 on different text columns). The reranker will fuse all provided result lists into a single ranked output regardless of the number of fields.
What embedding models does zvec support for hybrid search?
The library provides DefaultLocalDenseEmbedding for sentence-transformer models (MiniLM-L6-v2 by default) producing dense vectors, and DefaultLocalSparseEmbedding for SPLADE-based sparse vectors, both in python/zvec/extension/sentence_transformer_embedding_function.py. For lexical sparse embeddings, BM25EmbeddingFunction is available in python/zvec/extension/bm25_embedding_function.py. The modular design allows you to implement custom embedding functions by adhering to the base interface.
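As a rough sketch of what a custom sparse embedder could look like, the class below hashes tokens into a fixed number of buckets and returns term counts. The base interface is not shown in this article, so this assumes only what the earlier examples use: an embed(text) method that returns dict[int, float]. HashingSparseEmbedding and num_buckets are hypothetical names, and the class does not subclass anything from zvec.

```python
import re
import zlib
from collections import Counter


class HashingSparseEmbedding:
    """Hypothetical custom sparse embedder based on hashed term frequencies."""

    def __init__(self, num_buckets: int = 2**18):
        self.num_buckets = num_buckets

    def embed(self, text: str) -> dict[int, float]:
        # Lowercase and split on non-alphanumeric characters.
        tokens = re.findall(r"[a-z0-9]+", text.lower())
        # crc32 is deterministic across runs, unlike Python's built-in hash().
        counts = Counter(
            zlib.crc32(token.encode("utf-8")) % self.num_buckets for token in tokens
        )
        return {index: float(count) for index, count in counts.items()}


emb = HashingSparseEmbedding()
vec = emb.embed("wireless headphones, wireless charging")
print(vec)
```

The resulting dictionary has the same shape as the sparse vectors used earlier, so under the stated assumption it could be passed as the vector of a VectorQuery that targets a sparse field.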