Implementing Hybrid Search with Semantic Similarity and Filters in Alibaba zvec

Hybrid search in Alibaba zvec combines dense semantic vectors with sparse lexical vectors using multiple VectorQuery objects, pre-filters candidates via Boolean expressions, and fuses results through RrfReRanker or WeightedReRanker to return the most relevant documents.

Hybrid search merges the contextual understanding of dense embeddings with the precision of sparse term matching. In Alibaba's zvec vector database, this capability is exposed through the Collection.query interface in python/zvec/model/collection.py, which accepts multiple VectorQuery objects—each representing a different embedding space—while supporting scalar field filters to narrow the candidate pool before similarity scoring.

How Hybrid Search Works in zvec

The architecture follows a multi-stage pipeline. First, embedding functions convert the query text into dense and sparse vector representations. These vectors are packaged into VectorQuery objects that target specific fields in the collection. The Collection.query method dispatches these queries to the underlying engine, which retrieves candidates for each vector type. Finally, a reranker such as RrfReRanker or WeightedReRanker fuses the per-field result lists into a single ranked output.

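Several classes in the zvec Python SDK work together to enable hybrid search:

  • Collection: exposes the query method that accepts the vector queries, the filter expression, and the reranker.
  • VectorQuery: pairs one query vector with the collection field it targets (imported from zvec.model.param.vector_query).
  • DefaultLocalDenseEmbedding and DefaultLocalSparseEmbedding: produce the dense and SPLADE-based sparse query vectors.
  • BM25EmbeddingFunction: an alternative lexical sparse embedder.
  • RrfReRanker and WeightedReRanker: fuse the per-field result lists into a single ranking.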

Step-by-Step Implementation

Creating Dense and Sparse Embeddings

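First, instantiate the embedding functions. The dense embedder, DefaultLocalDenseEmbedding, uses sentence-transformers (MiniLM-L6-v2 by default). The default sparse embedder, DefaultLocalSparseEmbedding, is SPLADE-based; BM25EmbeddingFunction is available if you prefer BM25 term weights.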

from zvec.extension import (
    DefaultLocalDenseEmbedding,
    DefaultLocalSparseEmbedding,
)

# Dense semantic embedding
dense_emb = DefaultLocalDenseEmbedding()

# Sparse lexical embedding (SPLADE-based)
sparse_emb = DefaultLocalSparseEmbedding()

query_text = "machine learning algorithms for recommendation"
dense_vec = dense_emb.embed(query_text)    # Returns list[float]
sparse_vec = sparse_emb.embed(query_text)  # Returns dict[int, float]
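
To make the two return types concrete, here is a minimal pure-Python sketch of how a dense list[float] and a sparse dict[int, float] are typically compared. It does not call zvec and is not the engine's implementation; the helper names cosine_similarity and sparse_dot, and the toy vectors, are illustrative assumptions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def sparse_dot(a: dict[int, float], b: dict[int, float]) -> float:
    """Dot product of two sparse vectors keyed by term index."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller dict
    return sum(weight * b[index] for index, weight in a.items() if index in b)


# Toy vectors (hypothetical values, not real model output)
dense_query = [0.1, 0.3, 0.5]
dense_doc = [0.2, 0.6, 1.0]           # same direction as the query
sparse_query = {101: 0.8, 2045: 0.5}  # term index -> weight
sparse_doc = {101: 0.4, 7: 0.9}       # shares only term 101 with the query

dense_score = cosine_similarity(dense_query, dense_doc)
sparse_score = sparse_dot(sparse_query, sparse_doc)
```

The dense score rewards vectors that point in the same direction even when no terms overlap, while the sparse score is nonzero only for shared term indices. That difference is why the two signals complement each other.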

Building VectorQuery Objects

Wrap each vector in a VectorQuery targeting its respective field in the collection schema.

from zvec.model.param.vector_query import VectorQuery

dense_q = VectorQuery(field_name="dense", vector=dense_vec)
sparse_q = VectorQuery(field_name="sparse", vector=sparse_vec)

Applying Scalar Filters

Use the filter parameter in Collection.query to pre-filter candidates using Boolean expressions on scalar fields before vector similarity is computed.

# Filter expression syntax supports ==, !=, <, >, <=, >=, AND, OR, NOT
filter_expr = "category == 'technology' AND publish_year >= 2023"

results = collection.query(
    vectors=[dense_q, sparse_q],
    topk=20,
    filter=filter_expr,
)
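
The effect of pre-filtering can be modeled in a few lines of plain Python. This is a conceptual sketch only: in zvec the filter string is evaluated by the engine, whereas here the same condition is written as a Python predicate, and the Doc class, l2_distance, and prefiltered_search are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    id: str
    category: str
    publish_year: int
    vector: list[float]


def l2_distance(a: list[float], b: list[float]) -> float:
    """Euclidean distance between two dense vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def prefiltered_search(docs, query_vec, predicate, topk):
    """Keep only docs passing the predicate, then rank survivors by distance."""
    candidates = [d for d in docs if predicate(d)]
    candidates.sort(key=lambda d: l2_distance(d.vector, query_vec))
    return candidates[:topk]


docs = [
    Doc("a", "technology", 2024, [0.9, 0.1]),
    Doc("b", "technology", 2021, [1.0, 0.0]),   # nearest vector, but too old
    Doc("c", "cooking", 2024, [0.95, 0.05]),    # near, but wrong category
    Doc("d", "technology", 2023, [0.2, 0.8]),
]

# Python equivalent of: category == 'technology' AND publish_year >= 2023
matches = prefiltered_search(
    docs,
    query_vec=[1.0, 0.0],
    predicate=lambda d: d.category == "technology" and d.publish_year >= 2023,
    topk=2,
)
ids = [d.id for d in matches]
```

Documents "b" and "c" are closer to the query than "d", yet they never reach the similarity stage because the filter removes them first.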

Fusing Results with Rerankers

When multiple VectorQuery objects are provided, the engine returns per-field candidate lists. Use WeightedReRanker for explicit weighting or RrfReRanker for rank-based fusion without manual tuning.

from zvec.extension import WeightedReRanker, RrfReRanker

# Option A: Weighted fusion (normalize scores and apply weights)
weighted_reranker = WeightedReRanker(
    topn=20,
    metric="L2",  # Must match the distance metric used by the index
    weights={"dense": 0.7, "sparse": 0.3},
)

# Option B: Reciprocal Rank Fusion (no weights needed)
rrf_reranker = RrfReRanker(topn=20, rank_constant=60)

results = collection.query(
    vectors=[dense_q, sparse_q],
    topk=20,
    filter=filter_expr,
    reranker=weighted_reranker,  # or rrf_reranker
)
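
To show what weighted fusion does with the per-field lists, the following sketch combines two toy result lists by hand. It is an illustration under stated assumptions, not WeightedReRanker's actual code: the normalization 1 / (1 + distance) is one common way to map an L2 distance to a similarity in (0, 1], and both fields are treated as returning distances for simplicity. The function names are hypothetical.

```python
def distance_to_similarity(distance: float) -> float:
    """Map an L2 distance to (0, 1]; assumed normalization, smaller distance scores higher."""
    return 1.0 / (1.0 + distance)


def weighted_fuse(results, weights, topn):
    """Sum weight * normalized score per document across fields, best first."""
    fused = {}
    for field, hits in results.items():
        weight = weights[field]
        for doc_id, distance in hits:
            fused[doc_id] = fused.get(doc_id, 0.0) + weight * distance_to_similarity(distance)
    ranked = sorted(fused.items(), key=lambda item: item[1], reverse=True)
    return ranked[:topn]


# Per-field hits as (doc_id, distance) pairs; values are made up
per_field = {
    "dense": [("doc1", 0.0), ("doc2", 1.0)],
    "sparse": [("doc2", 0.0), ("doc3", 3.0)],
}

fused = weighted_fuse(per_field, weights={"dense": 0.7, "sparse": 0.3}, topn=3)
fused_ids = [doc_id for doc_id, _ in fused]
```

A document that appears in only one list contributes nothing from the other field, so raising a field's weight directly raises the ranking of documents that field favors.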

Complete Code Example

Here is a complete implementation combining dense semantic search, sparse lexical search, scalar filtering, and weighted reranking:

from zvec import Collection
from zvec.model.param.vector_query import VectorQuery
from zvec.extension import (
    DefaultLocalDenseEmbedding,
    DefaultLocalSparseEmbedding,
    WeightedReRanker,
)

# Initialize collection and embedding functions
collection = Collection.load("product_catalog")
dense_emb = DefaultLocalDenseEmbedding()
sparse_emb = DefaultLocalSparseEmbedding()

# Encode query
query = "wireless noise cancelling headphones"
dense_vec = dense_emb.embed(query)
sparse_vec = sparse_emb.embed(query)

# Build vector queries targeting different fields
dense_q = VectorQuery(field_name="dense_desc", vector=dense_vec)
sparse_q = VectorQuery(field_name="sparse_desc", vector=sparse_vec)

# Define scalar filter for pre-filtering
filter_expr = "category == 'electronics' AND price < 300"

# Configure weighted reranker (70% semantic, 30% lexical)
reranker = WeightedReRanker(
    topn=10,
    metric="L2",
    weights={"dense_desc": 0.7, "sparse_desc": 0.3},
)

# Execute hybrid search
results = collection.query(
    vectors=[dense_q, sparse_q],
    topk=10,
    filter=filter_expr,
    reranker=reranker,
    include_vector=False,
)

# Display results
for doc in results:
    print(f"ID: {doc.id}, Score: {doc.score:.4f}")

Key Source Files

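The hybrid search implementation relies on these specific source files in the Alibaba zvec repository:

  • python/zvec/model/collection.py: the Collection.query interface.
  • python/zvec/extension/multi_vector_reranker.py: RrfReRanker and WeightedReRanker.
  • python/zvec/extension/sentence_transformer_embedding_function.py: DefaultLocalDenseEmbedding and DefaultLocalSparseEmbedding.
  • python/zvec/extension/bm25_embedding_function.py: BM25EmbeddingFunction.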

Summary

  • Hybrid search in zvec combines dense semantic vectors with sparse lexical vectors to balance contextual understanding with exact term matching.
  • Use VectorQuery objects to target different fields in a single Collection.query() call, as implemented in python/zvec/model/collection.py.
  • Apply scalar filters via the filter parameter using Boolean expressions to pre-filter candidates before vector similarity scoring.
  • Fuse multi-field results using RrfReRanker (Reciprocal Rank Fusion) or WeightedReRanker (score-based weighting) from python/zvec/extension/multi_vector_reranker.py.
  • The modular architecture supports swapping embedding backends—including DefaultLocalDenseEmbedding, DefaultLocalSparseEmbedding, and BM25EmbeddingFunction—without changing the core search logic.

Frequently Asked Questions

What is the difference between RrfReRanker and WeightedReRanker?

RrfReRanker uses Reciprocal Rank Fusion, which aggregates documents based on their rank positions across multiple fields without requiring score normalization or manual weights. Each document's fused score is the sum, over the result lists it appears in, of 1 / (k + rank), where k is the rank constant and rank is the document's position in that list. WeightedReRanker normalizes raw similarity scores (such as L2 distances) to a common scale and applies user-defined weights to each field, giving explicit control over the influence of semantic versus lexical signals.
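
A short worked example makes the rank-based formula concrete. This is a generic sketch of Reciprocal Rank Fusion, not RrfReRanker's source: the function name rrf_fuse is hypothetical, and ranks are assumed to start at 1 (some implementations start at 0).

```python
def rrf_fuse(ranked_lists, rank_constant=60, topn=None):
    """Sum 1 / (rank_constant + rank) for each document across all ranked lists."""
    scores = {}
    for doc_ids in ranked_lists:
        for rank, doc_id in enumerate(doc_ids, start=1):  # assumed 1-based ranks
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (rank_constant + rank)
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked if topn is None else ranked[:topn]


# Toy per-field rankings, best hit first
dense_hits = ["doc_a", "doc_b", "doc_c"]
sparse_hits = ["doc_b", "doc_d", "doc_a"]

fused = rrf_fuse([dense_hits, sparse_hits], rank_constant=60, topn=3)
fused_ids = [doc_id for doc_id, _ in fused]
```

Here doc_b scores 1/62 + 1/61 and doc_a scores 1/61 + 1/63, so doc_b ranks first even though it tops only one list. Documents found by both fields rise above those found by one, and no raw scores are compared across fields.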

How do I filter results before applying vector similarity?

Pass a Boolean expression string to the filter parameter in Collection.query. The expression syntax supports comparison operators (==, !=, <, >, <=, >=) and logical operators (AND, OR, NOT) on scalar fields. The underlying C++ engine evaluates this filter to pre-select candidates before computing vector similarities, so fewer vectors need to be scored and every returned document satisfies the filter.

Can I query more than two vector fields in a single search?

Yes. The vectors parameter in Collection.query accepts a list of VectorQuery objects, allowing you to query any number of fields simultaneously. You can combine multiple dense fields (such as title and description embeddings) with multiple sparse fields (such as BM25 on different text columns). The reranker will fuse all provided result lists into a single ranked output regardless of the number of fields.

Which embedding functions does zvec provide?

The library provides DefaultLocalDenseEmbedding for sentence-transformer models (MiniLM-L6-v2 by default) producing dense vectors, and DefaultLocalSparseEmbedding for SPLADE-based sparse vectors, both in python/zvec/extension/sentence_transformer_embedding_function.py. For lexical sparse embeddings, BM25EmbeddingFunction is available in python/zvec/extension/bm25_embedding_function.py. The modular design allows you to implement custom embedding functions by adhering to the base interface.
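
As a rough idea of what a custom sparse embedder could look like, the sketch below exposes the same embed(text) call used earlier and returns a dict[int, float]. The base interface itself is not shown in this article, so the class is written standalone; its name, the hashing scheme, and the raw term-count weights are all assumptions for illustration, and a real implementation would need to follow zvec's base class.

```python
import re
import zlib


class HashedTermSparseEmbedding:
    """Hypothetical sparse embedder: hashes each token to an index and counts occurrences."""

    def __init__(self, num_buckets: int = 2**18):
        self.num_buckets = num_buckets

    def embed(self, text: str) -> dict[int, float]:
        weights: dict[int, float] = {}
        for token in re.findall(r"[a-z0-9]+", text.lower()):
            # crc32 is stable across runs, unlike Python's built-in hash() for str
            index = zlib.crc32(token.encode("utf-8")) % self.num_buckets
            weights[index] = weights.get(index, 0.0) + 1.0
        return weights


embedder = HashedTermSparseEmbedding()
vec = embedder.embed("wireless wireless headphones")
```

Because the output has the same shape as the sparse vectors shown earlier, a vector like this could in principle be wrapped in a VectorQuery against a sparse field, provided the documents in that field were indexed with the same embedder.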
