How ZVec's SQL Query Engine Works: From Filter Strings to Arrow Execution Plans

February 16, 2026 · alibaba/zvec

ZVec's SQL query engine compiles high-level vector search requests into optimized Apache Arrow compute graphs through a three-stage pipeline: parsing filters into SQLInfo trees, analyzing and rewriting into QueryInfo with intelligent index selection, and planning Arrow Acero execution nodes for efficient ANN search.

The alibaba/zvec repository implements a high-performance vector database that processes SQL-like queries using Apache Arrow's compute engine. Understanding how ZVec's SQL query engine works reveals how the system optimizes vector similarity search with complex predicate filtering.

Parsing and Normalization: From Text to SQLInfo

The query processing begins in src/db/sqlengine/parser/zvec_parser.cc, where the ZVecParser class converts textual filter strings into structured representations.

When a user provides a filter like "age > 30 AND title LIKE '%engineer%'", the parser invokes ANTLR-generated grammar rules defined in zvec_sql_parser.h to build an abstract syntax tree (AST). The ZVecParser::parse_filter() method validates the syntax and normalizes literals by trimming quotes and converting numeric values.

The AST is then transformed into a SQLInfo object via SQLInfoHelper::MessageToSQLInfo. This object records the SQL type (SELECT, INSERT, etc.) and maintains a pointer to the top-level base info (typically a SelectInfo structure). This normalized form serves as the canonical input for the analysis phase.
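To make the parse-and-normalize step concrete, here is a minimal sketch in plain Python. It is not ZVec's ANTLR grammar: the `Predicate` and `BoolNode` classes, the token regex, and this `parse_filter` function are hypothetical stand-ins that only handle simple comparisons joined by AND/OR. The point is to show quotes being trimmed, numbers being converted, and malformed input being rejected before any later stage runs.

```python
import re
from dataclasses import dataclass
from typing import Union

# Hypothetical stand-ins for a parsed filter tree; not ZVec's real types.
@dataclass
class Predicate:
    field: str
    op: str
    value: Union[int, float, str]

@dataclass
class BoolNode:
    op: str          # "AND" or "OR"
    left: "Node"
    right: "Node"

Node = Union[Predicate, BoolNode]

TOKEN_RE = re.compile(
    r"\s*(?:(?P<str>'[^']*')"
    r"|(?P<num>-?\d+(?:\.\d+)?)"
    r"|(?P<op>>=|<=|!=|=|>|<)"
    r"|(?P<word>\w+))"
)

def tokenize(text):
    """Split the filter into (kind, text) pairs, rejecting unknown characters."""
    text, pos, tokens = text.rstrip(), 0, []
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if not m:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens

def normalize_literal(kind, text):
    """Trim quotes from strings; convert numeric text to int or float."""
    if kind == "str":
        return text[1:-1]
    return float(text) if "." in text else int(text)

def parse_filter(text):
    tokens = tokenize(text)
    pos = 0

    def keyword():
        # Upper-cased word at the cursor, or None at end of input.
        if pos < len(tokens) and tokens[pos][0] == "word":
            return tokens[pos][1].upper()
        return None

    def predicate():
        nonlocal pos
        if len(tokens) - pos < 3:
            raise ValueError("incomplete predicate")
        (fk, name), (ok, op), (vk, raw) = tokens[pos:pos + 3]
        if ok == "word" and op.upper() == "LIKE":
            ok, op = "op", "LIKE"
        if fk != "word" or ok != "op" or vk not in ("str", "num"):
            raise ValueError(f"malformed predicate near {name!r}")
        pos += 3
        return Predicate(name, op, normalize_literal(vk, raw))

    def conjunction():
        nonlocal pos
        node = predicate()
        while keyword() == "AND":
            pos += 1
            node = BoolNode("AND", node, predicate())
        return node

    def disjunction():
        nonlocal pos
        node = conjunction()
        while keyword() == "OR":
            pos += 1
            node = BoolNode("OR", node, conjunction())
        return node

    tree = disjunction()
    if pos != len(tokens):
        raise ValueError("trailing tokens after filter")
    return tree

tree = parse_filter("age > 30 AND title LIKE '%engineer%'")
print(tree)
```

AND binds tighter than OR here because `disjunction` is built out of `conjunction` calls; a real grammar would also cover parentheses, IN lists, and the other operators the article mentions.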

Analysis and Rewriting: Optimizing the Query Plan

The second stage occurs in src/db/sqlengine/analyzer/query_analyzer.cc, where the QueryAnalyzer class transforms SQLInfo into QueryInfo—a rich structure containing vector conditions, filter conditions, forward conditions, invert conditions, order-by clauses, and top-k parameters.

The QueryAnalyzer::analyze() method orchestrates several critical transformations:

- Vector condition extraction via check_and_convert_vector: This verifies the vector field exists, extracts dense or sparse vector text, and populates QueryInfo::QueryVectorCondInfo with the query vector and search parameters.
- QueryNode tree construction via create_querynode_from_node: This converts the SelectInfo tree into an executable QueryNode tree representing the logical query plan.
- Filter-vs-Invert decision via decide_filter_index_cond: This rule-based optimizer inspects every predicate to determine execution strategy:
  - Invertible predicates (equality matches on indexed fields) become invert-cond candidates, leveraging inverted indexes for candidate narrowing.
  - Forward-filter predicates scan raw forward fields when no suitable index exists.
  - Post-filter predicates apply after vector search when post_filter_topk is configured.
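The routing rules above can be sketched as a small classifier. This is an illustration only: the `Predicate` class, the `route_predicates` function, and the exact precedence between forward and post filtering are assumptions made for the example, not the logic of decide_filter_index_cond.

```python
from dataclasses import dataclass

# Hypothetical predicate record; not a ZVec type.
@dataclass(frozen=True)
class Predicate:
    field: str
    op: str
    value: object

def route_predicates(predicates, indexed_fields, post_filter_topk=0):
    """Split predicates into invert / forward / post buckets (toy rules)."""
    routed = {"invert": [], "forward": [], "post": []}
    for pred in predicates:
        if pred.op == "=" and pred.field in indexed_fields:
            # Equality on an indexed field can be answered by an inverted index.
            routed["invert"].append(pred)
        elif post_filter_topk > 0:
            # Assumption: with post-filtering enabled, the rest run after recall.
            routed["post"].append(pred)
        else:
            # Otherwise the predicate is evaluated against raw forward fields.
            routed["forward"].append(pred)
    return routed

preds = [Predicate("category", "=", "books"), Predicate("age", ">", 30)]
print(route_predicates(preds, indexed_fields={"category"}))
```

Only `category = 'books'` lands in the invert bucket, because it is an equality match and `category` is listed as indexed; `age > 30` falls back to a forward scan unless `post_filter_topk` is set.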

The analyzer rejects unsupported constructs early, such as a vector clause nested beneath an OR, so that they fail during analysis rather than at execution time.
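One way to picture this early rejection is a recursive walk that remembers whether any ancestor is an OR. The tuple-based tree shape and the names `vector_under_or` and `validate` are hypothetical, chosen only to keep the sketch short; ZVec's QueryNode tree is a C++ structure with its own layout.

```python
# Toy tree encoding (an assumption for this sketch):
#   ("AND" | "OR", child, child, ...)
#   ("VECTOR", field_name)
#   ("PRED", field_name, op, value)

def vector_under_or(node, under_or=False):
    """Return True if any vector clause has an OR among its ancestors."""
    kind = node[0]
    if kind == "VECTOR":
        return under_or
    if kind in ("AND", "OR"):
        return any(
            vector_under_or(child, under_or or kind == "OR")
            for child in node[1:]
        )
    return False  # plain predicates never violate the rule

def validate(tree):
    """Raise during analysis instead of failing later at execution time."""
    if vector_under_or(tree):
        raise ValueError("vector clause may not appear under OR")
    return tree

ok = ("AND", ("VECTOR", "emb"),
      ("OR", ("PRED", "age", ">", 30), ("PRED", "age", "<", 10)))
bad = ("AND", ("PRED", "age", ">", 30),
       ("OR", ("VECTOR", "emb"), ("PRED", "category", "=", "books")))

validate(ok)                     # passes: the OR contains only scalar predicates
print(vector_under_or(bad))      # the vector clause sits beneath an OR
```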

Planning and Execution: Building the Arrow Compute Pipeline

The final stage in src/db/sqlengine/planner/query_planner.cc converts QueryInfo into PlanInfo—a tree of Apache Arrow compute operators executable by Arrow Acero.

The QueryPlanner::make_plan() method constructs the execution graph through these steps:

- Scan node selection: Based on QueryInfo flags, the planner chooses between:
  - VectorRecallNode: Executes approximate nearest-neighbor (ANN) algorithms like HNSW or IVF on the vector field.
  - InvertRecallNode: Uses inverted indexes to narrow candidate sets before vector search.
  - SegmentNode: Performs forward scans on raw segment data when indexes are unavailable.
- Expression compilation: The planner builds Arrow cp::Expression objects from the filter tree via create_filter_node. These compile into Arrow kernels—such as is_in, list_value_length, and custom contain_all/any operations—and attach to scan nodes.
- Pipeline construction: The final execution graph forms an Acero pipeline:

  SegmentNode → (optional) InvertRecallNode → VectorRecallNode → FilterOps → FetchVectorOp → Project

- Execution: PlanInfo::execute_to_reader() launches the pipeline, returning an Arrow RecordBatchReader that streams results as RecordBatch objects.
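The shape of such a pipeline can be imitated with Python generators, where each stage pulls batches from the one before it. This is a conceptual sketch, not Arrow Acero: the stage functions `scan`, `filter_stage`, `recall_stage`, and `project` are hypothetical, rows are plain dicts, and an exact squared-L2 sort stands in for ANN recall.

```python
from itertools import islice

def scan(rows, batch_size):
    """Source stage: yield the segment's rows in fixed-size batches."""
    it = iter(rows)
    while batch := list(islice(it, batch_size)):
        yield batch

def filter_stage(batches, predicate):
    """Streaming stage: keep matching rows, drop batches that become empty."""
    for batch in batches:
        kept = [row for row in batch if predicate(row)]
        if kept:
            yield kept

def recall_stage(batches, query, k):
    """Blocking stage: score every candidate, emit the k closest as one batch."""
    scored = []
    for batch in batches:
        for row in batch:
            dist = sum((a - b) ** 2 for a, b in zip(row["emb"], query))
            scored.append({**row, "score": dist})
    scored.sort(key=lambda r: r["score"])
    yield scored[:k]

def project(batches, columns):
    """Final stage: keep only the requested columns."""
    for batch in batches:
        yield [{c: row[c] for c in columns} for row in batch]

rows = [
    {"doc_id": 1, "category": "books", "age": 25, "emb": [0.0, 0.0]},
    {"doc_id": 2, "category": "toys",  "age": 40, "emb": [0.1, 0.1]},
    {"doc_id": 3, "category": "books", "age": 35, "emb": [1.0, 1.0]},
    {"doc_id": 4, "category": "books", "age": 50, "emb": [0.2, 0.1]},
]

# scan -> candidate narrowing -> vector recall -> post-filter -> project
narrowed = filter_stage(scan(rows, batch_size=2), lambda r: r["category"] == "books")
recalled = recall_stage(narrowed, query=[0.0, 0.0], k=2)
filtered = filter_stage(recalled, lambda r: r["age"] >= 30)
plan = project(filtered, columns=["doc_id", "score"])

results = [row for batch in plan for row in batch]
print(results)
```

Nothing runs until `plan` is iterated, which loosely mirrors how a reader pulls batches from an execution plan. In this toy run the post-filter removes one of the two recalled rows, which illustrates why filtering after recall can return fewer than k results.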

Result Materialization

After execution, SQLEngineImpl::fill_result iterates over the RecordBatchReader, allocating a Doc object for each row. Type-specific helpers like fill_doc_vector<float> and fill_doc_field<arrow::Int64Array> copy Arrow column data into ZVec's internal Doc representation.

The materialization process attaches doc-id, score, user-id, and any selected fields, ultimately returning a DocPtrList to the caller.
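The column-to-row conversion can be illustrated without Arrow. In this sketch each batch is a dict mapping column names to equal-length lists, and the `Doc` dataclass and this `fill_result` function are simplified, hypothetical stand-ins for ZVec's Doc and SQLEngineImpl::fill_result.

```python
from dataclasses import dataclass, field

# Simplified stand-in for ZVec's Doc; the real class carries more metadata.
@dataclass
class Doc:
    doc_id: int
    score: float
    fields: dict = field(default_factory=dict)

def fill_result(batches, selected_fields):
    """Turn columnar batches into one Doc per row."""
    docs = []
    for batch in batches:                      # one columnar batch at a time
        num_rows = len(batch["doc_id"])
        for i in range(num_rows):              # row i is spread across columns
            docs.append(Doc(
                doc_id=batch["doc_id"][i],
                score=batch["score"][i],
                fields={name: batch[name][i] for name in selected_fields},
            ))
    return docs

batches = [
    {"doc_id": [7, 9], "score": [0.12, 0.30], "title": ["a", "b"]},
    {"doc_id": [4],    "score": [0.41],       "title": ["c"]},
]
docs = fill_result(batches, selected_fields=["title"])
print(docs[0])
```

The per-column helpers named in the article play the role of the `batch[name][i]` lookups here, with one specialization per Arrow array type.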

Code Examples

Python: Simple Vector Search

import zvec
from zvec import CollectionSchema, VectorSchema, DataType, HnswQueryParam

# 1. Initialise the library (once per process)
zvec.init(log_type=zvec.LogType.CONSOLE, log_level=zvec.LogLevel.INFO)

# 2. Define a collection schema with a 128-dimensional FP32 vector field
schema = CollectionSchema(
    name="my_collection",
    vectors=[VectorSchema(name="emb", dimension=128,
                          data_type=DataType.VECTOR_FP32)]
)

# 3. Open the collection (assumes it already exists)
coll = zvec.open("./my_collection", schema)

# 4. Build a VectorQuery (dense FP32 vector)
query = zvec.VectorQuery(
    field_name="emb",
    vector=[0.1] * 128,               # 128-dim vector
    param=HnswQueryParam(k=10)        # top-10 ANN
)

# 5. Execute the query
results = coll.query(query)            # returns List[Doc]

for doc in results:
    print(f"doc_id={doc.doc_id}, score={doc.score}")

Behind the scenes, coll.query invokes the C++ SQLEngineImpl::execute, which runs the three-stage pipeline described above.

Python: SQL-like Filter with Vector Search


# Same collection as before
filter_str = "age >= 30 AND title LIKE '%engineer%'"
query = zvec.VectorQuery(
    field_name="emb",
    vector=[0.2] * 128,
    filter=filter_str,                 # textual filter gets parsed by ZVecParser
    param=HnswQueryParam(k=5)
)

results = coll.query(query)

# The filter is transformed into Arrow expressions and applied
# before/after the ANN search depending on index availability.

C++: Direct Engine Usage

#include <iostream>
#include <memory>
#include <vector>

#include <zvec/db/sqlengine/sqlengine.h>
#include <zvec/db/doc.h>

using namespace zvec::sqlengine;

int main() {
    // 1. Create engine (profiler is optional)
    auto engine = SQLEngine::create(nullptr);

    // 2. Prepare collection schema & vector query
    CollectionSchema::Ptr coll = ...;          // obtained from DB metadata
    VectorQuery query;
    query.field_name_ = "emb";
    query.query_vector_ = std::make_shared<std::vector<float>>(128, 0.3f);
    query.topk_ = 10;
    query.filter_ = "category = 'books'";

    // 3. Load segments (each segment = a data file)
    std::vector<Segment::Ptr> segs = LoadSegments(...);

    // 4. Execute; stop on error rather than reading an empty result
    auto res = engine->execute(coll, query, segs);
    if (!res) {
        std::cerr << "query failed\n";         // handle error
        return 1;
    }
    for (auto &doc_ptr : res.value()) {
        std::cout << "doc_id=" << doc_ptr->doc_id()
                  << " score=" << doc_ptr->score() << "\n";
    }
    return 0;
}

Key Source Files

| File | Purpose |
| --- | --- |
| `src/db/sqlengine/sqlengine.h` | Abstract `SQLEngine` interface |
| `src/db/sqlengine/sqlengine_impl.h` | Concrete implementation (`SQLEngineImpl`) |
| `src/db/sqlengine/sqlengine_impl.cc` | Core orchestration: parsing, planning, result materialisation |
| `src/db/sqlengine/parser/zvec_parser.cc` | ANTLR-based filter parser → `SQLInfo` |
| `src/db/sqlengine/analyzer/query_analyzer.cc` | Transforms `SQLInfo` into `QueryInfo`, decides index/forward/post filters |
| `src/db/sqlengine/planner/query_planner.h` & `.cc` | Generates Arrow execution plan (`PlanInfo`) |
| `src/db/sqlengine/planner/vector_recall_node.h` | Vector ANN recall node (HNSW, IVF, etc.) |
| `src/db/sqlengine/planner/invert_recall_node.h` | Inverted-index based candidate narrowing |
| `src/db/sqlengine/planner/segment_node.h` | Reads a segment and produces Arrow record batches |
| `src/db/sqlengine/planner/ops/*` | Arrow compute operators for `IN`, `LIKE`, `CONTAIN` etc. |

These files together constitute the SQL query engine that turns a textual filter and a vector query into an optimized Arrow compute pipeline, enabling fast ANN search with optional predicate push-down.

Summary

- ZVec's SQL query engine processes vector search requests through a three-stage pipeline: parsing, analysis, and planning/execution.
- The ZVecParser in zvec_parser.cc converts filter strings into SQLInfo trees using ANTLR-generated grammars.
- The QueryAnalyzer in query_analyzer.cc transforms SQLInfo into QueryInfo, extracting vector conditions and deciding between inverted-index filters, forward scans, and post-filters.
- The QueryPlanner in query_planner.cc builds a PlanInfo execution graph using Arrow Acero operators, combining VectorRecallNode, InvertRecallNode, and SegmentNode into a streaming pipeline.
- Results are materialized from Arrow RecordBatchReader into ZVec's Doc objects via type-specific helpers.

Frequently Asked Questions

How does ZVec parse SQL-like filter strings?

ZVec parses filter strings with the ZVecParser class in src/db/sqlengine/parser/zvec_parser.cc, which drives ANTLR-generated grammar rules (declared in zvec_sql_parser.h) to tokenize the text and build an abstract syntax tree (AST). The parser validates syntax and normalizes literals, and the AST is then converted into a SQLInfo object that represents the query structure. This process handles complex expressions including LIKE, IN, and logical operators.

What is the difference between invert-cond and forward-filter in ZVec?

Invert-cond predicates are those that can leverage ZVec's inverted indexes, typically equality matches on indexed fields that narrow the candidate set before vector search. Forward-filter predicates scan raw forward field data when no suitable index exists, reading values directly from segments. The QueryAnalyzer in query_analyzer.cc makes this decision via decide_filter_index_cond, routing predicates to the most efficient execution path based on index availability and predicate type.
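The difference between the two paths is easiest to see side by side. The sketch below is a toy model in plain Python with hypothetical function names: an inverted index is modelled as a dict from field value to row ids, and a forward scan as a loop over every row. ZVec's actual index structures are not shown in this article and are certainly more elaborate.

```python
from collections import defaultdict

def build_inverted_index(rows, field):
    """Map each distinct value of `field` to the set of row ids holding it."""
    index = defaultdict(set)
    for row_id, row in enumerate(rows):
        index[row[field]].add(row_id)
    return index

def invert_lookup(index, value):
    """Equality match: a single lookup, no row is touched."""
    return set(index.get(value, ()))

def forward_scan(rows, predicate):
    """Any predicate: every row is read and tested."""
    return {row_id for row_id, row in enumerate(rows) if predicate(row)}

rows = [
    {"category": "books", "age": 25},
    {"category": "toys",  "age": 40},
    {"category": "books", "age": 35},
]
category_index = build_inverted_index(rows, "category")

via_index = invert_lookup(category_index, "books")
via_scan = forward_scan(rows, lambda r: r["category"] == "books")
print(via_index == via_scan)

# A range predicate has no entry to look up in this toy index,
# so it takes the forward path.
print(forward_scan(rows, lambda r: r["age"] > 30))
```

Both paths return the same candidate set for the equality predicate; the index simply avoids reading every row to get there.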

How does ZVec execute vector similarity search?

Vector similarity search executes through the VectorRecallNode defined in src/db/sqlengine/planner/vector_recall_node.h. During the planning phase, the QueryPlanner instantiates this node with the query vector and ANN parameters (such as HNSW or IVF configurations). At execution time, Arrow Acero streams data through the node, which performs approximate nearest neighbor search against the indexed vector segments, returning top-k candidates with similarity scores.
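As a rough illustration of returning top-k candidates with scores, the sketch below scores vectors by inner product within each segment and then merges the per-segment hits. Two things here are assumptions rather than facts from the source: exact brute-force scoring stands in for an ANN index such as HNSW or IVF, and the per-segment-then-merge strategy is a common design that the article does not confirm for ZVec. The function names are hypothetical.

```python
import heapq

def segment_topk(segment, query, k):
    """Exact top-k by inner product inside one segment (stand-in for ANN)."""
    scored = (
        (sum(a * b for a, b in zip(vec, query)), doc_id)
        for doc_id, vec in segment
    )
    return heapq.nlargest(k, scored)

def merge_topk(per_segment_hits, k):
    """Combine per-segment hits into one global top-k list."""
    all_hits = (hit for hits in per_segment_hits for hit in hits)
    return heapq.nlargest(k, all_hits)

seg_a = [(1, [1.0, 0.0]), (2, [0.0, 1.0])]
seg_b = [(3, [0.5, 0.5]), (4, [0.9, 0.1])]
query = [1.0, 0.0]

hits = merge_topk([segment_topk(s, query, k=2) for s in (seg_a, seg_b)], k=2)
for score, doc_id in hits:
    print(f"doc_id={doc_id}, score={score}")
```

Keeping only k hits per segment before merging is safe for exact search, since a document outside its own segment's top k cannot be in the global top k.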

Can I use ZVec's SQL engine directly from C++?

Yes, the C++ API in src/db/sqlengine/sqlengine.h exposes the SQLEngine interface for direct integration. You can create an engine instance via SQLEngine::create(), prepare a VectorQuery with field names, query vectors, and filter strings, then call engine->execute() with your collection schema and segment list. This returns a DocPtrList containing document IDs, scores, and field values without requiring the Python wrapper.
