How to Use Feast's Vector Search Functionality for Embedding-Based Retrieval
Feast enables embedding-based retrieval by allowing you to mark FeatureView columns with vector_index=True, write dense vectors to supported online stores like PgVector or Milvus, and query for nearest neighbors using retrieve_online_documents_v2 or the FeastVectorStore wrapper.
Feast (feast-dev/feast) is an open-source feature store that now supports vector search functionality for embedding-based retrieval. This capability allows machine learning teams to store dense embeddings alongside traditional features and perform similarity search at low latency during inference.
Defining a Feature View with Vector Indexing
To enable vector search, you must define a FeatureView with a field marked for vector indexing. In sdk/python/feast/vector_store.py, Feast detects fields where vector_index=True and creates the appropriate ANN index in the configured online store.
Set vector_search_metric to specify the distance function. Common options include COSINE, L2 (Euclidean), and INNER_PRODUCT.
from datetime import timedelta

from feast import Entity, FeatureView, Field, FileSource
from feast.data_format import ParquetFormat
from feast.types import Array, Float32, String

# Entity keyed on the "id" column of the source data
product = Entity(name="product_id", join_keys=["id"])

# Offline source that holds the rows to be embedded
source = FileSource(
    file_format=ParquetFormat(),
    path="data/sample_data.parquet",
    timestamp_field="event_timestamp",
)

product_embeddings = FeatureView(
    name="product_embeddings",
    entities=[product],
    ttl=timedelta(days=30),
    schema=[
        # The vector column: indexed for similarity search with the L2 metric
        Field(
            name="embedding",
            dtype=Array(Float32),
            vector_index=True,
            vector_search_metric="L2",
        ),
        Field(name="name", dtype=String),
        Field(name="description", dtype=String),
    ],
    source=source,
    online=True,
)
After defining the view, run feast apply to create the vector index in the online store.
Writing Embeddings to the Online Store
Once the feature view is registered, populate the online store with embedding vectors. You can materialize historical data or write new embeddings directly using write_to_online_store.
As shown in examples/online_store/pgvector_tutorial/pgvector_example.py, generate embeddings using your preferred model (e.g., SentenceTransformers), then write the DataFrame to Feast:
import pandas as pd
from sentence_transformers import SentenceTransformer

from feast import FeatureStore

# Load data
df = pd.read_parquet("data/sample_data.parquet")

# Generate embeddings (one list of floats per row)
model = SentenceTransformer("all-MiniLM-L6-v2")
df["embedding"] = model.encode(df["description"].tolist()).tolist()

# Write to online store. The source's timestamp column (event_timestamp)
# is kept alongside the entity key and the feature columns.
store = FeatureStore(repo_path=".")
store.write_to_online_store(
    feature_view_name="product_embeddings",
    df=df[["id", "event_timestamp", "embedding", "name", "description"]],
)
Feast forwards these vectors to the underlying vector database (PgVector, Milvus, etc.), which builds the ANN index for fast retrieval.
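To make concrete what the ANN index speeds up, here is a minimal sketch of exact (brute-force) top-k search in plain Python. It is not Feast code and not how any of the backends are implemented. The function names and the toy data are hypothetical, and a real index trades a little accuracy to avoid scanning every row the way this loop does.

```python
import heapq
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_top_k(query, rows, k):
    """Return the k rows closest to `query` as (distance, id) pairs, nearest first."""
    scored = ((l2_distance(query, vec), row_id) for row_id, vec in rows.items())
    return heapq.nsmallest(k, scored)


# Hypothetical toy data: product id -> 3-dimensional embedding
rows = {
    1: [0.0, 0.0, 0.0],
    2: [1.0, 0.0, 0.0],
    3: [0.0, 3.0, 4.0],
}

nearest = brute_force_top_k([0.9, 0.0, 0.0], rows, k=2)
print([row_id for _, row_id in nearest])  # [2, 1]
```

The scan costs one distance computation per stored row, which is why a dedicated index matters once the table grows.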
Querying for Similar Vectors
Feast provides two APIs for embedding-based retrieval: the low-level SDK method retrieve_online_documents_v2 and the high-level FeastVectorStore wrapper.
Using the Low-Level SDK
The retrieve_online_documents_v2 method in sdk/python/feast/feature_store.py accepts a query vector, feature references, and top_k to return the nearest neighbors.
import numpy as np
from sentence_transformers import SentenceTransformer

from feast import FeatureStore

# Prepare query
query_text = "wireless headphones with great sound"
model = SentenceTransformer("all-MiniLM-L6-v2")
query_vec = model.encode([query_text])[0]

# Retrieve similar documents
store = FeatureStore(repo_path=".")
result = store.retrieve_online_documents_v2(
    features=[
        "product_embeddings:embedding",
        "product_embeddings:name",
        "product_embeddings:description",
    ],
    query=query_vec.tolist(),
    top_k=5,
    distance_metric="L2",
)

# Convert to DataFrame
df_result = result.to_df()
print(df_result[["product_embeddings:name", "product_embeddings:description"]])
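The query vector is compared against the stored embeddings, so it needs the same dimensionality as the indexed column (all-MiniLM-L6-v2 produces 384-dimensional vectors). A small guard before the retrieval call can catch a mismatched model early. The helper below is a hypothetical sketch and is not part of Feast. Its name and error messages are assumptions made for illustration.

```python
import math


def validate_query_vector(query, expected_dim):
    """Return the query as a list of floats, or raise ValueError if it is unusable."""
    values = [float(x) for x in query]
    if len(values) != expected_dim:
        raise ValueError(f"expected {expected_dim} dimensions, got {len(values)}")
    if not all(math.isfinite(v) for v in values):
        raise ValueError("query vector contains NaN or infinite values")
    return values


# all-MiniLM-L6-v2 produces 384-dimensional embeddings
EXPECTED_DIM = 384

# Stand-in for model.encode(...)[0]; any iterable of numbers works
checked = validate_query_vector([0.1] * EXPECTED_DIM, EXPECTED_DIM)
print(len(checked))  # 384
```

The returned list can be passed as the query argument in place of query_vec.tolist().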
Using the FeastVectorStore Wrapper
For cleaner application code, use the FeastVectorStore class from sdk/python/feast/vector_store.py. This wrapper initializes the feature store, manages the RAG view configuration, and exposes a simple query method.
from feast import FeatureView
from feast.vector_store import FeastVectorStore

# Initialize wrapper
vector_store = FeastVectorStore(
    repo_path=".",
    rag_view=product_embeddings,  # FeatureView defined earlier
    features=[
        "product_embeddings:embedding",
        "product_embeddings:name",
        "product_embeddings:description",
    ],
)

# Query, reusing query_vec from the low-level example above
result = vector_store.query(query_vector=query_vec, top_k=5)
df = result.to_df()
print(df[["product_embeddings:name", "product_embeddings:description"]])
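In a RAG-style application, the retrieved rows are usually turned into a block of text for a prompt. The sketch below shows one way to do that in plain Python. It assumes the rows have already been converted to a list of dicts with name and description keys (for example via the DataFrame's to_dict("records")). The function name, the key names, and the character budget are all hypothetical and not part of Feast.

```python
def build_context(rows, max_chars=500):
    """Format retrieved rows as a numbered list, stopping before max_chars is exceeded."""
    lines = []
    used = 0
    for i, row in enumerate(rows, start=1):
        line = f"{i}. {row['name']}: {row['description']}"
        if used + len(line) > max_chars:
            break
        lines.append(line)
        used += len(line) + 1  # +1 for the newline that joins the lines
    return "\n".join(lines)


# Hypothetical retrieved rows, nearest first
rows = [
    {"name": "Headphones A", "description": "Wireless over-ear headphones."},
    {"name": "Speaker B", "description": "Portable Bluetooth speaker."},
]

context = build_context(rows)
print(context)
```

Keeping rows in nearest-first order means the budget drops the least similar results first.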
Supported Vector Database Backends
Feast abstracts multiple vector databases behind a unified online store interface. Configure your backend in feature_store.yaml under the online_store section.
- PgVector: PostgreSQL with the pgvector extension. Set type: postgres, vector_enabled: true, and embedding_dim.
- Milvus: Dedicated vector database. Set type: milvus.
- Elasticsearch: Set type: elasticsearch with vector capabilities.
- Qdrant: Set type: qdrant.
- SQLite: Experimental support via the sqlite_vec extension.
Each backend implements the vector search protocol in sdk/python/feast/infra/online_stores/, such as postgres_online_store/postgres.py for PgVector or milvus_online_store/milvus.py for Milvus.
Summary
- Feast's vector search functionality enables embedding-based retrieval by marking FeatureView fields with vector_index=True and vector_search_metric.
- The workflow involves three steps: defining the vector-enabled feature view, writing embeddings via write_to_online_store, and querying with retrieve_online_documents_v2 or FeastVectorStore.
- Supported backends include PgVector, Milvus, Elasticsearch, Qdrant, and SQLite, each configured in feature_store.yaml.
- Distance metrics such as L2, COSINE, and INNER_PRODUCT are supported depending on the backend.
Frequently Asked Questions
What distance metrics are supported for vector search in Feast?
Feast supports L2 (Euclidean distance), COSINE (cosine similarity), and INNER_PRODUCT (dot product) through the vector_search_metric parameter in the Field definition. The availability of specific metrics depends on the underlying vector database backend configured in your feature_store.yaml.
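For reference, the three metrics can be computed by hand as in the sketch below. This is plain Python for illustration, not Feast or backend code, and the function names are hypothetical. Whether a backend reports a distance (smaller is closer) or a similarity (larger is closer) varies, so check the convention of the store you configure.

```python
import math


def inner_product(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2(a, b):
    """Euclidean distance: smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between a and b: 1.0 means same direction."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return inner_product(a, b) / (norm_a * norm_b)


a = [3.0, 4.0]
b = [6.0, 8.0]  # same direction as a, twice the length

print(l2(a, b))                 # 5.0
print(cosine_similarity(a, b))  # 1.0
print(inner_product(a, b))      # 50.0
```

The example shows why the choice matters: the two vectors point the same way, so cosine treats them as identical, while L2 and the inner product are both sensitive to their lengths.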
Can I use Feast vector search with multiple vector columns in the same feature view?
Yes, you can define multiple fields with vector_index=True within a single FeatureView. Each vector column can have its own vector_search_metric. When querying, specify which vector feature you want to search against in the features list passed to retrieve_online_documents_v2.
How do I choose between retrieve_online_documents_v2 and FeastVectorStore?
Use retrieve_online_documents_v2 when you need fine-grained control over the query parameters or are integrating vector search into existing Feast workflows. Use FeastVectorStore from sdk/python/feast/vector_store.py for a simplified RAG-style interface that abstracts feature store initialization and provides a cleaner API for application developers.
Which online stores support vector indexing in Feast?
Feast supports vector indexing in PostgreSQL with PgVector (type: postgres), Milvus (type: milvus), Elasticsearch (type: elasticsearch), Qdrant (type: qdrant), and SQLite with vector extensions (type: sqlite). Each backend implements the vector search protocol in its respective module under sdk/python/feast/infra/online_stores/.