How to Implement On-Demand Transformations in Feast: A Complete Guide
On-demand transformations in Feast let you compute features at request time using On-Demand Feature Views (ODFVs), which support pandas, Python native, or Substrait execution modes to transform source features dynamically.
On-demand transformations are a core capability of the feast-dev/feast repository that enable real-time feature engineering without pre-computing values in the offline store. By implementing On-Demand Feature Views (ODFVs), you can apply custom business logic to raw features at serving time, ensuring your models receive the most up-to-date calculations.
What Are On-Demand Feature Views?
An On-Demand Feature View (ODFV) is a special type of feature view in Feast that derives new features from existing sources at request time rather than materializing them beforehand. According to the source code in sdk/python/feast/on_demand_feature_view.py, the OnDemandFeatureView class inherits from BaseFeatureView and stores sources in two internal dictionaries: source_feature_view_projections and source_request_sources.
ODFVs are particularly useful when:
- Feature values depend on real-time request context (e.g., user location, current time)
- Computations are inexpensive and don't warrant pre-materialization
- Business logic changes frequently and must be decoupled from batch pipelines
Transformation Execution Modes
Feast supports three distinct execution modes for on-demand transformations, each implemented as a separate transformation class:
Pandas Mode
PandasTransformation (defined in sdk/python/feast/transformation/pandas_transformation.py) receives a pandas.DataFrame containing all source columns and returns a transformed DataFrame. This mode is ideal for batch-style transformations using vectorized pandas operations.
Python Native Mode
PythonTransformation (defined in sdk/python/feast/transformation/python_transformation.py) receives a Dict[str, Any] of source values and returns a transformed dictionary. With the singleton=True flag, the dictionary represents a single row, which suits lightweight per-row logic such as lookups or simple arithmetic.
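To make the input shape concrete, here is a minimal plain-Python sketch with no Feast import. It assumes that a python-mode UDF sees a dictionary of column lists when it handles a batch and a dictionary of scalars when singleton=True is set. The function names, the apply_per_row helper, and the sample values are hypothetical and only illustrate the two shapes.

```python
from typing import Any, Callable, Dict, List


def double_batch(inputs: Dict[str, List[Any]]) -> Dict[str, List[Any]]:
    # Batch shape (assumed): each key maps to a list with one value per row.
    return {"output1": [x * 2 for x in inputs["feature1"]]}


def double_singleton(row: Dict[str, Any]) -> Dict[str, Any]:
    # Singleton shape (assumed): each key maps to one scalar for a single row.
    return {"output1": row["feature1"] * 2}


def apply_per_row(
    udf: Callable[[Dict[str, Any]], Dict[str, Any]],
    inputs: Dict[str, List[Any]],
) -> Dict[str, List[Any]]:
    # Hypothetical helper: call a singleton-style UDF once per row of a
    # non-empty batch and stitch the per-row results back into column lists.
    n_rows = len(next(iter(inputs.values())))
    rows = [{key: values[i] for key, values in inputs.items()} for i in range(n_rows)]
    results = [udf(row) for row in rows]
    return {key: [result[key] for result in results] for key in results[0]}


batch = {"feature1": [1.0, 2.5, 4.0]}
batch_result = double_batch(batch)
row_result = apply_per_row(double_singleton, batch)
```

Both paths produce the same columns. The per-row path makes one Python call per row, which is the overhead to keep in mind when choosing singleton processing.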
Substrait Mode
SubstraitTransformation (defined in sdk/python/feast/transformation/substrait_transformation.py) uses Ibis expressions compiled to Substrait plans for high-performance push-down execution. This mode is optimal when your data source can execute transformations natively (e.g., Spark, BigQuery).
Core Architecture and Source Files
The implementation spans several key files in the Feast repository:
| File | Purpose |
|---|---|
| sdk/python/feast/on_demand_feature_view.py | Core OnDemandFeatureView class, validation logic (ensure_valid), transformation routing (transform_dict, transform_arrow), and the @on_demand_feature_view decorator |
| sdk/python/feast/transformation/pandas_transformation.py | PandasTransformation class implementing vectorized DataFrame operations |
| sdk/python/feast/transformation/python_transformation.py | PythonTransformation class for dict-based per-row processing |
| sdk/python/feast/transformation/substrait_transformation.py | SubstraitTransformation class for Ibis/Substrait plan execution |
The OnDemandFeatureView class validates configurations through methods like _validate_sources_config, _validate_transformation_config, _validate_singleton_config, and _validate_online_store_config to ensure runtime compatibility.
Step-by-Step Implementation
Define Input Sources
ODFVs build upon existing feature views or request data sources. Define your base features first:
from feast import FeatureView, Field, FileSource
from feast.types import Float32

source_fv = FeatureView(
    name="raw_features",
    entities=[],
    # Field takes keyword arguments for name and dtype.
    schema=[
        Field(name="feature1", dtype=Float32),
        Field(name="feature2", dtype=Float32),
    ],
    source=FileSource(name="src", path="data.parquet"),
)
Write the Transformation Function
Choose your execution mode and implement the corresponding function signature:
For pandas mode:
import pandas as pd

def my_udf(df: pd.DataFrame) -> pd.DataFrame:
    out = pd.DataFrame()
    out["output1"] = df["feature1"] * 2
    out["output2"] = df["feature2"] + 1
    return out
For python native mode:
from typing import Any, Dict

def my_udf(row: Dict[str, Any]) -> Dict[str, Any]:
    # Scalar arithmetic like this assumes singleton=True (one row per call).
    return {
        "output1": row["feature1"] * 2,
        "output2": row["feature2"] + 1,
    }
Create the On-Demand Feature View
Option A: Using the decorator (recommended)
import pandas as pd
from feast import Field
from feast.on_demand_feature_view import on_demand_feature_view
from feast.types import Float32

@on_demand_feature_view(
    name="my_odfv",
    sources=[source_fv],
    schema=[
        Field(name="output1", dtype=Float32),
        Field(name="output2", dtype=Float32),
    ],
    mode="pandas",
)
def double_udf(features_df: pd.DataFrame) -> pd.DataFrame:
    out = pd.DataFrame()
    out["output1"] = features_df["feature1"] * 2
    out["output2"] = features_df["feature2"] + 1
    return out
Option B: Direct construction
import inspect

from feast import Field, OnDemandFeatureView
from feast.transformation.pandas_transformation import PandasTransformation
from feast.types import Float32

# Reuses source_fv and the pandas-mode my_udf defined in the earlier steps.
odfv = OnDemandFeatureView(
    name="my_odfv",
    sources=[source_fv],
    schema=[
        Field(name="output1", dtype=Float32),
        Field(name="output2", dtype=Float32),
    ],
    feature_transformation=PandasTransformation(
        udf=my_udf,
        udf_string=inspect.getsource(my_udf),
    ),
    mode="pandas",
)
Configure Optional Flags
Enhance your ODFV with additional configuration options:
- write_to_online_store=True – Persists computed features to the online store for faster subsequent retrievals. Requires at least one entity defined in the sources.
- singleton=True – Only valid with mode="python"; processes one row at a time rather than batches.
- description, tags, owner – Metadata for governance and documentation.
Example with flags:
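import inspect

from feast import OnDemandFeatureView
from feast.transformation.python_transformation import PythonTransformation

# enrich_udf is a python-mode UDF; the full example appears in the next section.
odfv = OnDemandFeatureView(
    name="enriched_driving",
    sources=[source_fv],
    schema=[...],
    feature_transformation=PythonTransformation(
        udf=enrich_udf,
        udf_string=inspect.getsource(enrich_udf),
    ),
    mode="python",
    singleton=True,
    write_to_online_store=True,
)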
Code Examples
Pandas Mode Example
The most common implementation uses vectorized pandas operations for batch transformations:
import pandas as pd
from feast import FeatureView, Field, FileSource
from feast.on_demand_feature_view import on_demand_feature_view
from feast.types import Float32

source_fv = FeatureView(
    name="raw_features",
    entities=[],
    schema=[
        Field(name="feature1", dtype=Float32),
        Field(name="feature2", dtype=Float32),
    ],
    source=FileSource(name="src", path="data.parquet"),
)

@on_demand_feature_view(
    name="double_features",
    sources=[source_fv],
    schema=[
        Field(name="output1", dtype=Float32),
        Field(name="output2", dtype=Float32),
    ],
    mode="pandas",
)
def double_udf(df: pd.DataFrame) -> pd.DataFrame:
    """Multiply `feature1` by 2 and add 1 to `feature2`."""
    out = pd.DataFrame()
    out["output1"] = df["feature1"] * 2
    out["output2"] = df["feature2"] + 1
    return out
The decorator implementation handles dill-based serialization so the function can be shipped to the Feature Server, as seen in sdk/python/feast/on_demand_feature_view.py.
Python Native (Singleton) Mode Example
For lightweight per-row logic, use the Python native mode with optional singleton processing:
import datetime
import inspect
from typing import Any, Dict

from feast import Entity, FeatureView, Field, FileSource, OnDemandFeatureView, ValueType
from feast.transformation.python_transformation import PythonTransformation
from feast.types import Float32, UnixTimestamp

source_fv = FeatureView(
    name="raw_numeric",
    entities=[
        Entity(name="driver_id", join_keys=["driver_id"], value_type=ValueType.INT64)
    ],
    schema=[
        Field(name="speed", dtype=Float32),
        Field(name="distance", dtype=Float32),
    ],
    source=FileSource(name="src", path="driving.parquet"),
)

def enrich_udf(row: Dict[str, Any]) -> Dict[str, Any]:
    # singleton mode: `row` is a dict for a single entity
    return {
        "speed_mph": row["speed"] * 2.237,  # m/s to mph
        "distance_km": row["distance"] / 1000,  # m to km
        "ingested_at": datetime.datetime.utcnow(),
    }

odfv = OnDemandFeatureView(
    name="enriched_driving",
    sources=[source_fv],
    schema=[
        Field(name="speed_mph", dtype=Float32),
        Field(name="distance_km", dtype=Float32),
        Field(name="ingested_at", dtype=UnixTimestamp),
    ],
    feature_transformation=PythonTransformation(
        udf=enrich_udf,
        udf_string=inspect.getsource(enrich_udf),
    ),
    mode="python",
    singleton=True,
    write_to_online_store=True,  # persists the enriched rows
)
The singleton validation logic resides in OnDemandFeatureView._validate_singleton_config in sdk/python/feast/on_demand_feature_view.py, ensuring this flag is only used with Python native mode.
Substrait (Ibis) Mode Example
For high-performance push-down to data sources like Spark or BigQuery, use the Substrait mode with Ibis expressions:
from feast import FeatureView, Field, FileSource, OnDemandFeatureView
from feast.transformation.substrait_transformation import SubstraitTransformation
from feast.types import Float32

source_fv = FeatureView(
    name="raw_sales",
    entities=[],
    schema=[
        Field(name="price", dtype=Float32),
        Field(name="quantity", dtype=Float32),
    ],
    source=FileSource(name="src", path="sales.parquet"),
)

# Ibis expression builder: compute revenue = price * quantity.
# Assumption: from_ibis calls this function with an Ibis table built from the
# sources' schemas, so the table is a parameter rather than constructed here.
def revenue_expr(t):
    return t.mutate(revenue=t.price * t.quantity).select("revenue")

odfv = OnDemandFeatureView(
    name="revenue_odfv",
    sources=[source_fv],
    schema=[Field(name="revenue", dtype=Float32)],
    feature_transformation=SubstraitTransformation.from_ibis(
        revenue_expr,
        sources=[source_fv],
    ),
    mode="substrait",
)
The SubstraitTransformation.from_ibis method is implemented in sdk/python/feast/transformation/substrait_transformation.py, while the ODFV constructor routes the mode to the appropriate transformation via OnDemandFeatureView.get_feature_transformation.
Validation and Runtime Behavior
When you load a feature repository containing ODFVs, Feast validates configurations through the ensure_valid() method in sdk/python/feast/on_demand_feature_view.py. This method invokes several validation helpers:
- _validate_sources_config: Ensures at least one source is defined
- _validate_transformation_config: Verifies the transformation mode matches the provided transformation type (pandas, python, or substrait)
- _validate_singleton_config: Confirms singleton=True is only used with mode="python"
- _validate_online_store_config: Checks that write_to_online_store=True requires at least one entity to be defined
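The four rules above can be sketched in plain Python. This is not Feast's implementation: OdfvConfig, its fields, and the error messages are hypothetical stand-ins chosen to mirror the checks as this article describes them.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class OdfvConfig:
    # Hypothetical stand-in for the settings the validation helpers inspect.
    mode: str
    transformation_kind: str
    sources: List[str] = field(default_factory=list)
    entities: List[str] = field(default_factory=list)
    singleton: bool = False
    write_to_online_store: bool = False


def validate(cfg: OdfvConfig) -> List[str]:
    """Return a list of rule violations; an empty list means the config passes."""
    errors = []
    if not cfg.sources:
        errors.append("at least one source is required")
    if cfg.mode not in ("pandas", "python", "substrait"):
        errors.append(f"unknown mode {cfg.mode!r}")
    elif cfg.transformation_kind != cfg.mode:
        errors.append("transformation type does not match mode")
    if cfg.singleton and cfg.mode != "python":
        errors.append("singleton=True requires mode='python'")
    if cfg.write_to_online_store and not cfg.entities:
        errors.append("write_to_online_store=True requires an entity")
    return errors


ok = OdfvConfig(
    mode="python",
    transformation_kind="python",
    sources=["raw_numeric"],
    entities=["driver_id"],
    singleton=True,
    write_to_online_store=True,
)
bad = OdfvConfig(
    mode="pandas",
    transformation_kind="pandas",
    sources=["raw_features"],
    singleton=True,
    write_to_online_store=True,
)
```

Here validate(ok) returns an empty list, while validate(bad) reports both the singleton and the online-store violations. Collecting every violation instead of stopping at the first is a choice made for this sketch only.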
At runtime, the Feature Server invokes transform_dict, transform_arrow, or transform_ibis methods on the OnDemandFeatureView instance, which route inputs to the underlying transformation object's transform method. For pandas mode, this converts Arrow tables to DataFrames; for python mode, this processes dictionaries; for substrait mode, this executes the Ibis plan against the source.
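From the caller's side, on-demand features are requested like any other feature. The sketch below wraps that call in two small helpers. The helper names are hypothetical, and `store` is assumed to behave like feast.FeatureStore, whose get_online_features method takes a list of "view:feature" references plus entity rows and returns a response with a to_dict() method.

```python
from typing import Any, Dict, List


def feature_refs(view_name: str, feature_names: List[str]) -> List[str]:
    # Feature references take the form "<view name>:<feature name>".
    return [f"{view_name}:{name}" for name in feature_names]


def fetch_on_demand(
    store: Any,
    view_name: str,
    feature_names: List[str],
    entity_rows: List[Dict[str, Any]],
) -> Dict[str, List[Any]]:
    # `store` is expected to behave like feast.FeatureStore (assumption).
    response = store.get_online_features(
        features=feature_refs(view_name, feature_names),
        entity_rows=entity_rows,
    )
    return response.to_dict()


# Example call against a real repository (not executed here; the repo path
# and the driver_id value are hypothetical):
#
#   from feast import FeatureStore
#   store = FeatureStore(repo_path=".")
#   fetch_on_demand(
#       store,
#       "enriched_driving",
#       ["speed_mph", "distance_km"],
#       [{"driver_id": 1001}],
#   )
```

Passing the store in as an argument keeps the helper easy to exercise with a stub object in unit tests, without a running online store.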
Summary
- On-Demand Feature Views (ODFVs) compute features at request time rather than storing them offline, enabling real-time feature engineering in the feast-dev/feast repository.
- Three execution modes are available: pandas (vectorized DataFrame operations), python (per-row dictionary processing with optional singleton mode), and substrait (Ibis-based push-down to data sources).
- Core implementation resides in sdk/python/feast/on_demand_feature_view.py, with transformation logic split across pandas_transformation.py, python_transformation.py, and substrait_transformation.py.
- Validation occurs via ensure_valid() and specific _validate_* methods that check source presence, mode compatibility, singleton constraints, and online store write requirements.
- Usage involves defining sources, writing UDFs with appropriate signatures, creating ODFVs via the @on_demand_feature_view decorator or direct construction, and optionally enabling write_to_online_store for persistence.
Frequently Asked Questions
What is the difference between pandas and python mode in Feast on-demand transformations?
Pandas mode uses PandasTransformation to process batches of data as pandas.DataFrame objects, enabling vectorized operations across multiple rows simultaneously. Python mode uses PythonTransformation to process data as dictionaries (Dict[str, Any]): a single row per call when singleton=True, or a batch otherwise. According to the source code in sdk/python/feast/on_demand_feature_view.py, pandas mode is optimal for complex mathematical transformations across entities, while python mode suits lightweight per-row logic like unit conversions or simple lookups.
When should I use singleton mode in on-demand feature views?
Use singleton=True when you need per-row processing with Python native mode and want to ensure the UDF receives exactly one dictionary at a time rather than a batch. This is validated in OnDemandFeatureView._validate_singleton_config in sdk/python/feast/on_demand_feature_view.py, which enforces that singleton mode is only compatible with mode="python". Singleton mode is ideal for transformations requiring external API calls per entity or complex logic that doesn't vectorize well, though it may have higher overhead than batch processing.
Can on-demand feature views write results back to the online store?
Yes, by setting write_to_online_store=True when constructing an OnDemandFeatureView. This persists the computed features to the online store for faster subsequent retrievals. However, the validation logic in sdk/python/feast/on_demand_feature_view.py requires that at least one entity is defined in the sources when enabling this flag, as verified by _validate_online_store_config. This capability is particularly useful for expensive computations that should be cached after the first on-demand calculation.
How does Feast validate on-demand transformation configurations?
Feast validates ODFV configurations through the ensure_valid() method called during repository loading, which invokes four specific validation helpers in sdk/python/feast/on_demand_feature_view.py:
- _validate_sources_config – Ensures at least one source feature view or request source is provided
- _validate_transformation_config – Verifies the transformation object type matches the declared mode (pandas, python, or substrait)
- _validate_singleton_config – Confirms singleton mode is only used with python mode
- _validate_online_store_config – Validates that writing to the online store requires entity definitions
Additionally, infer_features() can automatically populate the ODFV schema by running the UDF on synthetic data samples if the schema is omitted during definition.
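The idea behind schema inference can be illustrated with a short sketch: run the UDF on one sample row and map the Python type of each output value to a type name. This is a simplified illustration, not how infer_features() is implemented. The infer_schema helper, the type mapping, and the sample row are all hypothetical.

```python
import datetime
from typing import Any, Callable, Dict

# Hypothetical mapping from Python value types to Feast type names.
_TYPE_NAMES = {
    bool: "Bool",
    int: "Int64",
    float: "Float64",
    str: "String",
    datetime.datetime: "UnixTimestamp",
}


def infer_schema(
    udf: Callable[[Dict[str, Any]], Dict[str, Any]],
    sample_row: Dict[str, Any],
) -> Dict[str, str]:
    # Run the UDF once on the sample row and record a type name per output.
    output = udf(sample_row)
    schema = {}
    for name, value in output.items():
        type_name = _TYPE_NAMES.get(type(value))
        if type_name is None:
            raise TypeError(f"cannot infer a type for output {name!r}")
        schema[name] = type_name
    return schema


def enrich(row: Dict[str, Any]) -> Dict[str, Any]:
    return {"speed_mph": row["speed"] * 2.237, "is_fast": row["speed"] > 30.0}


inferred = infer_schema(enrich, {"speed": 1.0})
```

Because the inferred types depend on what the UDF returns for the sample input, declaring the schema explicitly remains the more predictable option.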