BatchFeatureView vs FeatureView vs StreamFeatureView vs OnDemandFeatureView in Feast
BatchFeatureView processes historical batch data only, StreamFeatureView ingests real-time event streams with optional windowed aggregations, OnDemandFeatureView computes features at request time from upstream views and request data, and FeatureView is the base class that the batch and stream implementations extend.
The feast-dev/feast repository models feature data through four distinct view types, each optimized for specific data source topologies and access patterns. Understanding the architectural differences between these classes is critical for building feature pipelines that balance materialization latency, computational overhead, and storage costs.
The Base FeatureView Class
FeatureView is the base class that defines the core schema for materialized feature groups in Feast. Located in sdk/python/feast/feature_view.py at lines 65-78, it holds the feature schema, associated entities, time-to-live (TTL) settings, and source references. It is not abstract in the Python sense: it can be instantiated directly, and BatchFeatureView and StreamFeatureView subclass it. While FeatureView itself can reference either batch or stream sources, the subclasses enforce specific source types through their own validation logic.
BatchFeatureView: Batch-Only Processing
BatchFeatureView extends FeatureView to enforce batch-only data sources such as BigQuery, FileSource, Redshift, Snowflake, Spark, Trino, or Athena. In sdk/python/feast/batch_feature_view.py (lines 20-28), the class validates that the provided source appears in SUPPORTED_BATCH_SOURCES, raising an error for stream sources. The class definition at lines 31-47 adds a batch_engine configuration parameter that controls the processing backend for offline materialization.
BatchFeatureView supports both online and offline materialization through the online and offline boolean flags. When online=True, the view writes feature values to the online store during materialization, enabling low-latency serving for batch-computed features.
from datetime import timedelta

from feast import BatchFeatureView, Entity, FileSource, ValueType

# Entity whose join key links feature rows to drivers.
entity = Entity(name="driver_id", join_keys=["driver_id"], value_type=ValueType.STRING)

# Batch-only view; online=True also materializes values to the online store.
driver_stats = BatchFeatureView(
    name="driver_stats",
    entities=[entity],
    source=FileSource(path="gs://my-bucket/driver_stats.parquet"),
    ttl=timedelta(days=30),
    online=True,
)
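The source check described above can be sketched without Feast installed. This is a minimal illustration only: the stub classes, the SUPPORTED_BATCH_SOURCE_TYPES set, and both helper functions are hypothetical names standing in for the real source classes and for SUPPORTED_BATCH_SOURCES, and the actual validation in batch_feature_view.py may be structured differently.

```python
from dataclasses import dataclass


# Toy stand-ins for Feast source classes; the names are illustrative only.
@dataclass
class FileSourceStub:
    path: str


@dataclass
class KafkaSourceStub:
    topic: str


# Hypothetical allow-list playing the role of SUPPORTED_BATCH_SOURCES.
SUPPORTED_BATCH_SOURCE_TYPES = {"FileSourceStub"}


def validate_batch_source(source: object) -> None:
    """Raise ValueError unless the source's class name is on the allow-list."""
    type_name = type(source).__name__
    if type_name not in SUPPORTED_BATCH_SOURCE_TYPES:
        raise ValueError(
            f"unsupported batch source type {type_name!r}; "
            f"expected one of {sorted(SUPPORTED_BATCH_SOURCE_TYPES)}"
        )


def is_valid_batch_source(source: object) -> bool:
    """Report validity without raising, for callers that prefer a boolean."""
    try:
        validate_batch_source(source)
    except ValueError:
        return False
    return True
```

Checking at construction time means a misconfigured view fails when the feature repository is loaded rather than later, during a materialization job.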
StreamFeatureView: Real-Time Ingestion
StreamFeatureView handles continuous event ingestion from Kafka or PushSource while maintaining a mandatory batch source for historical backfills. According to sdk/python/feast/stream_feature_view.py (lines 33-41), the constructor validates sources against SUPPORTED_STREAM_SOURCES and verifies that the stream source contains a batch_source attribute. The class definition (lines 45-73) and docstring (lines 44-66) expose additional parameters including aggregations, timestamp_field, stream_engine, and tiling settings.
Unlike BatchFeatureView, StreamFeatureView supports windowed aggregations (e.g., tumbling or sliding window sums) computed during ingestion. When aggregations are defined, the timestamp_field parameter becomes required (lines 60-63). This enables real-time feature engineering on streaming data before writing to the online store.
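from datetime import timedelta

from feast import Aggregation, FileSource, KafkaSource, StreamFeatureView
from feast.data_format import JsonFormat

# Stream source plus the batch source used for historical backfills.
# Keyword names follow recent Feast releases and may differ in older versions;
# the source name and JSON schema below are illustrative.
kafka_src = KafkaSource(
    name="driver_events_stream",
    topic="driver_events",
    kafka_bootstrap_servers="kafka:9092",
    message_format=JsonFormat(
        schema_json="driver_id string, distance_miles double, event_timestamp timestamp"
    ),
    timestamp_field="event_timestamp",
    batch_source=FileSource(path="gs://my-bucket/driver_events_parquet"),
)

driver_events = StreamFeatureView(
    name="driver_events",
    source=kafka_src,
    aggregations=[
        # One-hour windowed sum over the distance_miles column.
        Aggregation(column="distance_miles", function="sum", time_window=timedelta(hours=1))
    ],
    timestamp_field="event_timestamp",
    mode="python",
    online=True,
)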
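To make the idea of a windowed aggregation concrete, here is a small standard-library sketch of a tumbling (fixed, non-overlapping) one-hour sum per entity. It is not how Feast or its stream engines compute aggregations; the function name, the Event tuple layout, and the sample data are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Dict, Iterable, Tuple

Event = Tuple[str, datetime, float]  # (entity key, event time, value)
EPOCH = datetime(1970, 1, 1)  # reference point for aligning window boundaries


def tumbling_window_sum(
    events: Iterable[Event], window: timedelta
) -> Dict[Tuple[str, datetime], float]:
    """Sum values per (entity key, window start) over fixed-size windows."""
    totals: Dict[Tuple[str, datetime], float] = defaultdict(float)
    for key, event_time, value in events:
        # Floor the event time to the start of the window that contains it.
        window_index = (event_time - EPOCH) // window
        window_start = EPOCH + window_index * window
        totals[(key, window_start)] += value
    return dict(totals)


# Hypothetical driver events: two fall in the 10:00 window for d1, one in 11:00.
events = [
    ("d1", datetime(2024, 1, 1, 10, 5), 1.0),
    ("d1", datetime(2024, 1, 1, 10, 55), 2.5),
    ("d1", datetime(2024, 1, 1, 11, 10), 4.0),
    ("d2", datetime(2024, 1, 1, 10, 30), 3.0),
]
hourly = tumbling_window_sum(events, timedelta(hours=1))
```

Each event is assigned to exactly one window by its own timestamp, which is why a timestamp field has to be known before any aggregation can be defined.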
OnDemandFeatureView: Request-Time Computation
OnDemandFeatureView operates independently from the FeatureView inheritance hierarchy (it inherits from BaseFeatureView) and computes features at read time rather than during materialization. As implemented in sdk/python/feast/on_demand_feature_view.py (lines 13-46), this view type does not hold a DataSource. Instead, it aggregates FeatureViewProjection objects and RequestSource instances through the _add_source_to_collections method (lines 60-78).
The transformation logic is built explicitly in get_feature_transformation() (lines 87-97), supporting three execution modes: pandas, python, or substrait. A distinctive capability is the write_to_online_store flag (lines 44-48), which runs the transformation when data is written to the online store and persists the result, so reads serve the precomputed value instead of recomputing it. This suits expensive computations that should run once per write rather than on every request.
import pandas as pd

from feast import Field, RequestSource
from feast.on_demand_feature_view import on_demand_feature_view
from feast.types import Int64, String

# `user_profile_view` (an upstream feature view) and `user_profile_df` (a lookup
# DataFrame) are assumed to be defined elsewhere in the feature repository.
request_src = RequestSource(
    name="user_profile_req",
    schema=[Field(name="user_id", dtype=String)],
)

def enrich_user(features: pd.DataFrame) -> pd.DataFrame:
    # Join the incoming rows against the lookup table and keep the declared outputs.
    merged = features.merge(user_profile_df, on="user_id", how="left")
    return merged[["age", "country"]]

user_enriched = on_demand_feature_view(
    name="user_enriched",
    sources=[request_src, user_profile_view],
    schema=[
        Field(name="age", dtype=Int64),
        Field(name="country", dtype=String),
    ],
    mode="pandas",
    write_to_online_store=True,
)(enrich_user)
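The trade-off behind write_to_online_store can be illustrated with a toy store that applies a transformation either on every read or once on write. This is a simplified model under the assumption that the flag moves the transformation from read time to write time; ToyOnlineStore, run, and the doubling transform are hypothetical and do not mirror Feast's internals.

```python
from typing import Callable, Dict, List, Tuple


class ToyOnlineStore:
    """Key-value store that transforms values on write or on read."""

    def __init__(self, transform: Callable[[float], float], write_to_online_store: bool):
        self._transform = transform
        self._transform_on_write = write_to_online_store
        self._rows: Dict[str, float] = {}
        self.transform_calls = 0  # counts how often the transformation runs

    def _apply(self, value: float) -> float:
        self.transform_calls += 1
        return self._transform(value)

    def write(self, key: str, raw_value: float) -> None:
        # With the flag set, store the transformed value; otherwise store raw input.
        self._rows[key] = self._apply(raw_value) if self._transform_on_write else raw_value

    def read(self, key: str) -> float:
        stored = self._rows[key]
        # With the flag set, the stored value is already transformed.
        return stored if self._transform_on_write else self._apply(stored)


def run(write_to_online_store: bool) -> Tuple[List[float], int]:
    """Write one row, read it three times, and report values and transform count."""
    store = ToyOnlineStore(lambda x: x * 2, write_to_online_store)
    store.write("d1", 10.0)
    values = [store.read("d1") for _ in range(3)]
    return values, store.transform_calls


precomputed = run(True)   # transformation runs once, at write time
per_read = run(False)     # transformation runs on each of the three reads
```

Both configurations return the same values; they differ only in how often the transformation executes, which matters when the transformation is expensive and reads outnumber writes.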
Key Architectural Differences
Source Validation Logic
- BatchFeatureView: Validates against SUPPORTED_BATCH_SOURCES (BigQuery, FileSource, etc.) in sdk/python/feast/batch_feature_view.py (lines 20-28).
- StreamFeatureView: Validates against SUPPORTED_STREAM_SOURCES (KafkaSource, PushSource) and requires a batch_source for backfills in sdk/python/feast/stream_feature_view.py (lines 33-41).
- OnDemandFeatureView: Holds no DataSource; instead aggregates multiple upstream views via FeatureViewProjection and request-time inputs via RequestSource (lines 60-78).
Transformation Handling
- BatchFeatureView and StreamFeatureView inherit generic feature_transformation logic from FeatureView, wrapping UDFs based on the chosen mode (PYTHON, PANDAS) as shown in batch_feature_view.py (lines 47-55) and stream_feature_view.py (lines 91-104).
- OnDemandFeatureView explicitly constructs transformations in get_feature_transformation() and validates mode compatibility with concrete Transformation subclasses (lines 87-97).
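Wrapping a UDF according to its mode can be pictured as an adapter that converts data into the shape the UDF expects and back again. The sketch below is standard-library only and purely illustrative: wrap_udf, the "columnar" mode (a stand-in for a DataFrame-style mode such as pandas), and the two sample UDFs are assumptions, not Feast's Transformation classes.

```python
from typing import Callable, Dict, List

Rows = List[Dict[str, float]]      # row-oriented data, one dict per row
Columns = Dict[str, List[float]]   # column-oriented stand-in for a DataFrame


def rows_to_columns(rows: Rows) -> Columns:
    """Pivot a list of row dicts into a dict of column lists."""
    columns: Columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def columns_to_rows(columns: Columns) -> Rows:
    """Pivot a dict of equal-length column lists back into row dicts."""
    names = list(columns)
    return [dict(zip(names, values)) for values in zip(*columns.values())]


def wrap_udf(udf: Callable, mode: str) -> Callable[[Rows], Rows]:
    """Return a callable over rows, adapting the data shape to the UDF's mode."""
    if mode == "python":
        return udf  # UDF already consumes and produces rows
    if mode == "columnar":
        return lambda rows: columns_to_rows(udf(rows_to_columns(rows)))
    raise ValueError(f"unsupported mode: {mode!r}")


# Two hypothetical UDFs that compute the same feature in different shapes.
def double_rows(rows: Rows) -> Rows:
    return [{"x2": row["x"] * 2} for row in rows]


def double_columns(columns: Columns) -> Columns:
    return {"x2": [value * 2 for value in columns["x"]]}


sample = [{"x": 1.0}, {"x": 2.5}]
from_python_mode = wrap_udf(double_rows, "python")(sample)
from_columnar_mode = wrap_udf(double_columns, "columnar")(sample)
```

Rejecting unknown modes up front mirrors the idea of validating mode compatibility before a transformation is ever executed.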
Materialization Behavior
- BatchFeatureView and StreamFeatureView: Materialized ahead of time through offline jobs or streaming ingestion. Both expose online and offline flags to control store participation.
- OnDemandFeatureView: Never materialized during batch jobs; executes transform_dict or transform_arrow methods at request time.
Aggregation Support
Only StreamFeatureView supports the aggregations parameter for windowed computations (stream_feature_view.py, lines 60-63). Batch and on-demand views consume pre-aggregated data from their upstream sources.
Summary
- FeatureView serves as the base class holding schema, entities, and TTL definitions in sdk/python/feast/feature_view.py.
- BatchFeatureView enforces batch-only sources (BigQuery, Snowflake, FileSource) and adds batch_engine configuration for offline materialization.
- StreamFeatureView requires both a stream source (Kafka/PushSource) and a batch source for backfills, supporting real-time aggregations and stream_engine settings.
- OnDemandFeatureView inherits from BaseFeatureView, computes features at request time from other views and request sources, and optionally writes results to the online store via write_to_online_store.
Frequently Asked Questions
Can OnDemandFeatureView consume features from BatchFeatureView or StreamFeatureView?
Yes. OnDemandFeatureView aggregates FeatureViewProjection objects that can reference any materialized view type, including BatchFeatureView and StreamFeatureView. According to sdk/python/feast/on_demand_feature_view.py (lines 60-78), the _add_source_to_collections method handles both view projections and RequestSource objects, enabling joins between batch-computed features and real-time request data at serving time.
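One way to picture this source handling is a small registry that sorts each source into either a projection collection or a request-source collection. The classes and method names below are hypothetical stubs written for illustration; they are not the signatures of FeatureViewProjection, RequestSource, or _add_source_to_collections.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class ProjectionStub:
    """Stand-in for a projection of an upstream (batch or stream) view."""
    name: str
    features: List[str]


@dataclass
class RequestSourceStub:
    """Stand-in for data supplied by the caller at request time."""
    name: str
    fields: List[str]


@dataclass
class OnDemandViewStub:
    projections: Dict[str, ProjectionStub] = field(default_factory=dict)
    request_sources: Dict[str, RequestSourceStub] = field(default_factory=dict)

    def add_source(self, source: Union[ProjectionStub, RequestSourceStub]) -> None:
        """File the source under the collection that matches its type."""
        if isinstance(source, RequestSourceStub):
            self.request_sources[source.name] = source
        elif isinstance(source, ProjectionStub):
            self.projections[source.name] = source
        else:
            raise TypeError(f"unsupported source type: {type(source).__name__}")

    def input_columns(self) -> List[str]:
        """List every column the transformation would receive as input."""
        columns: List[str] = []
        for projection in self.projections.values():
            columns.extend(projection.features)
        for request_source in self.request_sources.values():
            columns.extend(request_source.fields)
        return columns


# Hypothetical usage: one upstream view plus one request-time input.
view = OnDemandViewStub()
view.add_source(ProjectionStub("driver_stats", ["conv_rate"]))
view.add_source(RequestSourceStub("trip_request", ["trip_distance"]))
inputs = view.input_columns()
```

Keeping the two collections separate lets the serving path fetch projected features from the online store while taking request fields directly from the caller.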
Does StreamFeatureView always require a batch source?
Yes. The StreamFeatureView constructor explicitly validates that the stream source contains a batch_source attribute in sdk/python/feast/stream_feature_view.py (lines 33-41). This requirement ensures historical backfills can populate the offline store with stream data that arrived before the streaming pipeline was active, maintaining consistency between training and serving environments.
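The reason a backfill source matters can be shown with a toy merge of historical rows and live events into one latest-value view. This is a sketch under simplified assumptions (in-memory tuples, last-write-wins by timestamp); latest_values and the sample rows are hypothetical and not part of Feast.

```python
from datetime import datetime
from typing import Dict, Iterable, Tuple

Row = Tuple[str, datetime, float]  # (entity key, event time, feature value)


def latest_values(batch_rows: Iterable[Row], stream_events: Iterable[Row]) -> Dict[str, float]:
    """Backfill from batch rows, then let newer stream events overwrite them."""
    latest: Dict[str, Tuple[datetime, float]] = {}
    for key, event_time, value in list(batch_rows) + list(stream_events):
        current = latest.get(key)
        # Keep the row with the most recent event time for each entity.
        if current is None or event_time >= current[0]:
            latest[key] = (event_time, value)
    return {key: value for key, (_, value) in latest.items()}


# Historical rows that predate the streaming pipeline.
batch_rows = [
    ("d1", datetime(2024, 1, 1), 10.0),
    ("d2", datetime(2024, 1, 1), 20.0),
]
# Live events that arrived after the pipeline started; d2 has none yet.
stream_events = [("d1", datetime(2024, 1, 2), 11.0)]

online_view = latest_values(batch_rows, stream_events)
```

Without the batch rows, d2 would have no value at all until its first streamed event, which is the gap a backfill source closes.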
Which view type supports windowed aggregations like "sum over last hour"?
Only StreamFeatureView supports the aggregations parameter for windowed feature engineering. As defined in sdk/python/feast/stream_feature_view.py (lines 60-63), you can specify aggregation functions (sum, count, etc.) over time windows during stream ingestion. BatchFeatureView and OnDemandFeatureView rely on upstream data sources to provide already-aggregated values or perform aggregations in their transformation UDFs.
Can OnDemandFeatureView write computed features to the online store?
Yes. Unlike other view types that materialize through scheduled jobs, OnDemandFeatureView includes a write_to_online_store flag in sdk/python/feast/on_demand_feature_view.py (lines 44-48). When enabled, the transformation runs when data is written and its results are persisted to the online store, so subsequent reads serve the stored values instead of recomputing them.