What’s the Difference Between Incremental and Full Materialization in Feast?

Full materialization copies a user-specified time window of historical features to the online store, while incremental materialization automatically detects and copies only new data since the last run, making it the preferred choice for scheduled production pipelines.

Materialization is the batch process that moves feature data from Feast’s offline store to the low-latency online store used for real-time inference. In the feast-dev/feast repository, you can choose between two strategies: full materialization for backfilling and rebuilding, or incremental materialization for efficient, routine updates. Knowing when to use each one helps you control compute costs and data freshness in your feature pipelines.

What Is Materialization in Feast?

Feast stores feature values in two distinct layers:

  • Offline store – the source of truth for historic data (e.g., Snowflake, BigQuery, Redshift).
  • Online store – a low-latency key-value store (e.g., Redis, DynamoDB, Datastore) used at inference time.

Materialization is the batch job that copies data from the offline store into the online store. It is implemented in sdk/python/feast/feature_store.py and invoked via the Python API or the Feast CLI.

Full Materialization Explained

How Full Materialization Works

In FeatureStore.materialize (lines 33‑84 of sdk/python/feast/feature_store.py), you explicitly pass a start_date and end_date. Feast reads all rows that fall inside that window from the offline store, regardless of whether they have been copied before, and writes them to the online store.

After the job finishes, registry.apply_materialization records the exact interval that was processed (start_date → end_date). This is useful for auditability but does not affect the next full materialization run.
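
The windowed copy described above can be pictured with a small stand-alone sketch. It is a toy model, not Feast's implementation: the offline store is a list of dicts, the online store is a plain dict, and the function name `materialize_window`, the `driver_id` key, and the inclusive window bounds are all assumptions made for illustration.

```python
from datetime import datetime, timezone


def materialize_window(offline_rows, online_store, start, end):
    """Toy full materialization: copy the latest row per entity in [start, end]."""
    latest = {}
    for row in offline_rows:
        ts = row["event_timestamp"]
        if start <= ts <= end:  # inclusive bounds are an assumption of this sketch
            key = row["driver_id"]
            if key not in latest or ts > latest[key]["event_timestamp"]:
                latest[key] = row
    online_store.update(latest)  # overwrite whatever was stored before
    return len(latest)


def utc(day, hour=0):
    """Helper for readable UTC timestamps in January 2024."""
    return datetime(2024, 1, day, hour, tzinfo=timezone.utc)


offline_rows = [
    {"driver_id": 1, "event_timestamp": utc(1), "trips": 3},
    {"driver_id": 1, "event_timestamp": utc(2), "trips": 5},
    {"driver_id": 2, "event_timestamp": utc(2), "trips": 7},
    {"driver_id": 2, "event_timestamp": utc(9), "trips": 9},  # outside the window
]

online_store = {}
written = materialize_window(offline_rows, online_store, utc(1), utc(3))
# A second run over the same window re-reads and rewrites the same rows.
written_again = materialize_window(offline_rows, online_store, utc(1), utc(3))
```

The second call does the same amount of work as the first, which is the property that makes full materialization predictable but comparatively expensive.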

When to Use Full Materialization

  • Re-building the whole online store after a schema change, store migration, or infrastructure failure.
  • Back-filling a long historic window when you need to populate the online store with months or years of data.
  • One-off jobs where you know the exact interval you need and want complete control over the time window.

Full Materialization Code Examples

Python API:

from datetime import datetime, timedelta, timezone

from feast import FeatureStore

fs = FeatureStore(repo_path="project/feature_repo")

# Capture "now" once so both bounds share the same reference point
now = datetime.now(timezone.utc)

# Materialize everything from 3 hours ago up to 10 minutes ago
fs.materialize(
    start_date=now - timedelta(hours=3),
    end_date=now - timedelta(minutes=10),
)

Source: FeatureStore.materialize implementation in sdk/python/feast/feature_store.py.

CLI:


# Materialize a concrete window for all feature views
# (the start and end timestamps are positional arguments)

feast materialize 2024-01-01T00:00:00 2024-01-31T23:59:59

Source: CLI implementation in sdk/python/feast/cli/cli.py and documented in docs/reference/feast-cli-commands.md.

Incremental Materialization Explained

How Incremental Materialization Works

In FeatureStore.materialize_incremental (lines 1596‑1732 of sdk/python/feast/feature_store.py), you provide only an end_date. Feast internally computes the start_date by looking up feature_view.most_recent_end_time stored in the registry.

If the view has no previous materialization, the start is derived from the view’s TTL (time-to-live) by looking back one TTL from the current time, or 52 weeks if the TTL is 0. Only rows newer than that computed start point are read from the offline store and written to the online store.
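
The start-date fallback described above can be written as a small pure function. This is a minimal sketch, not the code in feature_store.py: the name `incremental_start` is hypothetical, and the look-back anchor `now` is passed in explicitly so the logic is easy to test.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def incremental_start(
    most_recent_end_time: Optional[datetime],
    ttl: timedelta,
    now: datetime,
) -> datetime:
    """Pick the start of the next incremental window (illustrative only)."""
    if most_recent_end_time is not None:
        # A previous run exists: continue from where it stopped.
        return most_recent_end_time
    if ttl.total_seconds() > 0:
        # First run with a TTL: look back one TTL.
        return now - ttl
    # First run with TTL == 0: fall back to a 52-week look-back.
    return now - timedelta(weeks=52)


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
previous_end = datetime(2024, 5, 31, 12, tzinfo=timezone.utc)

resumed = incremental_start(previous_end, timedelta(days=7), now)
first_run_with_ttl = incremental_start(None, timedelta(days=7), now)
first_run_no_ttl = incremental_start(None, timedelta(0), now)
```

Keeping the decision in one function makes it clear that a recorded end time always wins over the TTL-based fallback.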

After completion, registry.apply_materialization records the incremental interval, which becomes the new "most recent end time" for the next run.

When to Use Incremental Materialization

  • Routine "keep-the-online-store-up-to-date" jobs scheduled every hour, day, or week.
  • Minimizing compute and I/O costs by ingesting only the new data that arrived since the last run.
  • Production pipelines where you want automation without manually calculating time windows.

Incremental Materialization Code Examples

Python API:

from datetime import datetime, timezone

from feast import FeatureStore

fs = FeatureStore(repo_path="project/feature_repo")

# Load any new data that arrived up to "now" (timezone-aware UTC)
fs.materialize_incremental(end_date=datetime.now(timezone.utc))

Source: FeatureStore.materialize_incremental implementation in sdk/python/feast/feature_store.py.

CLI:


# Incrementally materialize new data up to the current UTC time for all feature views
# (the end timestamp is a positional argument)

feast materialize-incremental "$(date -u +"%Y-%m-%dT%H:%M:%S")"

# Optional: limit to selected views (repeat --views once per view)

feast materialize-incremental --views user_features --views transaction_features "$(date -u +"%Y-%m-%dT%H:%M:%S")"

Source: CLI wrappers in sdk/python/feast/cli/cli.py and documented in docs/reference/feast-cli-commands.md.

Key Differences Between Incremental and Full Materialization

| Aspect | Full Materialization | Incremental Materialization |
| --- | --- | --- |
| Time window | User-specified start_date → end_date. Reads all rows in the interval regardless of prior runs. | Automatically computed start (last recorded end time or TTL-based fallback) → user-provided end_date. Reads only new rows. |
| Registry usage | Records the processed interval via registry.apply_materialization but does not use history to calculate the next window. | Consults feature_view.most_recent_end_time from the registry to determine the start date for the next run. |
| Compute cost | Higher – proportional to the size of the specified window. | Lower – proportional only to the volume of new data since the last run. |
| Use case | Back-filling, rebuilding online store, schema migrations, one-off historical loads. | Scheduled production jobs, keeping online store fresh, minimizing I/O costs. |
| CLI command | feast materialize <START_TS> <END_TS> | feast materialize-incremental <END_TS> |
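
The compute-cost comparison can be made concrete with back-of-the-envelope arithmetic. The numbers below are invented for illustration, and the model assumes rows arrive at a uniform rate and that cost scales with rows read; neither function is part of Feast.

```python
def rows_read_full(rows_per_hour: int, window_hours: int, runs: int) -> int:
    """Each full run re-reads the entire window."""
    return rows_per_hour * window_hours * runs


def rows_read_incremental(rows_per_hour: int, hours_between_runs: int, runs: int) -> int:
    """Each incremental run reads only what arrived since the previous run."""
    return rows_per_hour * hours_between_runs * runs


# Hypothetical workload: 1,000 rows per hour, one job per hour for a day.
full_total = rows_read_full(rows_per_hour=1_000, window_hours=24 * 30, runs=24)
incremental_total = rows_read_incremental(rows_per_hour=1_000, hours_between_runs=1, runs=24)
ratio = full_total // incremental_total
```

Under these assumptions, re-running a 30-day full window every hour reads 720 times as many rows as the hourly incremental job, because the ratio equals the window length divided by the scheduling interval.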

Both methods share the same underlying provider implementation (materialize_single_feature_view) and emit OpenLineage events (START, COMPLETE, FAIL) for observability, as implemented in sdk/python/feast/feature_store.py.

Summary

  • Full materialization requires you to specify both start and end dates, copying every row in that window to the online store regardless of previous runs. Use it for back-filling, rebuilding after schema changes, or when you need precise control over the time range.
  • Incremental materialization requires only an end date; Feast automatically calculates the start based on the last recorded materialization time (or the view’s TTL), copying only new data. Use it for scheduled production jobs to minimize compute costs and keep the online store continuously updated.
  • Both strategies update the registry via apply_materialization and share the same core provider logic in sdk/python/feast/feature_store.py, differing only in how the start date is determined.

Frequently Asked Questions

Can I switch between incremental and full materialization for the same feature view?

Yes. The registry tracks materialization history independently of the method used. You can run a full materialization to back-fill historical data, then switch to incremental materialization for routine updates. The incremental job will use the most recent end time recorded by whichever method last ran, as stored in feature_view.most_recent_end_time.

How does Feast track the last materialization timestamp?

Feast stores the "most recent end time" in the feature registry. When an incremental job runs, it queries feature_view.most_recent_end_time to determine the start of the next window. After any materialization job completes, registry.apply_materialization persists the processed interval, updating this timestamp for future incremental runs.
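
The bookkeeping described above can be modeled with an in-memory stand-in. `ToyRegistry` is hypothetical and only mirrors the two names used in this article; storing a list of intervals per view and taking the latest end time is an assumption of the sketch, not a description of Feast's registry internals.

```python
from datetime import datetime, timezone
from typing import Dict, List, Optional, Tuple

Interval = Tuple[datetime, datetime]


class ToyRegistry:
    """In-memory stand-in that remembers materialized intervals per view."""

    def __init__(self) -> None:
        self._intervals: Dict[str, List[Interval]] = {}

    def apply_materialization(self, view: str, start: datetime, end: datetime) -> None:
        # Record the interval that was just processed.
        self._intervals.setdefault(view, []).append((start, end))

    def most_recent_end_time(self, view: str) -> Optional[datetime]:
        # The latest recorded end time, or None if the view was never materialized.
        intervals = self._intervals.get(view, [])
        return max(end for _, end in intervals) if intervals else None


registry = ToyRegistry()
never_run = registry.most_recent_end_time("user_features")

# A full backfill records its window...
jan_1 = datetime(2024, 1, 1, tzinfo=timezone.utc)
jan_31 = datetime(2024, 1, 31, tzinfo=timezone.utc)
registry.apply_materialization("user_features", jan_1, jan_31)

# ...and a later incremental run would start from that recorded end time.
next_start = registry.most_recent_end_time("user_features")
```

Because the lookup does not care which method recorded an interval, a full backfill followed by incremental runs behaves as a single continuous timeline.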

What happens if an incremental materialization job fails mid-run?

Feast records a materialization interval in the registry only after a feature view has been written successfully, so views that completed before the failure are updated and recorded, while the failed view keeps its previous most recent end time. The next incremental run therefore starts from the last successful end time for that view, and no gap is created. Do not assume the failed write was rolled back: depending on the online store, some rows may already have been written, so run a full materialization over the affected window if you suspect partial writes or data corruption.
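
The recovery behavior can be sketched as a loop that advances a view's end time only after its write succeeds. Everything here is hypothetical: `run_incremental`, the `write_view` callback, and the `last_end` dict are illustration-only names, and continuing with the remaining views after a failure is a choice made in this sketch rather than documented Feast behavior.

```python
from datetime import datetime, timezone
from typing import Callable, Dict, List


def run_incremental(
    views: List[str],
    last_end: Dict[str, datetime],
    write_view: Callable[[str, datetime, datetime], None],
    end: datetime,
) -> List[str]:
    """Materialize each view up to `end`; return the names of views that failed."""
    failed = []
    for view in views:
        start = last_end[view]
        try:
            write_view(view, start, end)
        except Exception:
            failed.append(view)  # leave last_end untouched so the next run retries
            continue
        last_end[view] = end  # record progress only after a successful write
    return failed


def flaky_write(view: str, start: datetime, end: datetime) -> None:
    """Simulated writer that fails for one view."""
    if view == "transaction_features":
        raise RuntimeError("simulated online store outage")


previous_end = datetime(2024, 1, 1, tzinfo=timezone.utc)
new_end = datetime(2024, 1, 2, tzinfo=timezone.utc)
last_end = {"user_features": previous_end, "transaction_features": previous_end}

failed_views = run_incremental(
    ["user_features", "transaction_features"], last_end, flaky_write, new_end
)
```

After this run, the failed view still points at its previous end time, so a retry covers the full missed interval instead of skipping it.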

Does incremental materialization support all offline stores?

Yes. Incremental materialization is a coordination layer in FeatureStore.materialize_incremental that computes time windows and delegates to the provider. The actual data extraction logic is handled by the specific offline store implementation (BigQuery, Snowflake, Redshift, etc.), so incremental materialization works with any supported offline store as long as the underlying store supports time-range queries.
