# Feast Monitoring and Observability: Prometheus Metrics, OpenTelemetry Traces, and Data Quality Validation

> Explore Feast monitoring and observability with Prometheus metrics, OpenTelemetry traces, and data quality validation. Configure your stack via Helm and Kubernetes.

- Repository: [feast-dev/feast](https://github.com/feast-dev/feast)
- Tags: getting-started
- Published: 2026-03-01

---

**Feast provides an observability stack that includes Prometheus-compatible metrics, OpenTelemetry distributed tracing, and experimental data quality monitoring. Metrics and tracing are configurable through Helm values and Kubernetes ServiceMonitors.**

Feast (feast-dev/feast) is an open-source feature store for machine learning with built-in **monitoring and observability** capabilities. It exposes Prometheus metrics from both the feature server and the operator, integrates with OpenTelemetry for distributed tracing, and ships an experimental data quality validation module. Metrics and tracing are configured through Helm values when deploying on Kubernetes.

## Core Observability Components

### Prometheus Metrics Collection

Feast exposes **Prometheus-compatible metrics** via `/metrics` endpoints on both the feature server and the Feast operator. These endpoints emit critical telemetry including CPU usage, memory consumption, request latency, and feature-retrieval statistics.

The operator’s metrics endpoint is defined in [`infra/feast-operator/config/prometheus/monitor.yaml`](https://github.com/feast-dev/feast/blob/main/infra/feast-operator/config/prometheus/monitor.yaml), which configures a `ServiceMonitor` resource that Prometheus uses to discover and scrape the controller manager. For the feature server, the Helm chart includes sample monitoring resources in [`infra/charts/feast-feature-server/samples/service-monitor.yaml`](https://github.com/feast-dev/feast/blob/main/infra/charts/feast-feature-server/samples/service-monitor.yaml) that define how Prometheus should scrape the OpenTelemetry Collector and application metrics.

Key metrics exposed include `feast_feature_server_latency_seconds` for request timing and `feast_feature_server_memory_usage` for resource tracking.
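To make the scrape format concrete, here is a minimal Python sketch that filters Feast samples out of a Prometheus text-exposition payload. The payload, its values, and the HELP text are made up for illustration, and the parser assumes samples carry no trailing timestamps; it is not part of Feast.

```python
from typing import Dict


def parse_prometheus_text(text: str, prefix: str = "feast_") -> Dict[str, float]:
    """Return {sample name (with labels): value} for samples starting with `prefix`."""
    samples: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # Assumption: no trailing timestamp, so the value is the last field.
        name, _, value = line.rpartition(" ")
        if name.startswith(prefix):
            samples[name] = float(value)
    return samples


# Illustrative payload; the numbers and HELP text are invented.
SAMPLE = """\
# HELP feast_feature_server_memory_usage Memory usage of the server
# TYPE feast_feature_server_memory_usage gauge
feast_feature_server_memory_usage 0.42
# TYPE process_open_fds gauge
process_open_fds 12
"""

metrics = parse_prometheus_text(SAMPLE)
print(metrics)
```

To try this against a running feature server, fetch the body of its `/metrics` endpoint (for example with `urllib.request.urlopen`) and pass the decoded text to `parse_prometheus_text`; the host and port depend on your deployment.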

### Distributed Tracing with OpenTelemetry

Feast integrates with the **OpenTelemetry Collector** to capture distributed traces and structured logs across the feature retrieval pipeline. The Python SDK supports auto-instrumentation through Kubernetes annotations, requiring no code changes to enable tracing.

When you deploy Feast with the OpenTelemetry Collector, traces are forwarded via OTLP (OpenTelemetry Protocol) to compatible backends such as Jaeger, Zipkin, or Tempo. Configuration details and deployment patterns are documented in [`docs/getting-started/components/open-telemetry.md`](https://github.com/feast-dev/feast/blob/main/docs/getting-started/components/open-telemetry.md).

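To enable auto-instrumentation, add the following annotation to the pod template (`spec.template.metadata`) of your deployment manifest. The annotation is acted on by the OpenTelemetry Operator, so the operator and an `Instrumentation` resource must already be present in the cluster: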

```yaml
metadata:
  annotations:
    instrumentation.opentelemetry.io/inject-python: "true"
```
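If you generate manifests programmatically, the same annotation can be applied with a small helper. The sketch below assumes the annotation belongs on the pod template, which is where the OpenTelemetry Operator looks for it; the Deployment and its names are hypothetical placeholders.

```python
import copy
from typing import Any, Dict

INJECT_PYTHON = "instrumentation.opentelemetry.io/inject-python"


def with_python_instrumentation(deployment: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a Deployment manifest with the annotation on its pod template."""
    patched = copy.deepcopy(deployment)  # leave the caller's manifest untouched
    template = patched.setdefault("spec", {}).setdefault("template", {})
    annotations = template.setdefault("metadata", {}).setdefault("annotations", {})
    annotations[INJECT_PYTHON] = "true"  # annotation values must be strings
    return patched


# Hypothetical minimal Deployment; all names are placeholders.
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "feast-client"},
    "spec": {
        "template": {
            "metadata": {"labels": {"app": "feast-client"}},
            "spec": {"containers": []},
        }
    },
}

patched = with_python_instrumentation(deployment)
print(patched["spec"]["template"]["metadata"]["annotations"])
```

The helper returns a deep copy so the original manifest can still be diffed against the patched one.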

### Data Quality Monitoring (Experimental)

For validating training and serving datasets, Feast includes an experimental **data quality monitoring** system built on Great Expectations. This implementation, located in the `dqm/` package, tracks data drift and skew between training sets and live serving features.

The system validates datasets against predefined expectations and surfaces quality metrics that can be consumed by your existing monitoring infrastructure. Reference documentation for this feature is available in [`docs/reference/dqm.md`](https://github.com/feast-dev/feast/blob/main/docs/reference/dqm.md).
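The idea of validating a dataset against expectations derived from a reference can be illustrated in plain Python. This is a conceptual stand-in only: it uses neither Feast's `dqm/` package nor Great Expectations, the profile is a simple min/max range per column, and the feature values are invented.

```python
from typing import Dict, List, Tuple

Row = Dict[str, float]
Profile = Dict[str, Tuple[float, float]]


def build_profile(reference: List[Row]) -> Profile:
    """Record the observed (min, max) of each column in the reference dataset."""
    columns = reference[0].keys()
    return {
        c: (min(r[c] for r in reference), max(r[c] for r in reference))
        for c in columns
    }


def validate(batch: List[Row], profile: Profile) -> List[str]:
    """Return one message per value that falls outside the profiled range."""
    failures = []
    for i, row in enumerate(batch):
        for column, (low, high) in profile.items():
            if not low <= row[column] <= high:
                failures.append(
                    f"row {i}: {column}={row[column]} outside [{low}, {high}]"
                )
    return failures


# Made-up feature values for illustration.
training = [{"trip_distance": 1.2}, {"trip_distance": 8.5}, {"trip_distance": 3.0}]
serving = [{"trip_distance": 2.4}, {"trip_distance": 40.0}]

failures = validate(serving, build_profile(training))
print(failures)
```

A real expectation suite would typically cover distributions, null rates, and types rather than only value ranges; the range check is just the simplest case of comparing serving data to a training-time reference.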

## Kubernetes-Native Monitoring Setup

### Enabling Metrics in Helm

You activate the observability stack through Helm values when deploying the Feast feature server. The configuration toggles expose the necessary endpoints and configure the OpenTelemetry Collector forwarding address.

Here is a sample [`values.yaml`](https://github.com/feast-dev/feast/blob/main/infra/charts/feast-feature-server/values.yaml) configuration:

```yaml
metrics:
  enabled: true                # Expose Prometheus metrics endpoints

  otelCollector:
    endpoint: "otel-collector.default.svc.cluster.local:4317"
    headers:
      api-key: "YOUR_API_KEY"  # Optional authentication header
```

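Deploy from a checkout of the repository, passing your `feature_store.yaml` as a base64-encoded value:

```bash
# Run from the repository root; feature_store.yaml is your own config file.
helm install feast-release infra/charts/feast-feature-server \
  --set metrics.enabled=true \
  --set feature_store_yaml_base64="$(base64 < feature_store.yaml | tr -d '\n')"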
```
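As its name suggests, `feature_store_yaml_base64` carries the base64-encoded contents of a `feature_store.yaml`. If you build the Helm arguments from a script, a minimal Python sketch of the encoding step might look like this; the YAML contents below are a hypothetical placeholder, so substitute your own file.

```python
import base64

# Hypothetical feature_store.yaml contents; substitute your own file.
FEATURE_STORE_YAML = """\
project: my_project
provider: local
registry: data/registry.db
"""


def encode_feature_store_yaml(text: str) -> str:
    """Base64-encode the config as a single line with no wrapping."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


encoded = encode_feature_store_yaml(FEATURE_STORE_YAML)
print(f"--set feature_store_yaml_base64={encoded}")
```

`base64.b64encode` never inserts line breaks, which avoids the wrapped output some command-line `base64` tools produce by default.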

### Configuring ServiceMonitors

For Prometheus Operator users, Feast provides sample `ServiceMonitor` resources that automate metric discovery.

The following configuration from [`infra/feast-operator/config/prometheus/monitor.yaml`](https://github.com/feast-dev/feast/blob/main/infra/feast-operator/config/prometheus/monitor.yaml) sets up monitoring for the Feast operator:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    control-plane: controller-manager
    app.kubernetes.io/name: feast-operator
  name: controller-manager-metrics-monitor
  namespace: system
spec:
  endpoints:
    - path: /metrics
      port: https
      scheme: https
      bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
      tlsConfig:
        insecureSkipVerify: true
  selector:
    matchLabels:
      control-plane: controller-manager
```

For the OpenTelemetry Collector itself, use the sample from [`infra/charts/feast-feature-server/samples/service-monitor.yaml`](https://github.com/feast-dev/feast/blob/main/infra/charts/feast-feature-server/samples/service-monitor.yaml):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app: feast
  name: otel-sm
spec:
  endpoints:
    - port: metrics
  namespaceSelector:
    matchNames:
      - <namespace>
  selector:
    matchLabels:
      app.kubernetes.io/component: opentelemetry-collector
      app.kubernetes.io/managed-by: opentelemetry-operator
```
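A `ServiceMonitor` only picks up Services whose labels satisfy its `selector.matchLabels`, so a mismatch there is a common reason for missing scrape targets. The sketch below mirrors that equality-based matching rule in Python; the Service label sets are hypothetical examples, not taken from the Feast charts.

```python
from typing import Dict


def selector_matches(match_labels: Dict[str, str], labels: Dict[str, str]) -> bool:
    """True when every matchLabels pair is present, with the same value, on the object."""
    return all(labels.get(key) == value for key, value in match_labels.items())


# matchLabels copied from the collector ServiceMonitor above.
monitor_selector = {
    "app.kubernetes.io/component": "opentelemetry-collector",
    "app.kubernetes.io/managed-by": "opentelemetry-operator",
}

# Hypothetical Service labels, for illustration only.
collector_service = {
    "app.kubernetes.io/component": "opentelemetry-collector",
    "app.kubernetes.io/managed-by": "opentelemetry-operator",
    "app.kubernetes.io/name": "otel-collector",
}
unrelated_service = {"app.kubernetes.io/component": "feature-server"}

print(selector_matches(monitor_selector, collector_service))
print(selector_matches(monitor_selector, unrelated_service))
```

Extra labels on the Service do not prevent a match; only the pairs listed in `matchLabels` are compared.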

## Instrumenting the Python SDK

When you enable the `instrumentation.opentelemetry.io/inject-python: "true"` annotation on your Feast client pods, the OpenTelemetry Python agent automatically instruments the SDK at runtime. This injects tracing context into feature store requests and exposes additional runtime metrics without requiring changes to your application code.

The instrumentation captures end-to-end request flows from the client through the feature server to the underlying data stores, making it possible to diagnose latency bottlenecks and error sources across the entire feature retrieval path.
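To show what "tracing context" looks like on the wire, here is a minimal sketch of the W3C Trace Context `traceparent` header, assuming the default W3C propagator is in use. The helper names are hypothetical; in practice the injected agent creates and reads this header for you.

```python
import re
import secrets
from typing import NamedTuple, Optional


class TraceParent(NamedTuple):
    trace_id: str   # 32 lowercase hex chars, shared by every span in the trace
    parent_id: str  # 16 lowercase hex chars, identifies the caller's span
    sampled: bool


_PATTERN = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_traceparent(sampled: bool = True) -> str:
    """Build a version-00 traceparent header value with random ids."""
    flags = "01" if sampled else "00"
    return f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-{flags}"


def parse_traceparent(value: str) -> Optional[TraceParent]:
    """Parse a version-00 header value; return None if it is malformed."""
    match = _PATTERN.match(value)
    if match is None:
        return None
    trace_id, parent_id, flags = match.groups()
    # All-zero ids are invalid according to the specification.
    if set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    return TraceParent(trace_id, parent_id, bool(int(flags, 16) & 0x01))


header = new_traceparent()
parsed = parse_traceparent(header)
print(header, parsed)
```

Because every hop forwards the same `trace_id`, a tracing backend can stitch the client, feature server, and data store spans into one end-to-end trace.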

## Summary

- **Prometheus integration**: Feast exposes `/metrics` endpoints on both the feature server and operator, with ready-to-use `ServiceMonitor` definitions in [`infra/feast-operator/config/prometheus/monitor.yaml`](https://github.com/feast-dev/feast/blob/main/infra/feast-operator/config/prometheus/monitor.yaml) and the feature server chart samples.
- **OpenTelemetry support**: Distributed tracing and structured logging are available via the OpenTelemetry Collector, configured through Helm values and enabled via Kubernetes pod annotations.
- **Data quality validation**: Experimental Great Expectations integration in the `dqm/` package provides drift and skew detection for training and serving datasets.
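- **Zero-code instrumentation**: The Python SDK supports auto-instrumentation through the `instrumentation.opentelemetry.io/inject-python` annotation, which adds trace context and runtime metrics without application code changes. Server-side metrics such as `feast_feature_server_latency_seconds` come from the feature server's `/metrics` endpoint.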

## Frequently Asked Questions

### How do I enable Prometheus metrics in Feast?

Set `metrics.enabled=true` in your Helm [`values.yaml`](https://github.com/feast-dev/feast/blob/main/infra/charts/feast-feature-server/values.yaml) file when deploying the feature server. This exposes the `/metrics` endpoint on the feature server. If you use the Prometheus Operator, also apply the `ServiceMonitor` resources so Prometheus discovers the targets: [`infra/feast-operator/config/prometheus/monitor.yaml`](https://github.com/feast-dev/feast/blob/main/infra/feast-operator/config/prometheus/monitor.yaml) for the operator and [`infra/charts/feast-feature-server/samples/service-monitor.yaml`](https://github.com/feast-dev/feast/blob/main/infra/charts/feast-feature-server/samples/service-monitor.yaml) for the collector.

### What tracing backends does Feast support?

Feast sends telemetry to the OpenTelemetry Collector, so it works with any backend the collector can export to. Common choices include Jaeger, Zipkin, AWS X-Ray, and Grafana Tempo. The `otelCollector.endpoint` Helm value sets the collector address that Feast forwards to; the collector's own exporter configuration determines which backend receives the traces.

### How does Feast handle data quality monitoring?

Feast includes an experimental data quality monitoring system in the `dqm/` package that integrates with Great Expectations. This system validates datasets against predefined expectations and tracks statistical drift between training data and live serving features. Documentation is available in [`docs/reference/dqm.md`](https://github.com/feast-dev/feast/blob/main/docs/reference/dqm.md), though this feature is not yet considered production-stable.

### Is OpenTelemetry integration mandatory for Feast monitoring?

No, OpenTelemetry is optional. You can run Feast with only Prometheus metrics enabled by setting `metrics.enabled=true` while omitting the `otelCollector` configuration. However, enabling both provides the most complete observability coverage, correlating metric spikes with distributed trace data for faster incident resolution.