# DetectionsSmoother Implementation for Temporal Stability in Roboflow Supervision

> Enhance temporal stability in Roboflow Supervision with DetectionsSmoother. This implementation smooths detection jitter by averaging bounding boxes over a rolling window per tracker ID.

- Repository: [Roboflow/supervision](https://github.com/roboflow/supervision)
- Tags: deep-dive
- Published: 2026-04-06

---

**The `DetectionsSmoother` class reduces detection jitter by maintaining a rolling window of bounding box coordinates per tracker ID and averaging them over consecutive video frames.**

The `DetectionsSmoother` utility provides temporal stability for object detection pipelines without requiring complex tracking reimplementation. Implemented in the `roboflow/supervision` repository, this class works alongside existing trackers like **ByteTrack** to smooth bounding box coordinates and confidence scores across time. It operates track-by-track using a fixed-length history buffer, making it ideal for stabilizing noisy detector outputs in video processing applications.

## Core Architecture and State Management

The smoother maintains isolated state for each tracked object using a specialized data structure defined in [`src/supervision/detection/tools/smoother.py`](https://github.com/roboflow/supervision/blob/main/src/supervision/detection/tools/smoother.py). During initialization, the constructor creates a `defaultdict[int, deque[Detections | None]]` that maps each `tracker_id` to a bounded queue of detection history.

```python

# Conceptual representation of internal state

self.tracks: defaultdict[int, deque] = defaultdict(
    lambda: deque(maxlen=self.length)
)

```

The default **history length** is set to `5` frames, though this is configurable via the constructor. This circular buffer automatically discards the oldest detection when the capacity is exceeded, ensuring constant memory usage regardless of video duration.
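The eviction behavior described above can be seen with the standard library alone. This is a minimal sketch, not the library's code: the strings stand in for per-frame detection slices, and `HISTORY_LENGTH` and the tracker ID `7` are illustrative values.

```python
from collections import defaultdict, deque

HISTORY_LENGTH = 3  # illustrative; the article states the default is 5

# One bounded buffer per tracker ID, created lazily on first access.
tracks = defaultdict(lambda: deque(maxlen=HISTORY_LENGTH))

for frame_index in range(5):
    # Strings stand in for single-detection slices.
    tracks[7].append(f"frame-{frame_index}")

history = list(tracks[7])
print(history)  # ['frame-2', 'frame-3', 'frame-4']
```

Once the buffer is full, each append drops the oldest entry, so the per-track memory footprint stays bounded by the window length.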

## The Update and Smoothing Pipeline

### Updating Frame History

The `update_with_detections` method processes incoming detections frame-by-frame. For each detection in the current frame, it extracts the `tracker_id` and appends a single-detection slice to the corresponding deque.

As implemented in [`src/supervision/detection/tools/smoother.py`](https://github.com/roboflow/supervision/blob/main/src/supervision/detection/tools/smoother.py) (lines 14-19), the method handles the per-track update logic:

```python
for detection_idx in range(len(detections)):
    # Look up which track this detection belongs to
    tracker_id = detections.tracker_id[detection_idx]
    # Integer indexing yields a single-detection slice
    detection = detections[detection_idx]
    self.tracks[tracker_id].append(detection)
```

When a track present in previous frames disappears in the current frame, the smoother appends a `None` placeholder to that track's history to maintain temporal alignment (lines 20-23). Tracks whose history consists only of `None` entries are removed from the state dictionary so that stale tracks do not accumulate (lines 24-27).
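The three phases described above (append, pad with `None`, purge) can be sketched with plain Python containers. This is a simplified stand-in under stated assumptions: `update_tracks` is a hypothetical helper, and box tuples replace `Detections` slices.

```python
from collections import defaultdict, deque


def update_tracks(tracks, frame):
    """Update histories; `frame` maps tracker_id -> box for the current frame."""
    # 1. Append the current box for every track seen in this frame.
    for tracker_id, box in frame.items():
        tracks[tracker_id].append(box)
    # 2. Pad known tracks that were not seen with a None placeholder.
    for tracker_id in tracks:
        if tracker_id not in frame:
            tracks[tracker_id].append(None)
    # 3. Purge tracks whose entire history is placeholders.
    for tracker_id in list(tracks):
        if all(entry is None for entry in tracks[tracker_id]):
            del tracks[tracker_id]


tracks = defaultdict(lambda: deque(maxlen=2))
update_tracks(tracks, {1: (0, 0, 10, 10), 2: (5, 5, 8, 8)})
update_tracks(tracks, {1: (2, 2, 12, 12)})  # track 2 missed once
after_one_miss = list(tracks[2])
update_tracks(tracks, {1: (4, 4, 14, 14)})  # track 2 missed twice
track_2_alive = 2 in tracks

print(after_one_miss)  # [(5, 5, 8, 8), None]
print(track_2_alive)   # False
```

With a window of two frames, a track survives one missed frame and is purged after the second, because by then every slot in its buffer holds a placeholder.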

### Computing Smoothed Detections

The `get_track` method (lines 35-42) computes the smoothed representation for a single track by filtering out `None` values and averaging the remaining detection parameters:

1. **Bounding box averaging**: The `xyxy` arrays from all valid detections are stacked and averaged along the time axis
2. **Confidence smoothing**: Confidence scores undergo the same averaging operation
3. **Result construction**: A deep-copied `Detections` instance is returned with the averaged values

The `get_smoothed_detections` method (lines 45-52) aggregates these single-track results into a consolidated `Detections` object. If no detections survive the merge operation, the method returns an empty detections object with a properly initialized `tracker_id` array to maintain type consistency (lines 53-56).
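The per-track averaging step can be illustrated without the library. The sketch below is an assumption-laden simplification: `average_track` is a hypothetical function, and each history entry is a `(box, confidence)` tuple rather than a `Detections` object.

```python
def average_track(history):
    """Average box coordinates and confidence over the non-None entries."""
    valid = [entry for entry in history if entry is not None]
    if not valid:
        return None  # nothing to smooth for this track
    count = len(valid)
    boxes = [box for box, _ in valid]
    # zip(*boxes) groups the x1, y1, x2, y2 values across frames.
    mean_box = tuple(sum(coords) / count for coords in zip(*boxes))
    mean_confidence = sum(conf for _, conf in valid) / count
    return mean_box, mean_confidence


history = [((0, 0, 10, 10), 0.5), None, ((2, 2, 12, 12), 0.7)]
smoothed_box, smoothed_confidence = average_track(history)
print(smoothed_box)  # (1.0, 1.0, 11.0, 11.0)
```

The placeholder in the middle of the history is skipped, so the mean is taken over the two valid frames only.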

## Implementation Details in supervision/detection/tools/smoother.py

According to the source code in the `roboflow/supervision` repository, the smoothing logic relies on the **`Detections`** class defined in [`src/supervision/detection/core.py`](https://github.com/roboflow/supervision/blob/main/src/supervision/detection/core.py). The implementation specifically requires that input detections contain valid `tracker_id` assignments, which are typically provided by tracker implementations such as **ByteTrack** ([`src/supervision/tracker/byte_tracker.py`](https://github.com/roboflow/supervision/blob/main/src/supervision/tracker/byte_tracker.py)).

The averaging mechanism uses NumPy operations to compute element-wise means across the detection history. This approach effectively implements a **moving average filter** on the bounding box coordinates, reducing high-frequency jitter while preserving legitimate object motion.

Key implementation characteristics:

- **Track isolation**: Each `tracker_id` maintains independent history buffers
- **Missing data handling**: `None` placeholders keep each history aligned with the frame sequence while an object is temporarily undetected
- **Automatic cleanup**: Dead tracks are purged when all historical entries are `None`
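To make the moving average filter concrete, here is a small sketch applied to one jittery coordinate. It is an illustration only: `moving_average` is a hypothetical helper, and the alternating values are made-up data, not measurements.

```python
from collections import deque


def moving_average(values, length):
    """Return the running mean of `values` over a window of `length` samples."""
    window = deque(maxlen=length)
    smoothed = []
    for value in values:
        window.append(value)
        smoothed.append(sum(window) / len(window))
    return smoothed


# A box edge that flickers by two pixels on every frame.
raw_x1 = [100, 102, 100, 102, 100, 102]
smoothed_x1 = moving_average(raw_x1, length=2)
print(smoothed_x1)  # [100.0, 101.0, 101.0, 101.0, 101.0, 101.0]
```

Frame-to-frame flicker cancels out once the window is full, which is the effect the smoother aims for on each of the four box coordinates.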

## Practical Code Examples

### Video Processing Pipeline

The following example demonstrates integrating `DetectionsSmoother` with YOLO and ByteTrack for stable video annotation:

```python
import supervision as sv
from ultralytics import YOLO

video_path = "<YOUR_VIDEO>"
video_info = sv.VideoInfo.from_video_path(video_path)
frame_gen = sv.get_video_frames_generator(video_path)

model = YOLO("<YOLO_WEIGHTS>")
tracker = sv.ByteTrack(frame_rate=video_info.fps)
smoother = sv.DetectionsSmoother(length=3)  # 3-frame history

box_annotator = sv.BoxAnnotator()

with sv.VideoSink("out.mp4", video_info=video_info) as sink:
    for frame in frame_gen:
        result = model(frame)[0]
        detections = sv.Detections.from_ultralytics(result)

        # Assign stable IDs, then smooth temporally
        detections = tracker.update_with_detections(detections)
        detections = smoother.update_with_detections(detections)

        annotated = box_annotator.annotate(frame.copy(), detections)
        sink.write_frame(annotated)
```

### Synthetic Data Verification

For testing or demonstration purposes, you can verify the averaging behavior with synthetic inputs:

```python
import numpy as np
import supervision as sv

smoother = sv.DetectionsSmoother(length=2)

d1 = sv.Detections(
    xyxy=np.array([[0, 0, 10, 10]]),
    confidence=np.array([0.5]),
    tracker_id=np.array([1])
)
d2 = sv.Detections(
    xyxy=np.array([[2, 2, 12, 12]]),
    confidence=np.array([0.7]),
    tracker_id=np.array([1])
)

print(smoother.update_with_detections(d1).xyxy)  # [[0, 0, 10, 10]]
print(smoother.update_with_detections(d2).xyxy)  # [[1, 1, 11, 11]] (averaged)
```

## Summary

The `DetectionsSmoother` implementation in `roboflow/supervision` provides robust temporal stability through these key mechanisms:

- **Fixed-length circular buffers** store per-track detection history using `deque` objects with configurable `length` parameters
- **Moving average filtering** reduces coordinate jitter by averaging `xyxy` bounding boxes and confidence scores across frames
- **Occlusion handling** uses `None` placeholders to maintain temporal alignment when objects temporarily disappear
- **Automatic memory management** removes stale tracks when all history entries become `None`
- **Tracker agnostic design** works with any detection pipeline that provides consistent `tracker_id` values, such as ByteTrack

## Frequently Asked Questions

### What is the default history length in DetectionsSmoother?

The default history length is **5 frames**, defined in the constructor of [`src/supervision/detection/tools/smoother.py`](https://github.com/roboflow/supervision/blob/main/src/supervision/detection/tools/smoother.py). You can customize this via the `length` parameter when instantiating the class, with shorter windows providing faster response to motion changes and longer windows offering greater stability.

### How does DetectionsSmoother handle temporary occlusions or missing detections?

When a tracked object disappears from the current frame, the smoother inserts a `None` placeholder into that track's history deque rather than removing the track entirely. This maintains temporal alignment so that when the object reappears, the averaging calculation correctly spans the gap. Tracks are only removed from internal state when every entry in their history becomes `None`.

### Can DetectionsSmoother work without a tracker?

No, `DetectionsSmoother` requires that input detections contain valid `tracker_id` values assigned by a tracking algorithm such as ByteTrack. The smoother does not perform data association or identity preservation itself; it only averages coordinates for detections sharing the same pre-assigned tracker ID.

### Does temporal smoothing affect detection latency?

The smoothing process introduces **zero frame latency** in the sense that no frames are held back: each call returns output for the current frame using only history already received. However, the output is the average of recent frames rather than the instantaneous detection, so smoothed boxes trail behind fast-moving objects. For applications requiring real-time responsiveness with minimal lag, use a shorter `length` parameter (e.g., 2-3 frames) to balance stability against immediacy.
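The trade-off can be quantified for the simplest case, an object moving at constant speed. The sketch below is a toy model, not library code: `steady_state_lag` is a hypothetical helper, and the 10-pixels-per-frame velocity is an assumed value.

```python
from collections import deque


def steady_state_lag(length, velocity=10.0, frames=20):
    """Gap between the true position and its windowed mean after `frames` frames."""
    window = deque(maxlen=length)
    position = 0.0
    for frame_index in range(frames):
        position = velocity * frame_index  # true coordinate in this frame
        window.append(position)
    return position - sum(window) / len(window)


lag_by_length = {length: steady_state_lag(length) for length in (1, 2, 3, 5)}
print(lag_by_length)  # {1: 0.0, 2: 5.0, 3: 10.0, 5: 20.0}
```

Under this model the averaged coordinate trails the true one by `velocity * (length - 1) / 2`, so a 5-frame window lags by two frames' worth of motion while a 2-frame window lags by half a frame.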