How to Configure Multi-Class Object Tracking in ByteTrack

ByteTrack is class-agnostic: to track multiple classes, pass detections that include class identifiers. The tracker works on bounding box geometry and confidence scores and does not separate objects by category internally.

ByteTrack is a fast, online multi-object tracker widely used in computer vision pipelines. When implemented through the supervision library, it processes raw detection tensors to maintain persistent object identities across video frames. Because the tracker focuses exclusively on bounding box coordinates and confidence values in src/supervision/tracker/byte_tracker/core.py, it naturally supports multi-class scenarios without requiring class-specific configuration.

How ByteTrack Handles Classes Internally

The core tracking logic does not reference class labels during the association process. According to the source implementation, the ByteTrack.__init__ method (lines 43-50) initializes Kalman filter objects and threshold parameters, while the update pipeline processes all detections uniformly regardless of category.

In ByteTrack.update_with_tensors (lines 89-100), detections with scores exceeding track_activation_threshold are converted to STrack objects defined in single_object_track.py. The Kalman prediction step in STrack.multi_predict (lines 63-78) operates purely on bounding box dynamics. The association phase in matching.py (lines 44-75) builds cost matrices using IoU distance and fused detection scores, applying the Hungarian algorithm via matching.linear_assignment without considering class labels.
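To make the idea of class-agnostic association concrete, here is a minimal pure-Python sketch. It is not the supervision implementation: the function names and the max_cost cutoff are hypothetical, and a simple greedy matcher stands in for the Hungarian assignment and score fusion described above. The point it illustrates is that the cost depends only on box overlap, so a track can be paired with a detection of any class.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_match(track_boxes, det_boxes, max_cost=0.8):
    """Pair tracks and detections by ascending IoU distance (1 - IoU).

    Greedy stand-in for Hungarian assignment (an assumption made for brevity);
    class labels are never consulted.
    """
    candidates = sorted(
        (1.0 - iou(t, d), ti, di)
        for ti, t in enumerate(track_boxes)
        for di, d in enumerate(det_boxes)
    )
    used_tracks, used_dets, matches = set(), set(), []
    for cost, ti, di in candidates:
        if cost > max_cost or ti in used_tracks or di in used_dets:
            continue
        used_tracks.add(ti)
        used_dets.add(di)
        matches.append((ti, di))
    return sorted(matches)


# Hypothetical data: one existing track and two detections of different classes.
track_boxes = [(10, 10, 50, 50)]
det_boxes = [(12, 11, 52, 51), (200, 200, 240, 240)]
det_classes = [2, 0]  # carried alongside the boxes, never read by the matcher

matches = greedy_match(track_boxes, det_boxes)
```

Here the track pairs with the first detection purely because the boxes overlap; det_classes plays no role in the decision.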

The class_id field from the Detections object is carried through the pipeline and mapped back to output in ByteTrack.update_with_detections (lines 28-34), but it never influences track creation or matching decisions in the internal logic.

Basic Multi-Class Tracking Configuration

To track multiple classes simultaneously, instantiate ByteTrack with your desired thresholds and ensure your detector provides class information in the Detections object.

import supervision as sv
from ultralytics import YOLO
import numpy as np

model = YOLO("yolov8s.pt")
tracker = sv.ByteTrack(
    track_activation_threshold=0.25,
    minimum_matching_threshold=0.8,
    minimum_consecutive_frames=1,
    frame_rate=30
)

def process_frame(frame: np.ndarray) -> np.ndarray:
    results = model(frame)[0]
    detections = sv.Detections.from_ultralytics(results)
    detections = tracker.update_with_detections(detections)
    
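    # Build one "<class name> #<track id>" label per tracked detection.
    labels = [
        f"{results.names[int(cls)]} #{tid}"
        for cls, tid in zip(detections.class_id, detections.tracker_id)
    ]

    # Draw the boxes, then overlay the class/track labels.
    annotated = sv.BoxAnnotator().annotate(frame.copy(), detections)
    annotated = sv.LabelAnnotator().annotate(annotated, detections, labels=labels)
    return annotated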

The update_with_detections method returns a modified Detections object where the tracker_id array contains persistent IDs for each bounding box, while the original class_id array preserves the category information from your detector. Both arrays maintain index alignment, allowing direct correlation between track IDs and class labels.

Advanced Multi-Class Workflows

Building Per-Class Track Collections

Since ByteTrack does not maintain separate buffers for different classes, you must group tracks by class in post-processing. Use the class_id and tracker_id arrays to organize active tracks into category-specific collections.

from collections import defaultdict

active_tracks_by_class = defaultdict(set)

for cls_id, trk_id in zip(detections.class_id, detections.tracker_id):
    if trk_id != -1:
        active_tracks_by_class[int(cls_id)].add(int(trk_id))

for cls_id, tracks in active_tracks_by_class.items():
    print(f"Class {cls_id}: {len(tracks)} active tracks")

This approach leverages the fact that ByteTrack assigns unique tracker IDs across all classes, letting you filter and count per-class statistics after the tracking update completes.
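The same grouping can be extended across frames to count how many distinct objects of each class have appeared so far. The sketch below is an illustration only: accumulate_tracks is a hypothetical helper, and plain Python lists stand in for the class_id and tracker_id arrays returned on each frame.

```python
from collections import defaultdict


def accumulate_tracks(frames):
    """Collect every tracker ID seen per class across a sequence of frames.

    `frames` is an iterable of (class_ids, tracker_ids) pairs, one per frame,
    mirroring the index-aligned class_id / tracker_id arrays.
    """
    seen = defaultdict(set)
    for class_ids, tracker_ids in frames:
        for cls_id, trk_id in zip(class_ids, tracker_ids):
            seen[int(cls_id)].add(int(trk_id))
    return seen


# Hypothetical outputs from two consecutive frames.
frames = [
    ([0, 0, 2], [1, 2, 3]),
    ([0, 2, 2], [1, 3, 4]),
]
totals = {cls: len(ids) for cls, ids in accumulate_tracks(frames).items()}
```

Because sets ignore repeats, a track that persists over many frames is counted once for its class.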

Configuring Class-Specific Thresholds

For scenarios requiring different tracking parameters per category, instantiate separate ByteTrack objects for each class and process detection subsets individually.

trackers = {
    0: sv.ByteTrack(track_activation_threshold=0.3),  # pedestrians
    1: sv.ByteTrack(track_activation_threshold=0.2),  # vehicles
}

def process_frame(frame):
    results = model(frame)[0]
    detections = sv.Detections.from_ultralytics(results)
    tracked_per_class = []

    for cls_id, tracker in trackers.items():
        # Route only this class's detections to its dedicated tracker.
        class_dets = detections[detections.class_id == cls_id]
        tracked_per_class.append(tracker.update_with_detections(class_dets))

    # Detections.merge combines the per-class results into one object.
    return sv.Detections.merge(tracked_per_class)

Each tracker instance maintains its own internal Kalman filters and track buffers, allowing fine-grained control over activation thresholds and matching behavior for specific object categories.
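One practical consequence, stated here as an assumption to verify against your installed supervision version: independent tracker instances may each number their tracks from the same starting point, so a tracker_id alone might not be unique once per-class results are combined. A cautious approach is to identify objects by the pair (class_id, tracker_id). The helper below is a hypothetical sketch using plain lists in place of the Detections arrays.

```python
def composite_keys(class_ids, tracker_ids):
    """Return one (class_id, tracker_id) key per detection."""
    return [(int(c), int(t)) for c, t in zip(class_ids, tracker_ids)]


# Hypothetical combined output: both trackers issued IDs 1 and 2.
class_ids = [0, 0, 1, 1]
tracker_ids = [1, 2, 1, 2]

keys = composite_keys(class_ids, tracker_ids)
unique_objects = len(set(keys))
```

In this example the raw tracker IDs collapse to two distinct values, while the composite keys still distinguish all four objects.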

Summary

  • ByteTrack processes bounding boxes and confidence scores independently of class labels, making it naturally suited for multi-class tracking in supervision.
  • The class_id field in the Detections object is preserved through the tracking pipeline in core.py but does not influence the Kalman prediction or Hungarian matching algorithms in single_object_track.py and matching.py.
  • Per-class statistics and filtering must be implemented in post-processing by grouping tracker_id values according to their corresponding class_id.
  • For class-specific tracking behaviors, initialize separate tracker instances with distinct threshold configurations and process each class subset individually.

Frequently Asked Questions

Does ByteTrack automatically separate tracks by object class?

No. ByteTrack is class-agnostic and treats all detections uniformly during the association phase in matching.py. The algorithm uses IoU-based cost matrices and Kalman predictions from single_object_track.py without referencing class labels. You must handle class-specific logic after calling update_with_detections by filtering the parallel class_id and tracker_id arrays.

Where does the tracker store class information in the output?

The Detections object maintains parallel NumPy arrays: detections.class_id contains the original class index from your detector, and detections.tracker_id contains the track ID assigned by ByteTrack.update_with_detections. These arrays are aligned by index, so you can correlate every tracked object with its category using standard array indexing.

Can I assign different confidence thresholds to different object classes?

Yes, but this requires running multiple tracker instances. Because track_activation_threshold is set at initialization in ByteTrack.__init__, you cannot adjust it per class within a single tracker. Instead, filter detections by class before tracking, route each subset to a dedicated tracker instance with appropriate thresholds, then combine the results using sv.Detections.merge.

How do I handle occlusions between different object classes?

ByteTrack handles occlusions using the same mechanism for all classes: unmatched tracks are kept in a lost state for a configurable number of frames, determined by the lost_track_buffer and frame_rate parameters, before they are removed. (minimum_consecutive_frames controls how many consecutive detections a new track needs before it is confirmed, not how long a lost track survives.) Since the tracker does not distinguish between classes during the matching phase in matching.linear_assignment, occlusions between different classes are resolved purely based on IoU overlap and motion prediction from the shared Kalman filter implementation.
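As a rough illustration of how long a lost track can survive an occlusion, the sketch below assumes the tracker exposes a lost_track_buffer parameter and scales it by frame_rate relative to a 30 fps baseline. Both the helper name and the exact formula are assumptions about the library's behavior, so check them against your installed version before relying on the numbers.

```python
def max_frames_lost(frame_rate: float, lost_track_buffer: int = 30) -> int:
    """Estimate how many frames an unmatched track is kept before removal.

    Assumed scaling: the buffer is defined for 30 fps video and stretched
    or shrunk proportionally for other frame rates.
    """
    return int(frame_rate / 30.0 * lost_track_buffer)


buffer_30fps = max_frames_lost(30)        # baseline frame rate
buffer_60fps = max_frames_lost(60, 30)    # same wall-clock duration, more frames
```

Under this assumption, doubling the frame rate doubles the number of frames a lost track is retained, keeping the tolerated occlusion time roughly constant.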
