How to Configure Multi-Class Object Tracking in ByteTrack
ByteTrack is class-agnostic: to track multiple classes, pass it detections that carry class identifiers. The tracker associates objects using bounding box geometry and confidence scores only, and it does not separate objects by category internally.
ByteTrack is a fast, online multi-object tracker widely used in computer vision pipelines. When implemented through the supervision library, it processes raw detection tensors to maintain persistent object identities across video frames. Because the tracker focuses exclusively on bounding box coordinates and confidence values in src/supervision/tracker/byte_tracker/core.py, it naturally supports multi-class scenarios without requiring class-specific configuration.
How ByteTrack Handles Classes Internally
The core tracking logic does not reference class labels during the association process. According to the source implementation, the ByteTrack.__init__ method (lines 43-50) initializes Kalman filter objects and threshold parameters, while the update pipeline processes all detections uniformly regardless of category.
In ByteTrack.update_with_tensors (lines 89-100), detections with scores exceeding track_activation_threshold are converted to STrack objects defined in single_object_track.py. The Kalman prediction step in STrack.multi_predict (lines 63-78) operates purely on bounding box dynamics. The association phase in matching.py (lines 44-75) builds cost matrices using IoU distance and fused detection scores, applying the Hungarian algorithm via matching.linear_assignment without considering class labels.
The class_id field from the Detections object is carried through the pipeline and mapped back to output in ByteTrack.update_with_detections (lines 28-34), but it never influences track creation or matching decisions in the internal logic.
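To make the class-agnostic behavior concrete, here is a minimal, self-contained sketch of IoU-based association. It is not the supervision implementation: the helper names (iou, associate) and the max_cost gate are hypothetical, Kalman prediction and score fusion are omitted, and a brute-force search over permutations stands in for the Hungarian algorithm, which is only practical for a handful of boxes. The point it illustrates is that class labels never enter the cost.

```python
from itertools import permutations


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(track_boxes, det_boxes, max_cost=0.8):
    """Return (track_index, detection_index) pairs minimizing total IoU distance.

    Hypothetical helper: brute force stands in for the Hungarian algorithm.
    Assumes len(track_boxes) <= len(det_boxes) for brevity.
    """
    cost = [[1.0 - iou(t, d) for d in det_boxes] for t in track_boxes]
    best_total, best_perm = float("inf"), ()
    for perm in permutations(range(len(det_boxes)), len(track_boxes)):
        total = sum(cost[i][j] for i, j in enumerate(perm))
        if total < best_total:
            best_total, best_perm = total, perm
    # Drop pairs whose cost exceeds the gate, similar in spirit to a matching threshold.
    return [(i, j) for i, j in enumerate(best_perm) if cost[i][j] <= max_cost]


# Existing tracks and new detections (made-up values).
# The class lists exist only for the reader; they are never passed to associate().
track_boxes = [(0, 0, 10, 10), (50, 50, 60, 60)]
track_classes = ["person", "car"]
det_boxes = [(51, 50, 61, 60), (1, 0, 11, 10)]
det_classes = ["car", "person"]

matches = associate(track_boxes, det_boxes)
print(matches)
```

Because only box overlap is scored, a detection labeled with a different class than the track it overlaps would be matched just the same.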
Basic Multi-Class Tracking Configuration
To track multiple classes simultaneously, instantiate ByteTrack with your desired thresholds and ensure your detector provides class information in the Detections object.
import numpy as np
import supervision as sv
from ultralytics import YOLO

model = YOLO("yolov8s.pt")
tracker = sv.ByteTrack(
    track_activation_threshold=0.25,
    minimum_matching_threshold=0.8,
    minimum_consecutive_frames=1,
    frame_rate=30,
)
# Create the annotators once rather than on every frame.
box_annotator = sv.BoxAnnotator()
label_annotator = sv.LabelAnnotator()

def process_frame(frame: np.ndarray) -> np.ndarray:
    results = model(frame)[0]
    detections = sv.Detections.from_ultralytics(results)
    # Adds tracker_id; class_id is carried through from the detector.
    detections = tracker.update_with_detections(detections)
    # class_id and tracker_id are index-aligned, so zip pairs them correctly.
    labels = [
        f"{results.names[int(cls)]} #{tid}"
        for cls, tid in zip(detections.class_id, detections.tracker_id)
    ]
    annotated = box_annotator.annotate(frame.copy(), detections)
    return label_annotator.annotate(annotated, detections, labels=labels)
The update_with_detections method returns a modified Detections object where the tracker_id array contains persistent IDs for each bounding box, while the original class_id array preserves the category information from your detector. Both arrays maintain index alignment, allowing direct correlation between track IDs and class labels.
Advanced Multi-Class Workflows
Building Per-Class Track Collections
Since ByteTrack does not maintain separate buffers for different classes, you must group tracks by class in post-processing. Use the class_id and tracker_id arrays to organize active tracks into category-specific collections.
from collections import defaultdict

# Map each class index to the set of track IDs currently assigned to it.
active_tracks_by_class = defaultdict(set)
for cls_id, trk_id in zip(detections.class_id, detections.tracker_id):
    if trk_id != -1:
        active_tracks_by_class[int(cls_id)].add(int(trk_id))

for cls_id, tracks in active_tracks_by_class.items():
    print(f"Class {cls_id}: {len(tracks)} active tracks")
This approach leverages the fact that ByteTrack assigns unique tracker IDs across all classes, letting you filter and count per-class statistics after the tracking update completes.
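Because matching ignores class labels, a detector that changes its label for the same object from frame to frame can produce one tracker_id paired with different class_id values over time. A minimal sketch of one way to stabilize per-class counts, assuming a simple majority vote is acceptable, is shown below. TrackClassVoter is a hypothetical helper, not part of supervision, and the input values are made up.

```python
from collections import Counter, defaultdict


class TrackClassVoter:
    """Hypothetical helper: keeps a per-track tally of observed class IDs."""

    def __init__(self):
        self._votes = defaultdict(Counter)

    def update(self, class_ids, tracker_ids):
        """Record one frame of index-aligned class_id / tracker_id values."""
        for cls_id, trk_id in zip(class_ids, tracker_ids):
            self._votes[int(trk_id)][int(cls_id)] += 1

    def stable_class(self, tracker_id):
        """Most frequently observed class for a track, or None if unseen."""
        votes = self._votes.get(int(tracker_id))
        return votes.most_common(1)[0][0] if votes else None

    def unique_tracks_per_class(self):
        """Count distinct tracks per class using each track's majority class."""
        counts = Counter()
        for trk_id in self._votes:
            counts[self.stable_class(trk_id)] += 1
        return dict(counts)


voter = TrackClassVoter()
# Three frames of made-up data: track 7 is labeled class 2 once and class 0 twice.
voter.update(class_ids=[0, 1], tracker_ids=[7, 8])
voter.update(class_ids=[2, 1], tracker_ids=[7, 8])
voter.update(class_ids=[0, 1], tracker_ids=[7, 8])
print(voter.unique_tracks_per_class())
```

In a real pipeline you would call voter.update(detections.class_id, detections.tracker_id) after each update_with_detections call.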
Configuring Class-Specific Thresholds
For scenarios requiring different tracking parameters per category, instantiate separate ByteTrack objects for each class and process detection subsets individually.
trackers = {
    0: sv.ByteTrack(track_activation_threshold=0.3),  # pedestrians
    1: sv.ByteTrack(track_activation_threshold=0.2),  # vehicles
}

def process_frame(frame):
    results = model(frame)[0]
    detections = sv.Detections.from_ultralytics(results)
    tracked_per_class = []
    for cls_id, tracker in trackers.items():
        # Route only this class's detections to its dedicated tracker.
        class_dets = detections[detections.class_id == cls_id]
        tracked_per_class.append(tracker.update_with_detections(class_dets))
    # Combine the per-class results into a single Detections object.
    return sv.Detections.merge(tracked_per_class)
Each tracker instance maintains its own internal Kalman filters and track buffers, allowing fine-grained control over activation thresholds and matching behavior for specific object categories.
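One caveat with separate instances: depending on the supervision version, each tracker may number its tracks independently, so the same tracker_id could appear under two different classes. A minimal sketch of one way to guard against this, assuming such collisions are possible in your setup, is to identify a track by its (class_id, tracker_id) pair or to fold the pair into a single integer. The function names and the stride value are hypothetical.

```python
def composite_keys(class_ids, tracker_ids):
    """Pair each track ID with its class so IDs from different trackers stay distinct."""
    return [(int(c), int(t)) for c, t in zip(class_ids, tracker_ids)]


def global_ids(class_ids, tracker_ids, stride=1_000_000):
    """Fold class and track ID into one integer.

    Assumes 0 <= tracker_id < stride; the stride is an arbitrary illustrative choice.
    """
    ids = []
    for c, t in zip(class_ids, tracker_ids):
        if not 0 <= int(t) < stride:
            raise ValueError(f"tracker_id {t} is outside [0, {stride})")
        ids.append(int(c) * stride + int(t))
    return ids


# Made-up merged output: both per-class trackers happened to issue ID 1.
class_ids = [0, 1]
tracker_ids = [1, 1]
print(composite_keys(class_ids, tracker_ids))
print(global_ids(class_ids, tracker_ids))
```

The tuple form is the safer default for dictionaries and sets; the integer form is only useful when a downstream tool expects a single numeric ID.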
Summary
- ByteTrack processes bounding boxes and confidence scores independently of class labels, making it naturally suited for multi-class tracking in supervision.
- The class_id field in the Detections object is preserved through the tracking pipeline in core.py but does not influence the Kalman prediction or Hungarian matching algorithms in single_object_track.py and matching.py.
- Per-class statistics and filtering must be implemented in post-processing by grouping tracker_id values according to their corresponding class_id.
- For class-specific tracking behaviors, initialize separate tracker instances with distinct threshold configurations and process each class subset individually.
Frequently Asked Questions
Does ByteTrack automatically separate tracks by object class?
No. ByteTrack is class-agnostic and treats all detections uniformly during the association phase in matching.py. The algorithm uses IoU-based cost matrices and Kalman predictions from single_object_track.py without referencing class labels. You must handle class-specific logic after calling update_with_detections by filtering the parallel class_id and tracker_id arrays.
Where does the tracker store class information in the output?
The Detections object maintains parallel numpy arrays where detections.class_id contains the original class index from your detector and detections.tracker_id contains the assigned track ID from ByteTrack.update_with_detections. These arrays are aligned by index, allowing you to correlate every tracked object with its category using standard array indexing operations.
Can I assign different confidence thresholds to different object classes?
Yes, but this requires running multiple tracker instances. Because track_activation_threshold is set at initialization in ByteTrack.__init__, you cannot adjust it per class within a single tracker. Instead, filter detections by class before tracking, route each subset to a dedicated tracker instance with appropriate thresholds, then combine the results using sv.Detections.merge.
How do I handle occlusions between different object classes?
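ByteTrack handles occlusions using the same mechanism for all classes: unmatched tracks are kept in a lost state for a number of frames determined by the lost_track_buffer and frame_rate parameters, and they are removed if they are not re-matched within that window. Since the tracker does not distinguish between classes during the matching phase in matching.linear_assignment, occlusions between different classes are resolved purely by IoU overlap and motion prediction from the shared Kalman filter implementation. One consequence is that a track can be re-associated with a nearby detection of a different class, so check class_id after the update if that matters for your application.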
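As a rough illustration of how long a lost track survives, the original ByteTrack implementation scales its buffer with the frame rate relative to a 30 fps baseline. The sketch below assumes supervision applies the same formula to its lost_track_buffer parameter; the function name is hypothetical, so check core.py in your installed version before relying on the exact numbers.

```python
def max_frames_lost(frame_rate: float, lost_track_buffer: int = 30) -> int:
    """Frames a lost track is retained, assuming linear scaling from a 30 fps baseline."""
    return int(frame_rate / 30.0 * lost_track_buffer)


# With a 30-frame buffer, the retention window is about one second at any frame rate.
for fps in (15, 30, 60):
    print(fps, max_frames_lost(fps))
```

Under this assumption, raising lost_track_buffer lets tracks survive longer occlusions at the cost of keeping stale tracks around, and that trade-off applies equally to every class.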