Memory Footprint Comparison: Lazy vs Eager Dataset Loading in Roboflow Supervision
Lazy loading maintains a constant memory footprint by storing only file paths and loading images on demand, whereas eager loading stores all image arrays in RAM, causing memory usage to scale linearly with dataset size.
The DetectionDataset class in Roboflow Supervision provides two distinct initialization strategies that fundamentally alter resource consumption. Knowing how their memory footprints differ is critical when processing large computer vision datasets that may contain hundreds of gigabytes of imagery.
Understanding the Two Loading Modes
Lazy Loading: Path-Based Storage
When you instantiate DetectionDataset with a list[str] of file paths, the library operates in lazy loading mode. According to the implementation in src/supervision/dataset/core.py (lines 74-99), the constructor stores only the strings in self.image_paths. The actual pixel data remains on disk until explicitly requested through the _get_image() method.
Memory characteristics:
- Only file paths reside in RAM (typically 50-200 bytes per path)
- Images are loaded via cv2.imread inside _get_image() when accessed during iteration
- Memory footprint equals roughly one image array plus small bookkeeping structures, regardless of total dataset size
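The per-path cost can be checked with a short sketch. It measures a plain Python list of strings with sys.getsizeof, which is only an approximation of what the dataset object holds; the helper name path_list_bytes and the sample paths are hypothetical.

```python
import sys


def path_list_bytes(paths: list[str]) -> int:
    """Approximate bytes held by a list of path strings (list object plus each str)."""
    return sys.getsizeof(paths) + sum(sys.getsizeof(p) for p in paths)


# Hypothetical paths, roughly 100 characters each.
paths = [f"/data/{'x' * 85}/{i:06d}.jpg" for i in range(50_000)]
total = path_list_bytes(paths)
print(f"{len(paths)} paths ≈ {total / 1e6:.1f} MB")
```

The exact figure depends on the Python version and path lengths, but it stays in the megabyte range even for tens of thousands of images.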
Eager Loading: In-Memory Arrays
When you provide a dict[str, np.ndarray] mapping paths to pre-loaded arrays, the dataset stores all images in self._images_in_memory at construction time. This approach eliminates disk I/O during training but dramatically increases RAM requirements.
Memory characteristics:
- All arrays are stored in self._images_in_memory immediately upon instantiation
- Memory usage equals the sum of all image sizes (height × width × channels × bytes per channel value)
- Footprint grows linearly with dataset size N
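The sum described above can be sketched without loading any pixels, using only the array shapes. The function name eager_footprint_bytes and the sample shapes are assumptions for illustration; the estimate ignores the small per-array and dictionary overhead.

```python
from math import prod


def eager_footprint_bytes(shapes, bytes_per_value=1):
    """Sum height * width * channels * bytes_per_value over all image shapes."""
    return sum(prod(shape) * bytes_per_value for shape in shapes)


# Hypothetical dataset: 1,000 Full HD frames plus 500 smaller images, all uint8 with 3 channels.
shapes = [(1080, 1920, 3)] * 1_000 + [(480, 640, 3)] * 500
total = eager_footprint_bytes(shapes)
print(f"{total / 1e9:.2f} GB")  # 6.68 GB
```

Doubling the number of images doubles the result, which is the linear growth in N noted above.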
Implementation Details in Supervision
The distinction between loading strategies is enforced through type checking in the constructor and specialized helper methods.
Constructor Logic
The __init__ method in src/supervision/dataset/core.py initializes the storage containers:
def __init__(self, classes, images, annotations):
    # ...
    self._images_in_memory: dict[str, np.ndarray] = {}
Image Retrieval Mechanism
The _get_image() method (lines 101-108) handles retrieval differently based on the loading mode:
def _get_image(self, image_path):
    if self._images_in_memory:
        return self._images_in_memory[image_path]  # eager path: dict lookup
    image = cv2.imread(image_path)  # lazy path: disk read
    return image
Helper Predicates
Supervision provides inspection functions (lines 279-284) to programmatically detect the current mode:
def is_in_memory(dataset):
    return len(dataset._images_in_memory) > 0 or len(dataset.image_paths) == 0

def is_lazy(dataset):
    return len(dataset._images_in_memory) == 0
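To see how these predicates behave, the following sketch applies them to stand-in objects rather than real DetectionDataset instances. The SimpleNamespace objects are an assumption that exposes only the two attributes the predicates read; no images are involved.

```python
from types import SimpleNamespace


def is_in_memory(dataset) -> bool:
    return len(dataset._images_in_memory) > 0 or len(dataset.image_paths) == 0


def is_lazy(dataset) -> bool:
    return len(dataset._images_in_memory) == 0


# Stand-in objects, not real datasets: they carry only the two attributes read above.
lazy = SimpleNamespace(image_paths=["a.jpg", "b.jpg"], _images_in_memory={})
eager = SimpleNamespace(image_paths=["a.jpg"], _images_in_memory={"a.jpg": object()})
empty = SimpleNamespace(image_paths=[], _images_in_memory={})

for name, ds in [("lazy", lazy), ("eager", eager), ("empty", empty)]:
    print(name, is_lazy(ds), is_in_memory(ds))
```

Note that under these definitions an empty dataset satisfies both predicates, so the two checks are not strict opposites.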
Practical Code Examples
Lazy Loading Example
Pass a list of strings to keep memory usage minimal for large datasets:
import supervision as sv

image_paths = ["img1.jpg", "img2.jpg", "img3.jpg"]
annotations = {p: sv.Detections.empty() for p in image_paths}

ds_lazy = sv.DetectionDataset(
    classes=["cat", "dog"],
    images=image_paths,  # list[str] triggers lazy loading
    annotations=annotations,
)

# Memory footprint remains constant; only the current image is loaded during iteration
for path, img, ann in ds_lazy:
    print(path, img.shape)  # cv2.imread called here, one at a time
Eager Loading Example
Pass a dictionary of arrays when the dataset fits comfortably in RAM:
import cv2
import supervision as sv

image_paths = ["img1.jpg", "img2.jpg", "img3.jpg"]
images = {p: cv2.imread(p) for p in image_paths}  # Load all upfront
annotations = {p: sv.Detections.empty() for p in image_paths}

ds_eager = sv.DetectionDataset(
    classes=["cat", "dog"],
    images=images,  # dict[str, np.ndarray] triggers eager loading
    annotations=annotations,
)

# All three images already resident in RAM; retrieval is O(1) dict lookup
for path, img, ann in ds_eager:
    print(path, img.shape)
Performance and Resource Implications
Lazy loading is optimal when training on datasets that exceed available RAM. Because only self.image_paths (a list of strings) resides in memory, you can work with terabyte-scale collections on modest hardware. The trade-off is disk I/O latency during iteration, as cv2.imread() executes for every access.
Eager loading maximizes iteration speed by storing all data in self._images_in_memory, eliminating disk bottlenecks during training loops. This mode suits smaller datasets that fit entirely in memory but becomes prohibitive when working with high-resolution medical imagery or video frames.
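The I/O trade-off can be illustrated by counting reads instead of timing them. This is a simplified model, not Supervision code: CountingLoader is a hypothetical stand-in for cv2.imread, and the two iterate functions mimic the lazy and eager access patterns described above.

```python
class CountingLoader:
    """Stand-in for a disk read such as cv2.imread; counts how often it is called."""

    def __init__(self):
        self.calls = 0

    def __call__(self, path):
        self.calls += 1
        return f"pixels-of-{path}"  # placeholder for a decoded array


def iterate_lazy(paths, load, epochs):
    for _ in range(epochs):
        for path in paths:
            load(path)  # one read per access, every epoch


def iterate_eager(paths, load, epochs):
    cache = {path: load(path) for path in paths}  # one read per image, upfront
    for _ in range(epochs):
        for path in paths:
            cache[path]  # dict lookup only


paths = [f"img{i}.jpg" for i in range(100)]
lazy_loader, eager_loader = CountingLoader(), CountingLoader()
iterate_lazy(paths, lazy_loader, epochs=3)
iterate_eager(paths, eager_loader, epochs=3)
print(lazy_loader.calls, eager_loader.calls)  # 300 100
```

The lazy pattern pays the read cost once per image per epoch, while the eager pattern pays it once per image in total, at the price of holding every result in memory.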
Summary
- Lazy loading stores only file paths in self.image_paths, yielding constant memory usage regardless of whether the dataset contains 100 or 100,000 images
- Eager loading stores all arrays in self._images_in_memory, causing linear memory growth proportional to total pixel count (height × width × 3 × N)
- The constructor in src/supervision/dataset/core.py automatically selects the mode based on whether you pass a list[str] or dict[str, np.ndarray]
- Use is_lazy() and is_in_memory() helpers (lines 279-284) to programmatically detect the current loading strategy before operations like merge
- Choose lazy loading for large-scale training pipelines and eager loading for small datasets requiring maximum throughput
Frequently Asked Questions
How do I check if my DetectionDataset is using lazy or eager loading?
Supervision provides inspection utilities in src/supervision/dataset/core.py (lines 279-284). The is_lazy() function returns True when len(dataset._images_in_memory) == 0, indicating only file paths are stored. Conversely, is_in_memory() returns True when the dataset contains cached arrays or no image paths, signaling that all data resides in RAM.
Can I convert a lazy dataset to eager loading after instantiation?
While DetectionDataset does not provide a built-in load_all() method, you can manually populate dataset._images_in_memory by iterating through the dataset and assigning the loaded arrays back to the internal dictionary. Each access in lazy mode calls _get_image(), which uses cv2.imread() to load the file on demand. Caching these values transforms the dataset into eager mode, though you must ensure sufficient RAM exists for the complete collection.
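A minimal sketch of that caching step follows. The helper materialize and its load_image parameter are hypothetical names; load_image is assumed to behave like cv2.imread, which returns None when a file cannot be read.

```python
def materialize(image_paths, load_image):
    """Read every path once and return a path -> image mapping.

    Raises if a load returns None, as cv2.imread does for unreadable files.
    """
    images = {}
    for path in image_paths:
        image = load_image(path)
        if image is None:
            raise FileNotFoundError(f"could not read image: {path}")
        images[path] = image
    return images
```

Rather than writing to the private _images_in_memory attribute, it may be cleaner to pass the returned dictionary as images= to a new DetectionDataset, for example materialize(dataset.image_paths, cv2.imread). Reusing the original classes and annotations for that new instance assumes the dataset exposes them as attributes.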
What is the memory overhead of lazy loading for a dataset with 50,000 images?
Lazy loading stores only the file path strings rather than the image arrays. For 50,000 images with average path lengths of 100 characters, the path text alone is about 5 MB, and even with Python's per-object overhead the total stays in the single-digit megabytes regardless of image resolution. In contrast, eager loading the same dataset as 1920×1080×3 uint8 arrays needs about 6.2 MB per image (1920 × 1080 × 3 bytes), or roughly 311 GB for all 50,000 images, making lazy loading essential for large-scale computer vision workflows.
Does lazy loading affect training speed compared to eager loading?
Yes, lazy loading introduces disk I/O and image-decoding overhead since cv2.imread() executes on each access. Eager loading provides faster access through dictionary lookups in self._images_in_memory but requires sufficient RAM to hold the entire dataset. When using NVMe SSDs with sequential access patterns, the disk portion of that cost may be negligible, but traditional HDDs or random access patterns will introduce significant bottlenecks during epoch iteration.