Memory Footprint Comparison: Lazy vs Eager Dataset Loading in Roboflow Supervision

Lazy loading keeps the memory footprint nearly constant by storing only file paths and loading images on demand, whereas eager loading holds every image array in RAM, so memory usage scales linearly with dataset size.

The DetectionDataset class in Roboflow Supervision can be initialized in two ways, and the choice largely determines how much RAM the dataset consumes. The difference matters most for large computer vision datasets, which can contain hundreds of gigabytes of imagery.

Understanding the Two Loading Modes

Lazy Loading: Path-Based Storage

When you instantiate DetectionDataset with a list[str] of file paths, the library operates in lazy loading mode. According to the implementation in src/supervision/dataset/core.py (lines 74-99), the constructor stores only the strings in self.image_paths. The actual pixel data remains on disk until explicitly requested through the _get_image() method.

Memory characteristics:

  • Only file paths reside in RAM (typically 50-200 bytes per path)
  • Images are loaded via cv2.imread inside _get_image() when accessed during iteration
  • Image memory is roughly one decoded array at a time, regardless of total dataset size; the path list and annotations still grow with the number of images, but they are small next to pixel data
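
To get a feel for the path overhead, the sketch below measures a list of path strings with sys.getsizeof. The path pattern and the helper name path_list_bytes are made up for illustration, and the figures are specific to CPython, so treat the result as an estimate rather than a guarantee.

```python
import sys


def path_list_bytes(paths):
    """Approximate RAM held by a list of path strings (CPython-specific)."""
    # getsizeof(list) covers the list object and its pointer array;
    # each string object is counted separately.
    return sys.getsizeof(paths) + sum(sys.getsizeof(p) for p in paths)


# Hypothetical paths; a real dataset would have its own naming scheme.
paths = [f"/data/train/images/img_{i:06d}.jpg" for i in range(50_000)]
total = path_list_bytes(paths)
per_path = total / len(paths)
print(f"{total / 1e6:.1f} MB total, about {per_path:.0f} bytes per path")
```

The total depends on path length and interpreter version, but it stays in the megabyte range even for tens of thousands of images.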

Eager Loading: In-Memory Arrays

When you provide a dict[str, np.ndarray] mapping paths to pre-loaded arrays, the dataset stores all images in self._images_in_memory at construction time. This approach eliminates disk I/O during training but dramatically increases RAM requirements.

Memory characteristics:

  • All arrays stored in self._images_in_memory immediately upon instantiation
  • Memory usage equals the sum of all image sizes (height × width × channels × bytes per channel value)
  • Footprint grows linearly with dataset size N
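
The formula above can be turned into a quick estimate. The helper name eager_bytes is hypothetical, and the sketch assumes every image has the same shape, which real datasets often do not.

```python
def eager_bytes(height, width, channels=3, itemsize=1, count=1):
    """Bytes needed to hold `count` decoded images of identical shape.

    itemsize is bytes per array element (1 for uint8, 4 for float32).
    """
    return height * width * channels * itemsize * count


one_image = eager_bytes(1080, 1920)
dataset = eager_bytes(1080, 1920, count=50_000)
print(f"{one_image / 1e6:.1f} MB per image, {dataset / 1e9:.0f} GB for 50,000 images")
# prints "6.2 MB per image, 311 GB for 50,000 images"
```

Note that the estimate uses decoded array sizes, not the compressed JPEG or PNG sizes on disk, which are usually much smaller.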

Implementation Details in Supervision

The distinction between loading strategies is enforced through type checking in the constructor and specialized helper methods.

Constructor Logic

The __init__ method in src/supervision/dataset/core.py initializes the storage containers:

def __init__(self, classes, images, annotations):
    # ...

    self._images_in_memory: dict[str, np.ndarray] = {}
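
The excerpt above elides the branch that picks a mode. A minimal sketch of that kind of type dispatch, assuming the behavior described in this article, might look like the following. split_images is a hypothetical helper and not part of Supervision, and plain objects stand in for NumPy arrays.

```python
def split_images(images):
    """Return (image_paths, images_in_memory) for either input style.

    Hypothetical helper: a dict of path -> array means eager mode,
    a list of paths means lazy mode.
    """
    if isinstance(images, dict):
        # Eager: keep the caller's arrays and derive the path list from the keys.
        return list(images), dict(images)
    # Lazy: keep only the path strings; nothing is read from disk here.
    return list(images), {}


lazy_paths, lazy_cache = split_images(["a.jpg", "b.jpg"])
eager_paths, eager_cache = split_images({"a.jpg": object(), "b.jpg": object()})
print(len(lazy_cache), len(eager_cache))  # prints "0 2"
```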

Image Retrieval Mechanism

The _get_image() method (lines 101-108) handles retrieval differently based on the loading mode:

def _get_image(self, image_path):
    if self._images_in_memory:
        return self._images_in_memory[image_path]   # eager path: dict lookup
    image = cv2.imread(image_path)                 # lazy path: disk read
    return image
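
One consequence of this logic is that lazy mode never caches: every access triggers a fresh read. The sketch below makes that visible with a stand-in class and a fake loader that counts calls. MiniDataset and fake_imread are illustrative names, not Supervision or OpenCV APIs.

```python
class MiniDataset:
    """Stand-in for the retrieval logic; not the Supervision class."""

    def __init__(self, image_paths, images_in_memory=None, loader=None):
        self.image_paths = list(image_paths)
        self._images_in_memory = dict(images_in_memory or {})
        self._loader = loader  # plays the role of cv2.imread

    def _get_image(self, image_path):
        if self._images_in_memory:
            return self._images_in_memory[image_path]  # eager: no I/O
        return self._loader(image_path)  # lazy: one read per access


reads = []


def fake_imread(path):
    """Record each simulated disk read and return placeholder pixels."""
    reads.append(path)
    return f"pixels of {path}"


lazy = MiniDataset(["a.jpg", "b.jpg"], loader=fake_imread)
for _ in range(2):  # two passes over the data
    for p in lazy.image_paths:
        lazy._get_image(p)
print(len(reads))  # 4: lazy mode re-reads on every access

eager = MiniDataset(["a.jpg"], images_in_memory={"a.jpg": "pixels"}, loader=fake_imread)
eager._get_image("a.jpg")
print(len(reads))  # still 4: the eager lookup never calls the loader
```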

Helper Predicates

Supervision provides inspection functions (lines 279-284) to programmatically detect the current mode:

def is_in_memory(dataset):
    return len(dataset._images_in_memory) > 0 or len(dataset.image_paths) == 0

def is_lazy(dataset):
    return len(dataset._images_in_memory) == 0
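
To see how the two predicates behave, the sketch below copies them from the excerpt above and applies them to stand-in objects. They are redefined locally so that the example does not assume they can be imported from Supervision; SimpleNamespace objects carry only the two attributes the predicates read.

```python
from types import SimpleNamespace


def is_in_memory(dataset):
    return len(dataset._images_in_memory) > 0 or len(dataset.image_paths) == 0


def is_lazy(dataset):
    return len(dataset._images_in_memory) == 0


lazy_ds = SimpleNamespace(image_paths=["a.jpg"], _images_in_memory={})
eager_ds = SimpleNamespace(image_paths=["a.jpg"], _images_in_memory={"a.jpg": "pixels"})
empty_ds = SimpleNamespace(image_paths=[], _images_in_memory={})

for name, ds in [("lazy", lazy_ds), ("eager", eager_ds), ("empty", empty_ds)]:
    print(name, is_in_memory(ds), is_lazy(ds))
# lazy False True
# eager True False
# empty True True
```

An empty dataset satisfies both predicates, so the two checks are not strict opposites.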

Practical Code Examples

Lazy Loading Example

Pass a list of strings to keep memory usage minimal for large datasets:

import supervision as sv

image_paths = ["img1.jpg", "img2.jpg", "img3.jpg"]
annotations = {p: sv.Detections.empty() for p in image_paths}

ds_lazy = sv.DetectionDataset(
    classes=["cat", "dog"],
    images=image_paths,          # list[str] triggers lazy loading
    annotations=annotations,
)

# Image memory stays flat; only the current image is loaded during iteration
for path, img, ann in ds_lazy:
    print(path, img.shape)       # img was read from disk for this step only

Eager Loading Example

Pass a dictionary of arrays when the dataset fits comfortably in RAM:

import cv2
import supervision as sv

image_paths = ["img1.jpg", "img2.jpg", "img3.jpg"]
images = {p: cv2.imread(p) for p in image_paths}   # Load all upfront

annotations = {p: sv.Detections.empty() for p in image_paths}

ds_eager = sv.DetectionDataset(
    classes=["cat", "dog"],
    images=images,              # dict[str, np.ndarray] triggers eager loading
    annotations=annotations,
)

# All three images already resident in RAM; retrieval is O(1) dict lookup
for path, img, ann in ds_eager:
    print(path, img.shape)

Performance and Resource Implications

Lazy loading is optimal when training on datasets that exceed available RAM. Because only self.image_paths (a list of strings) resides in memory, you can work with terabyte-scale collections on modest hardware. The trade-off is disk I/O latency during iteration, as cv2.imread() executes for every access.

Eager loading maximizes iteration speed by storing all data in self._images_in_memory, eliminating disk bottlenecks during training loops. This mode suits smaller datasets that fit entirely in memory but becomes prohibitive when working with high-resolution medical imagery or video frames.
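
The trade-off can be reduced to a simple rule of thumb: load eagerly only when the decoded images fit in a share of the available RAM. The sketch below encodes that rule. choose_loading_mode and its headroom parameter are hypothetical, and the 0.5 default is an arbitrary illustration, not a recommendation from Supervision.

```python
def choose_loading_mode(estimated_bytes, available_bytes, headroom=0.5):
    """Return "eager" if the decoded dataset fits within a share of free RAM.

    headroom is the fraction of available memory the images may occupy.
    """
    if not 0 < headroom <= 1:
        raise ValueError("headroom must be in (0, 1]")
    return "eager" if estimated_bytes <= available_bytes * headroom else "lazy"


GB = 10**9
print(choose_loading_mode(2 * GB, 32 * GB))    # eager: small dataset
print(choose_loading_mode(311 * GB, 32 * GB))  # lazy: far larger than RAM
```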

Summary

  • Lazy loading stores only file paths in self.image_paths, so image memory stays flat whether the dataset contains 100 or 100,000 images
  • Eager loading stores all arrays in self._images_in_memory, causing linear memory growth proportional to total pixel count (roughly height × width × 3 × N bytes for uint8 three-channel images)
  • The constructor in src/supervision/dataset/core.py automatically selects the mode based on whether you pass a list[str] or dict[str, np.ndarray]
  • Use is_lazy() and is_in_memory() helpers (lines 279-284) to programmatically detect the current loading strategy before operations like merge
  • Choose lazy loading for large-scale training pipelines and eager loading for small datasets requiring maximum throughput

Frequently Asked Questions

How do I check if my DetectionDataset is using lazy or eager loading?

Supervision provides inspection utilities in src/supervision/dataset/core.py (lines 279-284). The is_lazy() function returns True when len(dataset._images_in_memory) == 0, indicating only file paths are stored. Conversely, is_in_memory() returns True when the dataset contains cached arrays or no image paths, signaling that all data resides in RAM.

Can I convert a lazy dataset to eager loading after instantiation?

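DetectionDataset does not provide a built-in load_all() method, but you can populate dataset._images_in_memory yourself. Load every image first (each lazy access calls _get_image(), which reads the file with cv2.imread()), collect the arrays in a separate dictionary keyed by path, and assign that dictionary to the attribute in one step. Filling the internal dictionary entry by entry while still reading through the dataset is unsafe: as soon as it is non-empty, _get_image() switches to the dictionary lookup and fails for paths that have not been cached yet. Because _images_in_memory is a private attribute, this approach may break between releases, and you must ensure sufficient RAM exists for the complete collection.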
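
A minimal sketch of such a conversion follows, using a stand-in class in place of the real dataset. cache_all and FakeLazyDataset are hypothetical names, and the code relies on the private attributes quoted in this article, which may change between releases. The cache is built as a separate dictionary first, because the quoted _get_image() switches to the dictionary lookup as soon as _images_in_memory is non-empty.

```python
def cache_all(dataset):
    """Load every image once and keep it in the dataset's private cache."""
    # Build the full cache before assigning it, so _get_image keeps
    # taking the lazy branch for the whole loop.
    loaded = {path: dataset._get_image(path) for path in dataset.image_paths}
    dataset._images_in_memory = loaded
    return dataset


class FakeLazyDataset:
    """Stand-in with the same three attributes; not the Supervision class."""

    def __init__(self, image_paths):
        self.image_paths = list(image_paths)
        self._images_in_memory = {}

    def _get_image(self, image_path):
        if self._images_in_memory:
            return self._images_in_memory[image_path]
        return f"pixels of {image_path}"  # placeholder for cv2.imread


ds = cache_all(FakeLazyDataset(["a.jpg", "b.jpg"]))
print(sorted(ds._images_in_memory))  # ['a.jpg', 'b.jpg']
```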

What is the memory overhead of lazy loading for a dataset with 50,000 images?

Lazy loading stores only the file path strings rather than the image arrays. For 50,000 images with average path lengths of 100 characters, the characters alone total about 5 MB, and with per-string object overhead the figure stays under roughly 10 MB regardless of image resolution. In contrast, eager loading the same dataset with 1920×1080×3 images would require about 311 GB of RAM (50,000 × 1920 × 1080 × 3 bytes, assuming uint8 pixels), making lazy loading essential for large-scale computer vision workflows.

Does lazy loading affect training speed compared to eager loading?

Yes, lazy loading adds overhead because cv2.imread() runs on each access, and that call both reads the file and decodes it into an array. Eager loading provides faster access through dictionary lookups in self._images_in_memory but requires sufficient RAM to hold the entire dataset. On NVMe SSDs the read portion of the cost may be small, although decoding time remains; traditional HDDs or random access patterns can introduce significant bottlenecks during epoch iteration.
