# How to Implement Multi-Resolution Detection with InferenceSlicer for Sliding Window Approach

> Implement multi-resolution detection using InferenceSlicer and a sliding window approach. Process large images efficiently by tiling, running detections, and merging results across scales.

- Repository: [Roboflow/supervision](https://github.com/roboflow/supervision)
- Tags: how-to-guide
- Published: 2026-04-06

---

**The `InferenceSlicer` in the `roboflow/supervision` library processes large images by dividing them into overlapping tiles, executing a user-provided detection callback on each tile, and merging results back to the original coordinate system; to achieve multi-resolution detection, run the slicer multiple times with different `slice_wh` values and combine the per-scale `Detections` objects.**

The `InferenceSlicer` class in `roboflow/supervision` provides a sliding-window implementation for computer vision workflows where a model cannot process high-resolution imagery in a single forward pass. This guide demonstrates how to implement **multi-resolution detection**, which runs inference over the same image at several tile scales and merges the results, to capture both fine-grained details and broad contextual information.

## How InferenceSlicer Works (Single-Scale Foundation)

Before implementing multi-resolution strategies, understanding the single-scale pipeline is essential. The slicer orchestrates four distinct phases implemented in [`src/supervision/detection/tools/inference_slicer.py`](https://github.com/roboflow/supervision/blob/main/src/supervision/detection/tools/inference_slicer.py).

### Tile Generation via `_generate_offset()`

The slicer creates sliding windows using the `_generate_offset()` method, which calculates an array of `(x_min, y_min, x_max, y_max)` coordinates from the image resolution, the `slice_wh` tuple (tile width and height), and the `overlap_wh` tuple (horizontal and vertical overlap).

```python
# From inference_slicer.py - conceptual flow
offsets = self._generate_offset(
    resolution_wh=(image_width, image_height),
    slice_wh=self.slice_wh,
    overlap_wh=self.overlap_wh,
)
```

Each offset represents a crop region that the slicer will extract and process independently.
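The tiling arithmetic can be illustrated without the library. The following is a minimal pure-Python sketch, assuming a stride of `slice_wh - overlap_wh` and tiles clipped at the image border. `generate_offsets` is a hypothetical helper written for this guide, not the library's `_generate_offset()`, whose edge handling may differ between versions.

```python
def generate_offsets(image_wh, slice_wh, overlap_wh):
    """Return (x_min, y_min, x_max, y_max) tiles covering the image.

    Hypothetical helper for illustration only.
    """
    image_w, image_h = image_wh
    slice_w, slice_h = slice_wh
    stride_w = slice_w - overlap_wh[0]
    stride_h = slice_h - overlap_wh[1]
    if stride_w <= 0 or stride_h <= 0:
        raise ValueError("overlap_wh must be smaller than slice_wh")

    offsets = []
    for y_min in range(0, image_h, stride_h):
        for x_min in range(0, image_w, stride_w):
            # Clip tiles that would extend past the image border.
            x_max = min(x_min + slice_w, image_w)
            y_max = min(y_min + slice_h, image_h)
            offsets.append((x_min, y_min, x_max, y_max))
    return offsets


tiles = generate_offsets(image_wh=(1000, 600), slice_wh=(640, 640), overlap_wh=(100, 100))
for tile in tiles:
    print(tile)
```

With these inputs the second row of tiles is only 60 pixels tall, which shows how a naive stride can leave thin slivers at the image edge.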

### Parallel Callback Execution and Coordinate Correction

For each generated offset, the slicer submits a task to a thread pool (sized by `thread_workers`) that executes `_run_callback()`. This internal method crops the image tile using `crop_image()` from [`src/supervision/utils/image.py`](https://github.com/roboflow/supervision/blob/main/src/supervision/utils/image.py), invokes the user-provided callback function, and shifts the returned detection coordinates back to the global image space using `move_detections()`.

```python
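# Simplified sketch of the per-tile step, not the verbatim source
def _run_callback(self, offset, image, callback):
    x_min, y_min, x_max, y_max = offset
    # Crop the tile, run the user callback on it, then shift the
    # tile-local boxes back into full-image coordinates.
    tile = crop_image(image, (x_min, y_min, x_max, y_max))
    detections = callback(tile)
    return move_detections(detections, (x_min, y_min))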
```
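The coordinate correction itself is a plain translation. As a minimal sketch, assuming boxes are `(x_min, y_min, x_max, y_max)` tuples in tile-local pixels, the hypothetical `shift_boxes` helper below adds the tile's top-left corner to every box. It stands in for the idea behind `move_detections()` and is not the library function.

```python
def shift_boxes(boxes, offset):
    """Translate tile-local boxes into full-image coordinates.

    Hypothetical helper for illustration only.
    """
    dx, dy = offset
    return [(x1 + dx, y1 + dy, x2 + dx, y2 + dy) for x1, y1, x2, y2 in boxes]


tile_offset = (540, 0)                # top-left corner of the tile in the full image
local_boxes = [(10, 20, 110, 220)]    # box as reported by the model on the tile
global_boxes = shift_boxes(local_boxes, tile_offset)
print(global_boxes)
```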

### Overlap Handling and Result Merging

After all tiles are processed, the slicer merges the individual `Detections` objects using `Detections.merge()` from [`src/supervision/detection/core.py`](https://github.com/roboflow/supervision/blob/main/src/supervision/detection/core.py). If `overlap_filter` is enabled (either `OverlapFilter.NON_MAX_SUPPRESSION` or `OverlapFilter.NON_MAX_MERGE`), the slicer applies the selected algorithm with `iou_threshold` to resolve duplicate detections in overlapping tile regions.
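To make the role of `iou_threshold` concrete, here is a minimal sketch of greedy non-maximum suppression in pure Python. It is class-agnostic and uses hypothetical helpers (`iou`, `greedy_nms`); the library's own implementation is vectorized and may treat classes separately, so read this as an illustration of the idea rather than its code.

```python
def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap a higher-scoring kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two near-identical boxes from neighbouring tiles plus one distinct object.
boxes = [(100, 100, 200, 200), (105, 102, 205, 198), (400, 400, 500, 500)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores, iou_threshold=0.5)
print(kept)
```

The first two boxes overlap heavily, so only the higher-scoring one survives; the third box is untouched because it overlaps nothing.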

## Implementing Multi-Resolution Detection

Multi-resolution detection requires orchestrating multiple `InferenceSlicer` instances, each configured with a different tile size, and aggregating their results. Larger tiles (`slice_wh=(1280, 1280)`) capture broader contextual information, while smaller tiles (`slice_wh=(640, 640)`) preserve fine details.

```python
import cv2
import supervision as sv
from ultralytics import YOLO

model = YOLO("yolo11m.pt")


def tile_callback(tile):
    """Run the model on one tile and convert the result to sv.Detections."""
    results = model(tile)[0]
    return sv.Detections.from_ultralytics(results)


def multi_resolution_detect(image, scales, overlap_wh=100, thread_workers=4):
    """
    Run InferenceSlicer at multiple resolutions and merge results.

    Args:
        image: Input image (numpy array)
        scales: List of tile sizes (e.g., [640, 960, 1280])
        overlap_wh: Overlap between adjacent tiles, in pixels
        thread_workers: Parallel workers per scale
    """
    per_scale_detections = []

    for slice_wh in scales:
        slicer = sv.InferenceSlicer(
            callback=tile_callback,
            slice_wh=(slice_wh, slice_wh),  # square tiles
            overlap_wh=(overlap_wh, overlap_wh),
            overlap_filter=sv.OverlapFilter.NON_MAX_SUPPRESSION,
            iou_threshold=0.5,
            overlap_metric=sv.OverlapMetric.IOU,
            thread_workers=thread_workers,
        )
        per_scale_detections.append(slicer(image))

    # Merge all scale-specific detections
    merged = sv.Detections.merge(per_scale_detections)

    # Optional: final cross-scale NMS to remove duplicates
    return merged.with_nms(threshold=0.5, overlap_metric=sv.OverlapMetric.IOU)


# Usage
image = cv2.imread("large_aerial_image.jpg")  # BGR numpy array
detections = multi_resolution_detect(
    image=image,
    scales=[640, 960, 1280],
    overlap_wh=100,
    thread_workers=8,
)
```

**Key implementation details:**
- **Per-scale NMS**: Each slicer instance applies non-maximum suppression independently before returning results
- **Global merge**: `Detections.merge()` concatenates bounding boxes, confidence scores, and class IDs from all scales into a single object
- **Cross-scale NMS**: The final `with_nms()` call eliminates duplicate detections that appear across different resolutions

## Alternative Approach: Image Pyramids vs. Tile Scaling

Instead of varying tile sizes, you can maintain a fixed tile size while scaling the input image itself. This approach requires rescaling detection coordinates back to the original resolution after inference.

```python
def multi_scale_image_pyramid(image, scale_factors, slice_wh=640):
    per_scale = []

    for factor in scale_factors:
        # Resize the whole image; the tile size stays fixed
        scaled = sv.scale_image(image, scale_factor=factor)

        slicer = sv.InferenceSlicer(
            callback=tile_callback,
            slice_wh=(slice_wh, slice_wh),
            overlap_wh=(100, 100),
            thread_workers=4,
        )
        detections = slicer(scaled)

        # Rescale box coordinates back to the original image size
        detections.xyxy = detections.xyxy / factor
        if detections.mask is not None:
            # Masks would need resizing to the original resolution too;
            # that step is left unimplemented here
            pass

        per_scale.append(detections)

    merged = sv.Detections.merge(per_scale)
    # Pass arguments by keyword: the second positional parameter of
    # with_nms() is not the overlap metric
    return merged.with_nms(threshold=0.5, overlap_metric=sv.OverlapMetric.IOU)
```

This method keeps the tile size fixed, so the number of tiles changes with the size of the scaled image, and each tile covers a different area of the original image at each scale.
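The coordinate rescaling step is a division by the scale factor. A minimal sketch, assuming boxes are plain `(x_min, y_min, x_max, y_max)` tuples measured on the resized image; `rescale_boxes` is a hypothetical helper used only to show the arithmetic.

```python
def rescale_boxes(boxes, factor):
    """Map boxes detected on an image resized by `factor` back to the original."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return [tuple(value / factor for value in box) for box in boxes]


# A box found on a half-size image covers twice the extent in the original.
boxes_on_scaled = [(100, 50, 200, 150)]
boxes_on_original = rescale_boxes(boxes_on_scaled, factor=0.5)
print(boxes_on_original)
```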

## Critical Configuration Parameters

### **slice_wh** (Tuple[int, int])

Defines the tile dimensions `(width, height)`. For multi-resolution workflows, specify progressive sizes (e.g., 640→960→1280) to balance detail capture against computational cost.
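The computational cost of each scale can be estimated before running any model. The sketch below counts tiles per scale, assuming square tiles, a fixed pixel overlap, and a stride of `slice_size - overlap`. `count_tiles` is a hypothetical helper, and the 4000×3000 image size is an arbitrary example.

```python
def count_tiles(image_wh, slice_size, overlap):
    """Estimate how many tiles a square sliding window produces."""
    stride = slice_size - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than slice_size")
    # One tile starts at every multiple of the stride inside the image.
    cols = len(range(0, image_wh[0], stride))
    rows = len(range(0, image_wh[1], stride))
    return cols * rows


image_wh = (4000, 3000)
counts = {size: count_tiles(image_wh, size, overlap=100) for size in (640, 960, 1280)}
for size, n in counts.items():
    print(f"slice_wh={size}: {n} callback invocations")
print("total:", sum(counts.values()))
```

Under these assumptions the smallest scale accounts for more than half of all callback invocations, which is why adding fine scales is the expensive part of a multi-resolution run.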

### **overlap_wh** (Tuple[int, int])

Controls pixel overlap between adjacent tiles. Setting this to approximately 20-25% of `slice_wh` prevents objects from being truncated at tile boundaries. The source code in `_generate_offset()` calculates stride as `slice_wh - overlap_wh`.
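A fixed pixel overlap corresponds to a different fraction of the tile at each scale. The sketch below, using hypothetical helpers `overlap_from_ratio` and `overlap_ratio`, derives the overlap from a target ratio and shows what fraction a fixed 100-pixel overlap represents.

```python
def overlap_from_ratio(slice_wh, ratio=0.2):
    """Derive a pixel overlap from a fraction of the tile size."""
    return tuple(round(side * ratio) for side in slice_wh)


def overlap_ratio(slice_size, overlap):
    """Fraction of a tile side shared with its neighbour."""
    return overlap / slice_size


for size in (640, 960, 1280):
    scaled_overlap = overlap_from_ratio((size, size), ratio=0.2)
    fixed_share = overlap_ratio(size, 100)
    print(f"slice_wh={size}: 20% overlap = {scaled_overlap}, fixed 100 px = {fixed_share:.1%}")
```

If the 20-25% guideline matters for your objects, computing `overlap_wh` per scale keeps the shared fraction constant instead of shrinking as tiles grow.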

### **thread_workers** (int)

Specifies the number of parallel workers for tile processing. The value sizes the thread pool that runs the callback, so it should be a positive integer. Use `1` to process tiles one at a time when debugging or when the callback is constrained by GPU memory.
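The threading pattern can be sketched with the standard library alone. This example assumes the worker count is handed to `concurrent.futures.ThreadPoolExecutor`, which only accepts positive values; `run_on_tiles` and `fake_callback` are hypothetical stand-ins for the slicer and a detection model.

```python
from concurrent.futures import ThreadPoolExecutor


def run_on_tiles(tiles, callback, thread_workers=1):
    """Apply `callback` to every tile using a pool of worker threads."""
    if thread_workers < 1:
        raise ValueError("thread_workers must be a positive integer")
    with ThreadPoolExecutor(max_workers=thread_workers) as executor:
        # map() yields results in the order of `tiles`, regardless of
        # which worker finishes first.
        return list(executor.map(callback, tiles))


def fake_callback(tile):
    """Stand-in for a model call: returns the tile area in pixels."""
    x_min, y_min, x_max, y_max = tile
    return (x_max - x_min) * (y_max - y_min)


tiles = [(0, 0, 640, 600), (540, 0, 1000, 600)]
areas = run_on_tiles(tiles, fake_callback, thread_workers=2)
print(areas)
```

Threads help most when the callback releases the GIL or waits on I/O, such as a remote inference API; a single local GPU model may see little benefit from more workers.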

### **overlap_filter** (OverlapFilter)

Determines how duplicates in overlapping regions are resolved:
- **NON_MAX_SUPPRESSION**: Standard NMS, which keeps the highest-confidence box among overlapping detections
- **NON_MAX_MERGE**: Non-max merging, which combines overlapping boxes into a single detection instead of discarding them
- **NONE**: Retains all detections (useful when your callback already handles overlaps)

## Summary

- The `InferenceSlicer` class in [`src/supervision/detection/tools/inference_slicer.py`](https://github.com/roboflow/supervision/blob/main/src/supervision/detection/tools/inference_slicer.py) implements sliding-window detection via `_generate_offset()` for tile creation and `move_detections()` for coordinate correction
- Implement multi-resolution detection by instantiating multiple slicers with different `slice_wh` values and merging results with `Detections.merge()`
- Configure `overlap_wh` to prevent boundary artifacts and `thread_workers` to optimize throughput
- Apply a final `with_nms()` call after merging cross-scale results to eliminate duplicate detections

## Frequently Asked Questions

### What is the difference between `slice_wh` and `overlap_wh`?

The `slice_wh` parameter defines the dimensions of each tile extracted from the image, while `overlap_wh` specifies how many pixels adjacent tiles should share. According to the source code in [`inference_slicer.py`](https://github.com/roboflow/supervision/blob/main/src/supervision/detection/tools/inference_slicer.py), the stride between tile starts is calculated as `slice_wh - overlap_wh`. Larger overlaps reduce the risk of missing objects at tile boundaries but increase computational overhead.

### How does InferenceSlicer handle detections at tile boundaries?

The slicer uses `move_detections()` (defined in [`inference_slicer.py`](https://github.com/roboflow/supervision/blob/main/src/supervision/detection/tools/inference_slicer.py)) to translate tile-local coordinates back to the global image space by adding the tile's `(x_min, y_min)` offset. When `overlap_filter` is enabled, the slicer applies NMS or non-max merging (NMM) to resolve detections that appear in multiple overlapping tiles, using the `iou_threshold` parameter to decide which detections overlap enough to be treated as duplicates.

### Can I use InferenceSlicer with instance segmentation models?

Yes. The `callback` function can return `Detections` objects containing masks, and the slicer will handle coordinate transformation for both bounding boxes and segmentation masks. The `move_masks()` function in [`src/supervision/detection/utils/masks.py`](https://github.com/roboflow/supervision/blob/main/src/supervision/detection/utils/masks.py) handles spatial translation of mask arrays, ensuring that pixel-accurate masks align correctly when merged back to the full-resolution image.

### What overlap filter should I use for multi-resolution detection?

For multi-resolution pipelines, use `OverlapFilter.NON_MAX_SUPPRESSION` with an `iou_threshold` of 0.5 during the per-scale slicing phase to clean up tile boundary duplicates. After merging all scales with `Detections.merge()`, apply a second NMS pass using `with_nms()` to handle duplicates that appear across different resolutions. This two-stage filtering prevents the same object detected at different scales from appearing multiple times in final results.