# How to Merge Pandas DataFrames Where a Value Falls Between Two Values

> Merge pandas DataFrames by value range. Learn to use `pd.IntervalIndex` for efficient conditional merges where a value falls between two boundaries.

- Repository: [pandas-dev/pandas](https://github.com/pandas-dev/pandas)
- Tags: how-to-guide
- Published: 2026-02-20

---

**Use `pd.IntervalIndex` to construct half-open intervals from the boundary columns of the right DataFrame, map each value in the left DataFrame to its containing interval via `get_indexer`, and finish with a standard `pd.merge` on the resulting integer positions.**

The pandas-dev/pandas repository provides powerful indexing structures that enable range-based joins without dedicated SQL-style `BETWEEN` operators. By leveraging the `IntervalIndex` class implemented in [`pandas/core/indexes/interval.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/indexes/interval.py) and the generic merge engine in [`pandas/core/reshape/merge.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/reshape/merge.py), you can efficiently match rows where a scalar value falls inside an interval defined by two columns.

## Why Standard Merge Falls Short

Standard equality-based merging via `pd.merge` requires exact key matches. When you need to join a DataFrame of events (with a single timestamp or value) to a DataFrame of sessions (defined by start and end boundaries), equality joins fail because the event value lies somewhere *between* the boundary columns rather than equaling them.
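To make the gap concrete, here is a minimal sketch with made-up `events` and `sessions` frames (both names and all values are illustrative). It shows an equality merge finding nothing, and a brute-force cross join plus filter that does express the "between" condition but materializes every row pair first.

```python
import pandas as pd

# Illustrative data: two events and two sessions
events = pd.DataFrame({"value": [2, 15], "event": ["e1", "e2"]})
sessions = pd.DataFrame({"start": [0, 10], "end": [10, 20], "session": ["A", "B"]})

# Equality merge: no event value equals a session start, so nothing matches
exact = events.merge(sessions, left_on="value", right_on="start")
print(len(exact))

# Brute-force baseline: pair every event with every session, then filter.
# This is correct but builds len(events) * len(sessions) rows first.
cross = events.merge(sessions, how="cross")
between = cross[(cross["value"] > cross["start"]) & (cross["value"] <= cross["end"])]
print(between[["value", "event", "session"]])
```

The cross join is a useful reference for checking results on small inputs, but its memory cost grows with the product of the two row counts.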

## Implementing an Interval Join with IntervalIndex

The most robust approach uses `IntervalIndex` to represent the ranges and vectorized indexing to locate matches.

### Step 1: Construct the IntervalIndex

First, convert the start and end columns of your right-hand DataFrame into an `IntervalIndex`. The `from_arrays` constructor in [`pandas/core/indexes/interval.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/indexes/interval.py) builds one interval per row from the two boundary arrays. Its `closed` argument (default `'right'`) decides which bound is inclusive, so `'left'` or `'right'` yields half-open intervals suitable for fast containment checks.

```python
import pandas as pd

# Right DataFrame: each row defines a range (start, end]
ranges = pd.DataFrame({
    "start": [0, 10, 20, 30],
    "end": [10, 20, 30, 40],
    "session": ["A", "B", "C", "D"]
})

# Build IntervalIndex (closed='right' means start < x <= end)
ranges["interval"] = pd.IntervalIndex.from_arrays(
    ranges["start"], ranges["end"], closed="right"
)
```
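Before relying on the index for lookups, it can help to inspect a few of its properties. The sketch below rebuilds the same four intervals as a standalone `intervals` variable (a name chosen here for illustration) and checks the closed side, whether any intervals overlap, and which interval contains a boundary value.

```python
import pandas as pd

# Same boundaries as the ranges DataFrame, built as a standalone index
intervals = pd.IntervalIndex.from_arrays(
    [0, 10, 20, 30], [10, 20, 30, 40], closed="right"
)

print(intervals.closed)                    # which side is inclusive
print(intervals.is_overlapping)            # True if any two intervals share points
print(intervals.is_monotonic_increasing)   # True if the intervals are sorted
print(intervals.contains(10))              # elementwise: does each interval contain 10?
```

With `closed="right"`, the value 10 belongs to the first interval only, because the second interval excludes its left bound.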

### Step 2: Map Left Values to Interval Positions

Use `IntervalIndex.get_indexer` to find, for each value in the left DataFrame, the integer position of the interval that contains it. The method returns `-1` for values outside all intervals. It is a method of the index itself, so call it on an `IntervalIndex` object rather than on a DataFrame column that holds intervals.

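```python
# Left DataFrame: observations with a single value
obs = pd.DataFrame({
    "value": [2, 15, 25, 35, 45],
    "event": ["e1", "e2", "e3", "e4", "e5"]
})

# get_indexer is an IntervalIndex method (a Series has no get_indexer),
# so rebuild the index from the interval column first
interval_index = pd.IntervalIndex(ranges["interval"])

# Find which interval each value belongs to (-1 means no interval matched)
obs["interval_idx"] = interval_index.get_indexer(obs["value"])

# Filter out values that don't fall in any interval
matched = obs[obs["interval_idx"] != -1].copy()
```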

### Step 3: Execute the Merge

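Finally, perform a standard merge on the integer positions. The generic `merge` implementation in [`pandas/core/reshape/merge.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/reshape/merge.py) handles the join logic once the keys are aligned. Joining against the right-hand index works here because `ranges` still has its default `RangeIndex`, so the positions returned by `get_indexer` coincide with its index labels.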

```python
result = matched.merge(
    ranges,
    left_on="interval_idx",
    right_index=True,
    how="left",
    suffixes=("", "_range")
)

print(result[["value", "event", "session"]])

```

**Output:**

```
   value event session
0      2    e1       A
1     15    e2       B
2     25    e3       C
3     35    e4       D
```
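The three steps can be folded into one reusable function. The sketch below uses a hypothetical helper named `merge_between` (not a pandas API) and assumes non-overlapping ranges and no column-name collisions between the two frames. It swaps the final `merge` for a positional `iloc` lookup plus `concat`, which does not depend on the right-hand frame's index labels.

```python
import pandas as pd

def merge_between(left, right, value_col, start_col, end_col, closed="right"):
    """Hypothetical helper: attach to each row of `left` the row of `right`
    whose interval contains left[value_col]. Unmatched rows are dropped."""
    intervals = pd.IntervalIndex.from_arrays(
        right[start_col], right[end_col], closed=closed
    )
    positions = intervals.get_indexer(left[value_col])  # -1 where nothing matches
    keep = positions != -1
    left_part = left.loc[keep].reset_index(drop=True)
    right_part = right.iloc[positions[keep]].reset_index(drop=True)
    return pd.concat([left_part, right_part], axis=1)

ranges = pd.DataFrame({
    "start": [0, 10, 20, 30],
    "end": [10, 20, 30, 40],
    "session": ["A", "B", "C", "D"],
})
obs = pd.DataFrame({
    "value": [2, 15, 25, 35, 45],
    "event": ["e1", "e2", "e3", "e4", "e5"],
})

result = merge_between(obs, ranges, "value", "start", "end")
print(result[["value", "event", "session"]])
```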

## Alternative: Binning with pd.cut

If you prefer a categorical approach, `pd.cut` bins values according to the same intervals. When `bins` is an `IntervalIndex`, `pd.cut` ignores the `labels` argument and returns a categorical of intervals, so the session labels have to be attached in a separate step. The bins must also be non-overlapping. Binning relies on the same interval machinery, including [`pandas/core/arrays/interval.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/arrays/interval.py).

```python
# Create the interval index for binning
bins = pd.IntervalIndex.from_arrays(ranges["start"], ranges["end"], closed="right")

# Assign each observation to a bin (NaN if outside every bin).
# pd.cut ignores `labels` when bins is an IntervalIndex, so rename the
# interval categories to the session labels afterwards.
binned = pd.cut(obs["value"], bins=bins)
obs["session"] = binned.cat.rename_categories(ranges["session"].tolist())

# Drop unmapped values and merge on the label
result = obs.dropna(subset=["session"]).merge(ranges, on="session")
```

This approach is concise but creates a categorical column rather than integer indices, which may be preferable for readability in downstream analysis.

## Performance Considerations

The `IntervalIndex.get_indexer` method is vectorized: scalar lookups are resolved in compiled code through an interval-tree engine, so mapping m values against n intervals costs roughly **O(m log n)** once the tree is built, and the merge that follows is hash-based. For extremely large datasets (millions of intervals), keep the right-hand DataFrame's intervals sorted and non-overlapping; a monotonic index may let pandas take faster search paths inside [`pandas/core/indexes/interval.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/indexes/interval.py).
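When the intervals are known to be sorted and non-overlapping, the same mapping can also be written as a binary search with `np.searchsorted`. The sketch below is an alternative technique for comparison, not a description of what `get_indexer` does internally; the synthetic data and variable names are assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Synthetic sorted, non-overlapping intervals (start, end] with gaps between them
rng = np.random.default_rng(0)
starts = np.arange(0, 1000, 10)
ends = starts + 7
values = rng.integers(-5, 1010, size=1000)

# Index of the last start strictly below each value (-1 if there is none)
pos = np.searchsorted(starts, values, side="left") - 1
safe = np.clip(pos, 0, None)                    # avoid indexing with -1
inside = (pos >= 0) & (values <= ends[safe])    # check the upper bound
positions = np.where(inside, pos, -1)

# Cross-check against IntervalIndex.get_indexer
intervals = pd.IntervalIndex.from_arrays(starts, ends, closed="right")
expected = intervals.get_indexer(values)
print(np.array_equal(positions, expected))
```

The binary-search version avoids building an index object, at the price of silently giving wrong answers if the sorted and non-overlapping assumptions do not hold.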

## Summary

- **`IntervalIndex`**, implemented in [`pandas/core/indexes/interval.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/indexes/interval.py), provides the foundation for range-based lookups.
- **`get_indexer`** translates scalar values into interval positions without explicit loops.
- **Standard `pd.merge`** in [`pandas/core/reshape/merge.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/reshape/merge.py) completes the join once keys are aligned.
- **`pd.cut`** offers a high-level alternative for non-overlapping bins, built on the same interval machinery.

## Frequently Asked Questions

### How do I handle overlapping intervals in the right DataFrame?

When intervals overlap, `IntervalIndex.get_indexer` cannot return a single position per value and raises `InvalidIndexError`, pointing you to `IntervalIndex.get_indexer_non_unique` instead. That method returns every matching position, together with the positions of the values that matched nothing, so a value that lies inside several intervals produces several rows in the joined result.
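One way to obtain all matching pairs for overlapping ranges is a boolean mask built with NumPy broadcasting. This is a swapped-in technique rather than the `get_indexer_non_unique` route, chosen because it makes the left/right pairing explicit; the data is illustrative and the mask needs memory proportional to the product of the two row counts.

```python
import numpy as np
import pandas as pd

# Illustrative overlapping ranges: (0, 10] and (5, 15] share the span (5, 10]
ranges = pd.DataFrame({
    "start": [0, 5, 20],
    "end": [10, 15, 30],
    "session": ["X", "Y", "Z"],
})
obs = pd.DataFrame({"value": [7, 12, 50], "event": ["e1", "e2", "e3"]})

intervals = pd.IntervalIndex.from_arrays(ranges["start"], ranges["end"], closed="right")
print(intervals.is_overlapping)

# Row i, column j is True when obs value i lies in range j (start < x <= end)
values = obs["value"].to_numpy()[:, None]
mask = (values > ranges["start"].to_numpy()) & (values <= ranges["end"].to_numpy())
left_pos, right_pos = np.nonzero(mask)

# One output row per (observation, range) match
pairs = pd.concat(
    [
        obs.iloc[left_pos].reset_index(drop=True),
        ranges.iloc[right_pos].reset_index(drop=True),
    ],
    axis=1,
)
print(pairs[["value", "event", "session"]])
```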

### Can I use merge_asof instead of IntervalIndex?

`pd.merge_asof` matches on the nearest key rather than checking containment within a range. It works well for temporal "as-of" joins, but on its own it does not test the second bound, so it cannot directly tell whether a value falls between two arbitrary bounds. Prefer `IntervalIndex` for true between-style joins.
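That said, for non-overlapping ranges `merge_asof` can emulate a between-join if it is followed by an explicit check of the upper bound. The sketch below assumes the `closed='right'` convention (start < x <= end) and reuses the example data from this guide; it is a workaround, not a replacement for the `IntervalIndex` approach.

```python
import pandas as pd

ranges = pd.DataFrame({
    "start": [0, 10, 20, 30],
    "end": [10, 20, 30, 40],
    "session": ["A", "B", "C", "D"],
})
obs = pd.DataFrame({
    "value": [2, 15, 25, 35, 45],
    "event": ["e1", "e2", "e3", "e4", "e5"],
})

# Both frames must be sorted on their keys for merge_asof
candidates = pd.merge_asof(
    obs.sort_values("value"),
    ranges.sort_values("start"),
    left_on="value",
    right_on="start",
    direction="backward",        # nearest start at or below the value
    allow_exact_matches=False,   # require start < value (left bound excluded)
)

# merge_asof only found the nearest start; enforce the upper bound separately
result = candidates[candidates["value"] <= candidates["end"]]
print(result[["value", "event", "session"]])
```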

### What is the difference between closed='left' and closed='right'?

The `closed` parameter in `pd.IntervalIndex.from_arrays` determines whether interval boundaries are inclusive. `closed='right'` includes the right bound but excludes the left (start < x <= end), while `closed='left'` includes the left bound but excludes the right (start <= x < end). Choose the setting that matches your business logic for boundary conditions.
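The difference only shows up at the boundaries. As a small illustration (the variable names are chosen for this example), the same two ranges are built with each setting and probed with the values 0, 10, and 20:

```python
import pandas as pd

right_closed = pd.IntervalIndex.from_arrays([0, 10], [10, 20], closed="right")
left_closed = pd.IntervalIndex.from_arrays([0, 10], [10, 20], closed="left")

probe = [0, 10, 20]

# closed='right': (0, 10], (10, 20]  ->  0 is outside, 10 is in the first, 20 in the second
print(right_closed.get_indexer(probe))

# closed='left': [0, 10), [10, 20)  ->  0 is in the first, 10 in the second, 20 is outside
print(left_closed.get_indexer(probe))
```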

### Does this approach work with datetime intervals?

Yes. `IntervalIndex` supports numeric, `datetime64`, and `timedelta64` bounds (not arbitrary comparable types such as strings). Pass datetime arrays to `from_arrays` and make sure the left-hand values are datetimes as well, for example by converting them with `pd.to_datetime`. The underlying mechanics in [`pandas/core/arrays/interval.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/arrays/interval.py) handle the comparison logic generically.
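A minimal sketch with made-up daily windows, assuming timezone-naive timestamps on both sides and `closed="left"` so each window includes its start and excludes its end:

```python
import pandas as pd

# Illustrative daily windows: [Jan 1, Jan 2) and [Jan 2, Jan 3)
starts = pd.to_datetime(["2024-01-01 00:00", "2024-01-02 00:00"])
ends = pd.to_datetime(["2024-01-02 00:00", "2024-01-03 00:00"])
windows = pd.IntervalIndex.from_arrays(starts, ends, closed="left")

# Timestamps to look up: mid-day, exactly on a boundary, and outside all windows
stamps = pd.to_datetime(["2024-01-01 12:00", "2024-01-02 00:00", "2024-01-05 00:00"])

positions = windows.get_indexer(stamps)
print(positions)
```

The boundary timestamp lands in the second window because the first window excludes its right bound.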