# How to Use pandas explode to Transform List Columns into Rows

> Easily transform pandas list columns into rows with DataFrame explode. Learn how to vectorize expansion and manage your index efficiently.

- Repository: [pandas-dev/pandas](https://github.com/pandas-dev/pandas)
- Tags: how-to-guide
- Published: 2026-02-16

---

**Use `DataFrame.explode()` to expand list-like elements into separate rows. It delegates to `Series.explode()` for the expansion and either keeps the original index labels or, with `ignore_index=True`, assigns a fresh integer index.**

The `pandas explode` operation is essential for normalizing semi-structured data where columns contain lists or arrays. In the pandas-dev/pandas source code, the method turns each element of a list-like column into its own row without a loop written in Python: object columns are expanded by a compiled routine, and Arrow-backed list columns are flattened with PyArrow compute functions.

## How pandas explode Works Internally

### Entry Point in DataFrame.explode

In [`pandas/core/frame.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/frame.py) (around line 13177 at the time of writing; line numbers drift between releases), `DataFrame.explode` serves as the high-level entry point. It validates the `column` argument (a single label or a non-empty list of unique labels) and requires the frame's own column labels to be unique. It does not inspect dtypes up front. It then calls `Series.explode` on each specified column and joins the result back onto the remaining columns, so their values are repeated once per generated row.
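The "explode one column, then join the rest back" idea can be reproduced by hand. The following is a minimal sketch that assumes a unique index; it approximates what the method does and is not a copy of the pandas implementation.

```python
import pandas as pd

df = pd.DataFrame({"id": [1, 2, 3], "tags": [["a", "b"], ["c"], []]})

# Explode the target column on its own, then join the untouched
# columns back on the (now repeated) index labels.
manual = df.drop(columns="tags").join(df["tags"].explode())

# Restore the original column order before comparing.
manual = manual[df.columns]

builtin = df.explode("tags")
pd.testing.assert_frame_equal(manual, builtin)
```

In practice you would call `df.explode("tags")` directly; the manual version is only useful for seeing where the repeated index labels come from.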

### Core Logic in Series.explode

The heavy lifting occurs in [`pandas/core/series.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/series.py) (around line 4531 at the time of writing) within `Series.explode`. For object-dtype data it hands the values to a compiled helper that returns the flattened elements together with a per-row element count, then repeats each original index label by that count. When `ignore_index=True` is passed, the result instead receives a new integer index running from 0 to N-1.
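A small sketch of the two index behaviors on a standalone Series. The labels `"x"` and `"y"` and the name `"n"` are arbitrary choices for illustration.

```python
import pandas as pd

s = pd.Series([[10, 20], [30]], index=["x", "y"], name="n")

kept = s.explode()                    # labels repeat per element
fresh = s.explode(ignore_index=True)  # labels are replaced by 0..N-1

print(kept.index.tolist())   # ['x', 'x', 'y']
print(fresh.index.tolist())  # [0, 1, 2]
print(kept.dtype)            # object
```

The exploded values keep `object` dtype here even though every element is an integer, so a follow-up `astype("int64")` is often worthwhile.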

### Arrow Optimization for Large Datasets

For Arrow-backed list columns (a `pd.ArrowDtype` wrapping a PyArrow list type), `Series.explode` hands the work to the Arrow extension array, which flattens the lists with PyArrow compute functions instead of converting them to Python objects. Related list-flattening logic for the `.list` accessor lives in [`pandas/core/arrays/arrow/accessors.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/arrays/arrow/accessors.py) (around line 463 at the time of writing). Because the data stays in Arrow buffers, this path can be noticeably faster on large datasets, although the gain depends on the data.

## Practical Examples of pandas explode

### Basic Explosion of List-like Columns

Use `explode` on a column containing Python lists to create separate rows for each element:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "tags": [["a", "b"], ["c"], []],
    }
)

exploded = df.explode("tags")
print(exploded)
```

Output:

```
   id tags
0   1    a
0   1    b
1   2    c
2   3  NaN
```

Notice that the empty list produces a missing value (`NaN`), and the original index label `0` is repeated for both `"a"` and `"b"`.
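Because the source row's label is repeated, it can be used to relate exploded rows back to their origin. A brief sketch using the same sample data:

```python
import pandas as pd

df = pd.DataFrame({"id": [1, 2, 3], "tags": [["a", "b"], ["c"], []]})
exploded = df.explode("tags")

# Label-based selection returns every row that came from original row 0.
print(exploded.loc[0, "tags"].tolist())  # ['a', 'b']

# Count how many output rows each original row produced.
rows_per_source = exploded.groupby(level=0).size()
print(rows_per_source.tolist())  # [2, 1, 1]
```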

### Resetting Index with ignore_index

When you need a clean integer index after explosion, pass `ignore_index=True`:

```python
exploded_reset = df.explode("tags", ignore_index=True)
print(exploded_reset)
```

Output:

```
   id tags
0   1    a
1   1    b
2   2    c
3   3  NaN
```

This generates a new integer index from 0 to N-1, eliminating duplicate index values.

### Exploding Multiple Columns Simultaneously

Since pandas 1.3, you can explode multiple columns at once by passing a list of column names. Elements are paired by position within each row, so the lists in a given row must all have the same length:

```python
df_multi = pd.DataFrame(
    {
        "id": [1, 2],
        "colors": [["red", "blue"], ["green"]],
        "shapes": [["circle", "square"], ["triangle"]],
    }
)

exploded_multi = df_multi.explode(["colors", "shapes"])
print(exploded_multi)
```

Output:

```
   id colors    shapes
0   1    red    circle
0   1   blue    square
1   2  green  triangle
```

If the lists in any row differ in length, `explode` raises a `ValueError` rather than padding the shorter list, so lengths must be equalized before exploding.
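How mismatched lengths are handled is easy to check on your installed pandas version. The sketch below builds a single row whose lists differ in length and captures whatever error the call raises; `df_bad` is a made-up frame for illustration.

```python
import pandas as pd

df_bad = pd.DataFrame(
    {
        "id": [1],
        "colors": [["red", "blue"]],
        "shapes": [["circle"]],  # one element vs. two in "colors"
    }
)

message = None
try:
    df_bad.explode(["colors", "shapes"])
except ValueError as exc:
    # pandas refuses to pair elements when the counts differ.
    message = str(exc)

print(message)
```

On the pandas releases I am aware of, the message reports that the columns must have matching element counts.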

### High-Performance Explosion with Arrow Extension Arrays

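For large datasets, storing the lists in an Arrow-backed list column lets `explode` work on Arrow buffers instead of Python objects. This requires `pyarrow` and a pandas release that supports exploding Arrow list dtypes:

```python
import pandas as pd
import pyarrow as pa

# Store the lists in an Arrow list<int64> array instead of Python objects.
list_dtype = pd.ArrowDtype(pa.list_(pa.int64()))
df_arrow = pd.DataFrame({"values": pd.Series([[1, 2], [3], None], dtype=list_dtype)})

exploded_arrow = df_arrow.explode("values")
print(exploded_arrow)
```

Output:

```
   values
0       1
0       2
1       3
2    <NA>
```

Here the missing entry prints as `<NA>` because the result keeps an Arrow-backed integer dtype rather than falling back to `object`. The lists are flattened by PyArrow compute functions; the related `.list` accessor code lives in [`pandas/core/arrays/arrow/accessors.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/arrays/arrow/accessors.py).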

## Performance Considerations and Internal Mechanics

The `pandas explode` implementation keeps the per-element work out of interpreted Python code in several ways:

- **Compiled Expansion**: For standard NumPy-backed object arrays, a compiled helper walks the elements, flattens each list-like value, and records how many rows each original entry produces. Each element is still a Python object, so this is faster than a hand-written loop but not a pure buffer operation.
- **Index Handling**: The reconstruction phase in [`pandas/core/frame.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/frame.py) joins the exploded column back onto the remaining columns, repeating their values to match the new row count.
- **Arrow Vectorization**: For Arrow-backed list columns, the lists are flattened with PyArrow compute functions (related accessor code lives in [`pandas/core/arrays/arrow/accessors.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/arrays/arrow/accessors.py), around line 463 at the time of writing), which can give substantial gains on large datasets.

These mechanisms handle edge cases such as empty lists (producing `NaN` in object columns), scalar values (passed through unchanged in their own row), strings (treated as scalars, not split into characters), and mixed-type elements without requiring manual data cleaning.
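The edge cases mentioned above can be observed on a single mixed Series. This is an illustrative sketch; the sample values are arbitrary.

```python
import numpy as np
import pandas as pd

# list, empty list, scalar, missing value, string, tuple
s = pd.Series([[1, 2], [], 3, np.nan, "ab", (4, 5)])
out = s.explode()

# Lists and tuples expand; the empty list, the scalar, the missing
# value, and the string each occupy exactly one row.
print(out.index.tolist())  # [0, 0, 1, 2, 3, 4, 5, 5]
print(out.tolist())
```

Note that the string `"ab"` stays intact: strings are not considered list-like, so they are never split into characters.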

## Summary

- **`DataFrame.explode`** in [`pandas/core/frame.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/frame.py) is the primary interface for transforming list-like column elements into separate rows.
- The operation delegates to **`Series.explode`** in [`pandas/core/series.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/series.py) for the actual expansion logic, preserving indices by default or generating new ones with `ignore_index=True`.
- **Multiple columns** can be exploded simultaneously since pandas 1.3 by passing a list of column names, provided the lists in each row have matching lengths.
- **Arrow-backed list columns** are flattened with PyArrow compute functions rather than Python objects (related accessor code lives in [`pandas/core/arrays/arrow/accessors.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/arrays/arrow/accessors.py)), which helps on large datasets.
- The implementation handles empty lists, missing values, and scalar elements automatically, without a loop written in Python.

## Frequently Asked Questions

### What is the difference between pandas explode and manual iteration?

**`pandas explode`** performs the expansion in compiled code (or in PyArrow for Arrow-backed list columns), while manual iteration using `apply` or list comprehensions runs a Python-level loop and typically rebuilds the DataFrame from individual records. The implementation in [`pandas/core/series.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/series.py) also takes care of repeating index labels for you, which usually makes it both faster and less error-prone on large datasets.
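For comparison, here is a sketch of the manual approach next to the built-in call. The record-building comprehension is one of several ways to write the loop and is shown only to illustrate the difference.

```python
import pandas as pd

df = pd.DataFrame({"id": [1, 2], "tags": [["a", "b"], ["c"]]})

# Manual approach: build one output record per list element in Python.
records = [
    {"id": row.id, "tags": tag}
    for row in df.itertuples(index=False)
    for tag in row.tags
]
manual = pd.DataFrame(records)

builtin = df.explode("tags", ignore_index=True)

# Compare values only; inferred dtypes can differ between the two paths.
assert manual["id"].tolist() == builtin["id"].tolist()
assert manual["tags"].tolist() == builtin["tags"].tolist()
```

One behavioral difference to keep in mind: this manual loop would silently drop rows whose list is empty, whereas `explode` keeps them with a missing value.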

### How does pandas explode handle empty lists or NaN values?

The implementation in [`pandas/core/frame.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/frame.py) and [`pandas/core/series.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/series.py) turns an empty list into a single missing value (`NaN` for object columns), preserving the row rather than dropping it. Scalar values, including existing `NaN` or `None` entries, are not list-like and pass through unchanged in their own row. As a result no rows are lost during the transformation, and the other columns stay aligned with the exploded one.
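If the placeholder rows are not wanted, they can be removed after exploding. A minimal sketch, using a hypothetical `scores` column, that also restores a numeric dtype:

```python
import pandas as pd

df = pd.DataFrame({"id": [1, 2, 3], "scores": [[90, 85], [70], []]})

exploded = df.explode("scores")
print(exploded["scores"].dtype)  # object

cleaned = (
    exploded.dropna(subset=["scores"])  # drop rows that came from empty lists
    .astype({"scores": "int64"})        # restore a numeric dtype
    .reset_index(drop=True)
)
print(cleaned["scores"].tolist())  # [90, 85, 70]
```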

### Can I explode multiple columns at once in pandas?

Yes, since pandas 1.3, **`DataFrame.explode`** accepts a list of column names, exploding them simultaneously while pairing elements by their position within each row. As implemented in [`pandas/core/frame.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/frame.py), the lists in each row must have matching element counts. When they differ, pandas raises a `ValueError` instead of padding, so shorter lists need to be padded (for example with `None`) before exploding.
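One way to equalize list lengths before a multi-column explode is sketched below. The padding logic is my own illustration, not a pandas feature, and it assumes every cell in the chosen columns is a list.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "id": [1, 2],
        "colors": [["red", "blue"], ["green"]],
        "shapes": [["circle"], ["square", "triangle"]],
    }
)
cols = ["colors", "shapes"]

# Longest list per row across the columns being exploded.
widths = [max(len(v) for v in row) for row in zip(*(df[c] for c in cols))]

padded = df.copy()
for c in cols:
    # Pad each list with None up to that row's width.
    padded[c] = pd.Series(
        [list(v) + [None] * (w - len(v)) for v, w in zip(df[c], widths)],
        index=df.index,
    )

result = padded.explode(cols, ignore_index=True)
print(result)
```

After padding, every row has lists of equal length, so the explode call succeeds and the padded positions show up as missing values.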

### Is pandas explode efficient for large datasets?

**`pandas explode`** is generally efficient for large datasets because the expansion runs in compiled code rather than in a Python loop. For object columns it still has to touch every Python list, so cost grows with the total number of elements. For Arrow-backed list columns, the lists are flattened with PyArrow compute functions on columnar buffers without conversion to Python objects (related accessor code lives in [`pandas/core/arrays/arrow/accessors.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/arrays/arrow/accessors.py), around line 463 at the time of writing), which tends to make that path the better choice when working with millions of rows.
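On large inputs it can help to estimate the size of the result before exploding. The helper below, `rows_after_explode`, is a hypothetical name for a small sketch of that calculation.

```python
import pandas as pd
from pandas.api.types import is_list_like

df = pd.DataFrame({"id": [1, 2, 3], "tags": [["a", "b"], ["c"], []]})

def rows_after_explode(col: pd.Series) -> int:
    """Each list-like adds len(x) rows; empty lists and scalars add one."""
    return int(sum(max(len(x), 1) if is_list_like(x) else 1 for x in col))

expected = rows_after_explode(df["tags"])
assert expected == len(df.explode("tags"))
print(expected)  # 4
```

The helper itself loops in Python, so it is meant for a quick sanity check or for sampling, not as part of a hot path.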