How to Access the Next Row in pandas iterrows: 4 Methods Explained
You cannot access the next row directly from the iterrows generator, but you can materialize the iterator into a list for indexing, combine enumerate with iloc, or use vectorized shift operations to avoid Python loops entirely.
The iterrows() method in the pandas-dev/pandas repository yields each DataFrame row as an (index, Series) pair, making it a common choice for row-wise operations. However, because the implementation in pandas/core/frame.py simply iterates over zip(self.index, self.values) without buffering future rows, the iterator itself cannot look ahead. To access the next row during iteration, you must implement one of the following patterns that work with the underlying architecture.
Why iterrows Cannot Access the Next Row Directly
The iterrows method is defined in pandas/core/frame.py and implements a straightforward generator pattern:
def iterrows(self) -> Iterable[tuple[Hashable, Series]]:
    columns = self.columns
    klass = self._constructor_sliced
    for k, v in zip(self.index, self.values, strict=True):
        s = klass(v, index=columns, name=k).__finalize__(self)
        if self._mgr.is_single_block:
            s._mgr.add_references(self._mgr)
        yield k, s
The method yields k, s immediately and keeps no reference to the following entry in self.values, and a Python generator offers no way to peek at a value without consuming it. Look-ahead therefore has to come from outside the generator: materialize the sequence, fetch the next row by position with iloc, or precompute the next value with shift.
Method 1: Materialize the Iterator with list()
Converting the generator to a list allows standard Python indexing to access the next element. This approach is simple but consumes memory proportional to the DataFrame size.
import pandas as pd

df = pd.DataFrame({
    'city': ['A', 'B', 'C', 'D'],
    'pop': [100, 200, 150, 300]
})

rows = list(df.iterrows())  # materialize all (index, Series) pairs
for i, (idx, row) in enumerate(rows):
    next_row = rows[i + 1][1] if i + 1 < len(rows) else None
    current_city = row['city']
    next_city = next_row['city'] if next_row is not None else None
    print(f"Current: {current_city}, Next: {next_city}")
Why it works: list(df.iterrows()) builds a Python list of the yielded tuples, enabling O(1) indexing to retrieve the next row.
Method 2: Use enumerate() with iloc
Using enumerate to track the integer position allows you to fetch the next row via iloc without materializing the entire iterator. This balances readability with memory efficiency.
# df is the DataFrame defined in Method 1
for i, (idx, row) in enumerate(df.iterrows()):
    try:
        next_row = df.iloc[i + 1]  # returns Series for next row
    except IndexError:
        next_row = None  # handle last row
    current_pop = row['pop']
    next_pop = next_row['pop'] if next_row is not None else 0
    diff = next_pop - current_pop if next_row is not None else 0
    print(f"Row {i}: pop={current_pop}, next_pop={next_pop}, diff={diff}")
Why it works: enumerate supplies the integer position i that aligns with DataFrame.iloc. Because iterrows iterates over self.values in order, the positional index i corresponds exactly to the row yielded at that step.
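The alignment between enumerate and iloc does not depend on the index labels. The sketch below assumes a hypothetical frame with string labels (x, y, z) and uses a bounds check instead of try/except; the labels and the diffs dictionary are illustrative only.

```python
import pandas as pd

# Hypothetical frame with a non-integer index to show that iloc is purely positional
df = pd.DataFrame({"pop": [100, 200, 150]}, index=["x", "y", "z"])

n = len(df)
diffs = {}
for i, (idx, row) in enumerate(df.iterrows()):
    # i is the position of the current row, so i + 1 is the position of the next one
    next_row = df.iloc[i + 1] if i + 1 < n else None
    diffs[idx] = None if next_row is None else int(next_row["pop"] - row["pop"])

print(diffs)
```

The label idx is used only as the dictionary key; the look-ahead itself never touches it.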
Method 3: Vectorized Look-Ahead with shift()
For performance-critical code, avoid Python loops entirely. The shift() method creates a look-ahead column using C-level operations, which is significantly faster than iterating.
# Create a look-ahead column without iterating
df['next_pop'] = df['pop'].shift(-1)

# Now iterate only if you need row-wise logic, with next value already available
for idx, row in df.iterrows():
    current = row['pop']
    nxt = row['next_pop']  # NaN for last row
    if pd.notna(nxt):
        print(f"Current: {current}, Next: {nxt}, Growth: {nxt - current}")
Why it works: Series.shift(-1) moves values up by one position, placing the next row's value in the current row. The shift runs on the underlying array rather than through Python-level iteration. Note that the last row has no successor and is filled with NaN, so an integer column such as pop becomes float64 after shifting.
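When the row-wise logic is plain arithmetic, the loop can be dropped altogether. The following is a minimal sketch that reuses the sample data from this article; the growth variable name is an assumption for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "city": ["A", "B", "C", "D"],
    "pop": [100, 200, 150, 300],
})

# Next row's value aligned to the current row; the last row has no successor
next_pop = df["pop"].shift(-1)

# Element-wise difference, computed without any Python-level loop
growth = next_pop - df["pop"]

print(growth)
```

The last entry of growth is NaN, which plays the same role as the None sentinel in the loop-based methods.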
Method 4: Fast Iteration with itertuples()
If you must iterate but want better performance than iterrows, use itertuples(). It returns lightweight namedtuples instead of Series objects, and you can apply the same list-materialization pattern for look-ahead.
rows = list(df.itertuples(index=False, name='Row'))
for i, cur in enumerate(rows):
    nxt = rows[i + 1] if i + 1 < len(rows) else None
    current_city = cur.city
    # Compare against None explicitly rather than relying on tuple truthiness
    next_city = nxt.city if nxt is not None else None
    print(f"Current: {current_city}, Next: {next_city}")
Why it works: itertuples is implemented in the same pandas/core/frame.py file but avoids the overhead of constructing a Series for every row. Converting to a list allows indexing for the next row while maintaining better performance than iterrows.
Summary
- iterrows yields (index, Series) pairs sequentially without buffering, making native look-ahead impossible.
- Materializing the iterator with list(df.iterrows()) enables simple indexing to get the next row, at the cost of memory.
- enumerate with iloc provides a memory-efficient alternative that uses positional indexing to fetch the next row.
- Vectorized shift is the most performant solution for column-wise look-ahead, avoiding Python loops entirely.
- itertuples offers a faster iteration mechanism than iterrows when you must materialize rows for look-ahead operations.
Frequently Asked Questions
Why can't I use next() on the iterrows generator inside the loop?
The iterrows method returns a generator that yields one row at a time. Once you enter a for loop over that generator, the iterator advances automatically. You can call next() on a manually created iterator, but every call consumes a row: the row you peek at is skipped by the loop unless you carry it forward yourself, and the final call raises StopIteration unless you supply a default. The patterns shown above (materializing to a list, or using iloc) are safer and more readable.
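For comparison, here is a sketch of what the manual-iterator approach can look like. It is one possible arrangement, not a recommended pattern: the row obtained from the iterator is carried forward as current, and the names it, first, and pairs are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "city": ["A", "B", "C", "D"],
    "pop": [100, 200, 150, 300],
})

it = df.iterrows()
first = next(it, None)  # default avoids StopIteration on an empty DataFrame

pairs = []
if first is not None:
    _, current = first
    # Each row pulled from the iterator is the "next" row for the one we hold
    for _, nxt in it:
        pairs.append((current["city"], nxt["city"]))
        current = nxt
    # The last row has no successor
    pairs.append((current["city"], None))

print(pairs)
```

The extra bookkeeping for the first and last rows is the complication the answer above refers to.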
Is using iloc inside an iterrows loop inefficient?
Using df.iloc[i + 1] inside an iterrows loop incurs indexing overhead on every iteration, but it avoids the memory cost of materializing the entire DataFrame as a list. For small to medium DataFrames, the performance difference is negligible. For large DataFrames, prefer vectorized shift operations; if you must iterate, itertuples reduces the per-row overhead.
Does iterrows preserve data types when accessing the next row?
Not necessarily. iterrows converts each row to a Series, and a Series holds a single dtype for all of its values. A row that spans columns of different dtypes is therefore upcast to a common dtype: integers become floats when the row also contains a float column, and everything becomes object when the row mixes numbers and strings. The next row you obtain via list indexing or iloc is a Series too, so it is subject to the same coercion. itertuples preserves the original values better because it returns namedtuples rather than Series objects.
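A small sketch of the coercion in question, assuming a hypothetical frame with one integer column (qty) and one float column (ratio):

```python
import pandas as pd

# Hypothetical frame: one int64 column and one float64 column
df = pd.DataFrame({"qty": [1, 2], "ratio": [0.5, 1.5]})

# iterrows: the row becomes a single-dtype Series, so qty is upcast to float
_, first_row = next(df.iterrows())
print(first_row.dtype, type(first_row["qty"]))

# itertuples: each field keeps the value from its own column
first_tuple = next(df.itertuples(index=False))
print(type(first_tuple.qty), type(first_tuple.ratio))
```

The same applies to any look-ahead row taken from list(df.iterrows()) or df.iloc[i + 1], since both are Series built from a whole row.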
What is the most Pythonic way to compare current and next rows?
For readability and performance, the most Pythonic approach is to use vectorized operations with shift to create a comparison column, then filter or calculate differences without explicit loops. If you must iterate, materializing itertuples to a list and using enumerate for indexing produces cleaner code than managing manual next() calls or exception handling inside the loop.
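As an illustration of the filtering case, the sketch below selects the cities whose successor has a larger population. It reuses the sample data from this article; grows_next is a name chosen for the example.

```python
import pandas as pd

df = pd.DataFrame({
    "city": ["A", "B", "C", "D"],
    "pop": [100, 200, 150, 300],
})

# True where the next row's pop is larger; the last row compares against NaN and is False
grows_next = df["pop"].shift(-1) > df["pop"]

cities = df.loc[grows_next, "city"].tolist()
print(cities)
```

Because comparisons with NaN evaluate to False, the last row drops out of the result without any special-case handling.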