How to Replace NaN with 0 in pandas: The Complete Guide to DataFrame.fillna()
Use DataFrame.fillna(0) to replace all NaN values with 0 efficiently. The method is vectorized: it fills whole blocks of the underlying block manager at once instead of looping over values in Python.
Handling missing data is a common task in pandas. The most efficient way to replace NaN with 0 is the fillna method, which is defined in pandas/core/generic.py in the pandas-dev/pandas repository and delegates the heavy lifting to the block manager.
Why Use DataFrame.fillna to Replace NaN with 0 in pandas?
The fillna method is the canonical approach for constant imputation because it avoids Python-level iteration. According to the source code in pandas/core/generic.py (starting at line 6935), the method validates input parameters, normalizes the axis argument, and dispatches to self._mgr.fillna. The block manager applies the fill with vectorized array operations, which keeps it efficient across millions of rows. Line numbers cited in this article refer to one snapshot of the repository and shift between releases.
How DataFrame.fillna Works Under the Hood
Entry Point and Validation in pandas/core/generic.py
The public API entry point resides in pandas/core/generic.py. The implementation first validates that the value parameter is a scalar, dict, Series, or DataFrame. If a list or tuple is passed, the code raises a TypeError at line 6958:
if isinstance(value, (list, tuple)):
    raise TypeError(...)
Axis handling occurs in the same method, which normalizes the axis parameter using _get_axis_number and defaults to axis 0 (rows) when none is specified.
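The validation described above can be observed from the public API. The following is a minimal sketch, assuming a small two-column DataFrame invented for illustration; it shows only the externally visible behavior, not the internal code path.

```python
import numpy as np
import pandas as pd

# Illustrative data, not taken from the pandas source.
df = pd.DataFrame({"A": [np.nan, 2.0], "B": [1.0, np.nan]})

# A list is rejected as a fill value.
try:
    df.fillna([0, 0])
    list_rejected = False
except TypeError as exc:
    list_rejected = True
    print(f"TypeError: {exc}")

# A scalar, or a dict keyed by column label, is accepted.
by_scalar = df.fillna(0)
by_dict = df.fillna({"A": 0, "B": 0})
print(by_scalar.equals(by_dict))
```

If you have fill values in a list, converting them to a dict or Series keyed by column label is usually the simplest fix.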
The Block Manager Optimization
The actual replacement logic is delegated to the internal block manager at line 7890 in pandas/core/generic.py:
new_data = self._mgr.fillna(value=value, limit=limit, inplace=inplace)
This call routes to pandas/core/internals/managers.py, where the fill is applied to each block using vectorized array operations. The block manager operates directly on contiguous memory blocks, avoiding the overhead of iterating over Python objects. As a result, replacing NaN with 0 scales well to large DataFrames.
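To make "vectorized" concrete, here is a conceptual sketch of a mask-based fill written with plain NumPy. It is not the pandas implementation, only an illustration of the idea under the assumption of an all-float DataFrame; the variable names are made up for this example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [np.nan, 2.0, np.nan], "B": [1.0, np.nan, 3.0]})

values = df.to_numpy()                # one 2-D float64 array
mask = np.isnan(values)               # True wherever a value is missing
filled = np.where(mask, 0.0, values)  # fill every masked cell in one pass

# Rebuild a DataFrame and compare with the built-in method.
manual = pd.DataFrame(filled, index=df.index, columns=df.columns)
same_result = manual.equals(df.fillna(0))
print(same_result)
```

No Python-level loop touches individual cells here, which is the property that makes this style of fill fast on large inputs.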
Practical Examples: Replace NaN with 0 in pandas
Basic Scalar Replacement
The most common use case replaces every NaN with 0 across the entire DataFrame. This creates a new object by default:
import pandas as pd
import numpy as np
df = pd.DataFrame({
"A": [np.nan, 2, np.nan, 4],
"B": [1, np.nan, np.nan, 8],
"C": [np.nan, np.nan, 3, np.nan],
})
# Replace all NaN with 0 (returns new DataFrame)
df_filled = df.fillna(0)
print(df_filled)
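A quick way to confirm the behavior is to count missing values before and after. The sketch below reuses two of the columns from the example; note that the filled columns stay float64, so an explicit cast is needed if integer columns are wanted (the cast is an optional extra step, not part of fillna itself).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [np.nan, 2, np.nan, 4],
    "B": [1, np.nan, np.nan, 8],
})
df_filled = df.fillna(0)

# The original still contains its NaN values; the copy has none.
nan_before = int(df.isna().sum().sum())
nan_after = int(df_filled.isna().sum().sum())
print(nan_before, nan_after)

# Filling with 0 does not change the dtype of float columns.
print(df_filled.dtypes)

# Optional: cast to integers once no NaN remains.
df_int = df_filled.astype("int64")
print(df_int)
```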
In-Place Replacement for Memory Efficiency
For large datasets, you can pass inplace=True to modify the original DataFrame instead of creating a new one. This can reduce memory use, although pandas may still copy data internally. In this mode the method returns None:
# Modify df directly without creating a copy
df.fillna(0, inplace=True)
print(df)
According to the source code in pandas/core/generic.py (line 7932), when inplace=True the method updates the existing object (via self._update_inplace); otherwise it returns a new object via result.__finalize__(self, method="fillna").
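Both styles produce the same data, so the choice is mostly about how the result is bound to a name. A minimal sketch, assuming illustrative data, that deliberately ignores the return value of the in-place call:

```python
import numpy as np
import pandas as pd

original = pd.DataFrame({"A": [np.nan, 2.0], "B": [1.0, np.nan]})

# Style 1: modify an existing object; the return value is not used.
in_place = original.copy()
in_place.fillna(0, inplace=True)

# Style 2: rebind the name to the returned DataFrame.
reassigned = original.copy()
reassigned = reassigned.fillna(0)

same_data = in_place.equals(reassigned)
print(same_data)
```

Avoid writing `df = df.fillna(0, inplace=True)`: the in-place call is not meant to be assigned, and doing so can leave `df` bound to something other than the filled DataFrame.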
Column-Specific Replacement Values
You can replace NaN with 0 selectively by passing a dictionary mapping column names to fill values. The implementation at line 7070 in pandas/core/generic.py aligns the dict keys with the DataFrame columns, so each column is filled with its own value:
df2 = pd.DataFrame({
"A": [np.nan, 2, np.nan, 4],
"B": [1, np.nan, np.nan, 8],
"C": [np.nan, np.nan, 3, np.nan],
})
# Different fill values per column
df2_filled = df2.fillna({"A": -1, "B": 0, "C": 99})
print(df2_filled)
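The dictionary does not have to cover every column. The sketch below, using made-up data and a hypothetical key "Z" that matches no column, illustrates that unlisted columns keep their NaN values and that unmatched keys have no effect.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [np.nan, 2.0],
    "B": [1.0, np.nan],
    "C": [np.nan, 3.0],
})

# Only "A" is listed; "Z" is a hypothetical key with no matching column.
partial = df.fillna({"A": 0, "Z": 5})
print(partial)

remaining = partial.isna().sum()
print(remaining)
```

This makes a dict a convenient way to zero-fill a few numeric columns while leaving others for a different imputation strategy.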
Limiting the Number of Replacements
The limit parameter caps how many NaN values are replaced in each column, counting from the top, which is useful for partial imputation:
df3 = pd.DataFrame({
"A": [np.nan, np.nan, np.nan, 4],
"B": [np.nan, 1, np.nan, 2]
})
# Only replace the first NaN in each column
df3_filled = df3.fillna(0, limit=1)
print(df3_filled)
This parameter is passed directly to self._mgr.fillna at line 7890 in pandas/core/generic.py, where the block manager enforces the limit while filling each block.
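The effect of limit is easiest to see by counting what is left unfilled. A short sketch using the same data as the example above:

```python
import numpy as np
import pandas as pd

df3 = pd.DataFrame({
    "A": [np.nan, np.nan, np.nan, 4],
    "B": [np.nan, 1, np.nan, 2],
})

filled = df3.fillna(0, limit=1)

# Only the first NaN in each column is replaced; later ones remain.
remaining = filled.isna().sum()
print(filled)
print(remaining)
```

Column A started with three NaN values and column B with two, so after filling one per column, two and one remain respectively.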
Summary
- Use DataFrame.fillna(0) as the primary method to replace NaN with 0 in pandas efficiently.
- The implementation resides in pandas/core/generic.py (line 6935) and delegates to the block manager in pandas/core/internals/managers.py.
- Vectorized operations occur at the block level, avoiding slow Python loops.
- In-place operations (inplace=True) can save memory for large datasets.
- Flexible value specification supports scalars, dictionaries, and Series for column-specific filling.
Frequently Asked Questions
What is the fastest way to replace NaN with 0 in pandas?
The fastest approach is df.fillna(0), which runs as a vectorized operation via the block manager in pandas/core/internals/managers.py. This method processes contiguous memory blocks without Python iteration, making it much faster than element-wise approaches such as df.apply with a Python function or an explicit loop.
Does fillna modify the original DataFrame?
By default, fillna returns a new DataFrame and leaves the original unchanged. However, if you pass inplace=True, the method modifies the existing object directly. According to the source code in pandas/core/generic.py (line 7932), the inplace parameter determines whether the method updates the original object or returns a new one.
Can I replace NaN with different values for different columns?
Yes, pass a dictionary to fillna where keys are column names and values are the replacement constants. For example, df.fillna({"A": 0, "B": -1}) replaces NaNs in column A with 0 and in column B with -1. The implementation in pandas/core/generic.py (line 7070) aligns the dictionary keys with the DataFrame columns before filling.
Is fillna better than replace for NaN values?
For scalar replacement of NaN values, fillna is preferred over replace because it is designed specifically for missing-data imputation. While df.replace(np.nan, 0) works, df.fillna(0) routes directly to the block manager's fill logic in pandas/core/internals/managers.py, avoiding the overhead of the general replacement matching algorithm.
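For a plain NaN-to-0 substitution the two methods give the same result, so the choice comes down to intent and overhead rather than output. A minimal sketch with illustrative float data:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [np.nan, 2.0], "B": [1.0, np.nan]})

via_fillna = df.fillna(0)
via_replace = df.replace(np.nan, 0)

# Both produce identical values and dtypes for this input.
same_output = via_fillna.equals(via_replace)
print(same_output)
```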