How to Use pandas replace values in column: A Complete Guide to DataFrame.replace()
Use DataFrame.replace() to substitute values in pandas columns using scalars, lists, dictionaries, or regular expressions, with optional in-place modification.
Value replacement in pandas columns is implemented in the pandas-dev/pandas repository as a high-level method on the NDFrame base class. It provides a flexible interface for value substitution that works on both Series and DataFrame objects, handling everything from simple scalar replacement to regex pattern matching.
How pandas replace values in column Works Under the Hood
The Call Stack from NDFrame to Block Manager
When you call df.replace(), the execution flows through several layers of the pandas architecture:
- Public API Entry: The method is defined in pandas/core/generic.py on the NDFrame class, which serves as the base for both DataFrame and Series.
- Argument Validation: The replace method validates parameters (to_replace, value, inplace, regex) and dispatches to the internal block manager via self._mgr.replace(...).
- Block Manager Logic: In pandas/core/internals/managers.py, the SingleBlockManager.replace method asserts that inputs are not list-like (higher-level logic handles lists) and calls self.apply("replace", ...) to route to the appropriate block implementation.
- Block Implementation: Each block type (numeric, object, datetime) implements _replace routines that perform element-wise substitution on the underlying NumPy arrays, or invoke re.sub for regex operations.
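Because the method lives on the shared base class, the same call should behave the same way on a DataFrame and on a single column. The following is a minimal sketch that checks this through the public API only. The data and column names are made up for illustration, and no internal modules are touched.

```python
import pandas as pd

# Illustrative data; the column names are arbitrary.
df = pd.DataFrame({"A": [0, 1, 2], "B": [0, 5, 0]})

# The same call on the whole frame and on one column (a Series).
frame_result = df.replace(0, 99)
series_result = df["A"].replace(0, 99)

# The Series result matches the corresponding column of the frame result.
print(frame_result["A"].tolist())
print(series_result.tolist())
```

Both print statements should show the same list, since the frame-level call applies the same substitution column by column.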
pandas replace values in column Syntax and Parameters
The DataFrame.replace() method signature supports multiple patterns for the to_replace parameter:
- Scalar: Replace a single value with another scalar
- List-like: Replace multiple values with one value, or map list-to-list
- Dictionary: Map old values to new values (keys to values)
- Nested Dictionary: Column-specific replacement ({column: {old: new}})
- Regular Expressions: Pattern matching with regex=True
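Key parameters include inplace (modify the original object instead of returning a new one) and regex (treat to_replace as a regular expression). Older signatures also accept limit and method, which control the pad/backfill behavior used when no replacement value is given. Both were deprecated in pandas 2.1, so avoid them in new code.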
Practical Examples of pandas replace values in column
Scalar Replacement
Replace all occurrences of a specific value across the entire DataFrame:
import pandas as pd
df = pd.DataFrame({"A": [0, 1, 2], "B": ["a", "b", "a"]})
df.replace(0, 99)
Output:
    A  B
0  99  a
1   1  b
2   2  a
List-Like Replacement
Replace multiple distinct values with a single replacement value:
df.replace([0, 1, 2], 9)
Or map each value to a specific replacement using parallel lists:
df.replace([0, 1, 2], [9, 8, 7])
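To make the difference between the two list forms concrete, here is a small self-contained sketch that rebuilds the example frame and applies both calls. The variable names many_to_one and pairwise are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"A": [0, 1, 2], "B": ["a", "b", "a"]})

# Every listed value collapses to the single replacement 9.
many_to_one = df.replace([0, 1, 2], 9)

# Values are paired by position: 0 -> 9, 1 -> 8, 2 -> 7.
pairwise = df.replace([0, 1, 2], [9, 8, 7])

print(many_to_one["A"].tolist())
print(pairwise["A"].tolist())
```

When both arguments are lists, they need to have the same length so that each old value has exactly one partner.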
Dictionary-Based Replacement
Use a dictionary to map old values to new values across all columns:
df.replace({0: 100, "a": "alpha"})
This replaces 0 with 100 and "a" with "alpha" wherever they appear.
Column-Specific Replacement
Target specific columns using a nested dictionary structure:
df.replace({"A": {0: -1, 2: -2}})
Only column A is modified, replacing 0 with -1 and 2 with -2.
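The scoping is easiest to see when another column holds the same values. In this sketch, column C is a hypothetical addition that mirrors A, so any change to C would show that the replacement leaked beyond the targeted column.

```python
import pandas as pd

# Column "C" is added only for this illustration; it duplicates "A".
df = pd.DataFrame({"A": [0, 1, 2], "C": [0, 1, 2]})

# The outer key names the column, the inner dict maps old -> new.
result = df.replace({"A": {0: -1, 2: -2}})

print(result["A"].tolist())  # values in A are rewritten
print(result["C"].tolist())  # C keeps its original values
```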
Regex Pattern Replacement
Replace values matching regular expressions by setting regex=True:
df.replace(to_replace=r"(?i)a", value="X", regex=True)
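Because regex=True substitutes every match inside each string, this case-insensitive pattern replaces each "a" or "A" with "X", including occurrences in the middle of longer strings.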
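One detail worth checking before relying on a pattern: with a string replacement, regex mode rewrites the matched portion of each string rather than the whole cell. The sketch below uses made-up data (the value "banana" and the numeric column N are illustrative) to contrast an unanchored pattern with one anchored by ^ and $.

```python
import pandas as pd

df = pd.DataFrame({"B": ["a", "banana", "A"], "N": [1, 2, 3]})

# Unanchored: every "a"/"A" inside each string is substituted.
substring = df.replace(to_replace=r"(?i)a", value="X", regex=True)

# Anchored: only cells that consist solely of "a" or "A" are changed.
whole_cell = df.replace(to_replace=r"(?i)^a$", value="X", regex=True)

print(substring["B"].tolist())
print(whole_cell["B"].tolist())
```

Anchoring the pattern is a simple way to get whole-value matching while still using regex features such as case-insensitive flags. The numeric column is left alone in both cases.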
In-Place Modification
Modify the DataFrame without creating a copy:
df.replace(0, 999, inplace=True)
When inplace=True, the method returns None and modifies df directly.
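The two modes can be compared side by side. This is a small sketch with illustrative variable names (original, target, returned) showing which object ends up holding the replaced values in each case.

```python
import pandas as pd

# Default behavior: a new object comes back, the original is untouched.
original = pd.DataFrame({"A": [0, 1, 2]})
copy_result = original.replace(0, 999)

# inplace=True: the frame itself is changed; as described above,
# the return value is expected to be None.
target = pd.DataFrame({"A": [0, 1, 2]})
returned = target.replace(0, 999, inplace=True)

print(original["A"].tolist())     # still the original values
print(copy_result["A"].tolist())  # replaced values
print(target["A"].tolist())       # replaced values
print(returned)
```

Assigning the result back (df = df.replace(...)) achieves the same end state as inplace=True and keeps method chaining available.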
Summary
- DataFrame.replace() is implemented in pandas/core/generic.py on the NDFrame base class and dispatches to pandas/core/internals/managers.py for block-level operations.
- The method supports scalar, list-like, dictionary, and regex replacement patterns with optional column-specific targeting.
- Use inplace=True to modify the DataFrame directly (the call then returns None), or omit it to get a new object back. Do not rely on inplace=True to save memory, since pandas may still copy data internally.
- Regex replacement requires regex=True and only affects string data.
Frequently Asked Questions
What is the difference between replace() and fillna() in pandas?
replace() substitutes specific values with new values based on exact matches or patterns, while fillna() specifically targets missing values (NaN, None, NaT). Use replace() when you need to swap existing valid data (like replacing 0 with -1 or "old" with "new"), and reserve fillna() for handling null value imputation.
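A short sketch of the contrast, using a made-up Series that contains both a valid 0 and a missing value:

```python
import numpy as np
import pandas as pd

s = pd.Series([0.0, np.nan, 2.0])

# replace() swaps the existing valid value 0 and leaves NaN alone.
swapped = s.replace(0, -1)

# fillna() touches only the missing entry and leaves 0 alone.
filled = s.fillna(0)

print(swapped.tolist())
print(filled.tolist())
```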
Can I use regular expressions with pandas replace values in column?
Yes, pass regex=True to enable pattern matching. You can use raw strings with regex syntax in the to_replace parameter, such as df.replace(to_replace=r"^A.*", value="StartsWithA", regex=True). This routes through the block manager's regex-specific replacement path in pandas/core/internals/managers.py, applying re.sub only to string-dtype blocks.
How do I replace values only in specific columns using pandas?
Use a nested dictionary where the top-level keys are column names and the values are dictionaries mapping old values to new ones. For example: df.replace({"ColumnA": {1: 100}, "ColumnB": {"x": "y"}}). This pattern is processed in pandas/core/generic.py before dispatching to the block manager, ensuring only the specified columns are modified.
Does pandas replace modify the DataFrame in-place?
Only if you specify inplace=True. By default (inplace=False), replace() returns a new DataFrame with substitutions applied, leaving the original unchanged. When inplace=True, the method returns None and modifies the underlying blocks directly through the manager's in-place replacement logic in pandas/core/internals/managers.py.