How to Set Index in pandas and Access or Modify a Specific Cell
Use DataFrame.set_index() to define row labels, then access or modify individual cells with .at[] and .iat[] for fast scalar lookups, or .loc[] and .iloc[] for flexible label- and position-based indexing.
Setting the index in pandas is the first step toward intuitive data manipulation, allowing you to reference rows by meaningful labels rather than integer positions. Once the index is established, the library provides four specialized accessors in pandas/core/indexing.py to read or write single values efficiently. This guide demonstrates how to configure the index and use each accessor based on the actual implementation in the pandas-dev/pandas repository.
Setting the Index in pandas
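Every DataFrame starts with a default integer index (a RangeIndex), so to reference rows by meaningful labels you first need to define your own index. The DataFrame.set_index() method, implemented in pandas/core/frame.py, converts one or more existing columns into row labels.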
import pandas as pd

df = pd.DataFrame({
    "id": [100, 101, 102],
    "value": [10, 20, 30],
})

# Set the 'id' column as the index
df = df.set_index("id")
You can also replace the index directly by assigning to df.index:
df.index = pd.Index(["a", "b", "c"])
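Both approaches give the DataFrame a new Index object. Note that set_index() returns a new DataFrame by default, which is why the example reassigns df, whereas assigning to df.index changes the existing object. Either way, the new labels are what the _LocIndexer and _AtIndexer classes defined in pandas/core/indexing.py use for label-based lookups.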
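A few set_index() options are worth knowing when choosing the index. The sketch below uses a small frame like the one above; the variable names (kept, dupes, duplicate_rejected) are illustrative and not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({"id": [100, 101, 102], "value": [10, 20, 30]})

# drop=False keeps 'id' as a regular column in addition to using it as the index
kept = df.set_index("id", drop=False)

# verify_integrity=True raises ValueError if the new index contains duplicates
dupes = pd.DataFrame({"id": [1, 1], "value": [5, 6]})
try:
    dupes.set_index("id", verify_integrity=True)
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

# set_index() returns a new DataFrame by default, so 'df' itself is unchanged
original_columns = list(df.columns)
```

Checking for duplicates up front is useful because scalar label lookups are simplest to reason about when each row label is unique.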
Accessing and Modifying Single Cells
pandas provides four primary accessors for cell-level operations. The implementations reside in pandas/core/indexing.py, where _LocIndexer (line 303) handles .loc[] and _AtIndexer (line 633) handles .at[]. Their positional counterparts, _iLocIndexer and _iAtIndexer, manage .iloc[] and .iat[].
Label-Based Access with loc and at
.loc[] supports label-based indexing for scalars, slices, and boolean arrays. For single-cell access, pass the row label and column name:
# df is the frame from above: index ["a", "b", "c"], one column "value"
# Read a cell
value = df.loc["b", "value"]  # Returns a scalar (20)
# Modify a cell
df.loc["b", "value"] = 55
.at[] provides optimized scalar access by bypassing the generic slicing logic in _LocIndexer. In pandas/core/indexing.py, _AtIndexer accepts only a single row label and a single column label, and hands reads and writes to dedicated scalar get-value and set-value routines instead of building an intermediate Series or DataFrame:
# Fast read
value = df.at["c", "value"]
# Fast write
df.at["c", "value"] = 35
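As a self-contained check, the following sketch rebuilds the id-indexed frame from the first section and shows that .loc[] and .at[] return the same scalar, and that reading a label that does not exist raises KeyError. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"id": [100, 101, 102], "value": [10, 20, 30]}).set_index("id")

# Both accessors take (row label, column label) and return the same scalar
via_loc = df.loc[101, "value"]
via_at = df.at[101, "value"]

# Scalar write by label
df.at[101, "value"] = 55

# Reading a label that is not in the index raises KeyError
try:
    df.at[999, "value"]
    missing_raises = False
except KeyError:
    missing_raises = True
```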
Position-Based Access with iloc and iat
When working with integer positions rather than labels, use .iloc[] and .iat[]. These rely on _iLocIndexer and _iAtIndexer in pandas/core/indexing.py.
.iloc[] accepts integer indices for rows and columns:
# Read first row, first column (the frame above has a single column, "value")
value = df.iloc[0, 0]
# Modify
df.iloc[0, 0] = 45
.iat[] offers the same performance optimization as .at[] but for positional indexing:
# Fast positional read
value = df.iat[2, 0]
# Fast positional write
df.iat[2, 0] = 32
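When you know a label but want to use the positional accessors, Index.get_loc() translates a label into its integer position. This is a minimal sketch that assumes a unique index; the frame and its "flag" column are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"value": [10, 20, 30], "flag": [True, False, True]},
    index=["a", "b", "c"],
)

# Translate labels to integer positions (returns an int when labels are unique)
row_pos = df.index.get_loc("c")
col_pos = df.columns.get_loc("value")

# The positional and label-based accessors address the same cell
same_cell = df.iat[row_pos, col_pos] == df.at["c", "value"]

# Positional write, then read it back by label
df.iat[row_pos, col_pos] = 32
updated = df.at["c", "value"]
```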
Performance Considerations
The pandas source code in pandas/core/indexing.py distinguishes between general-purpose indexers (loc, iloc) and scalar-optimized indexers (at, iat). When you need to read or write a single value, prefer .at[] or .iat[] because they:
- Skip the overhead of slice parsing and boolean mask handling
- Invoke direct __getitem__ and __setitem__ paths on the underlying arrays
- Avoid creating intermediate Series or DataFrame objects
For bulk operations or conditional updates, use .loc[] or .iloc[] to leverage vectorized assignments.
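For comparison, here is a small sketch of the bulk case, where .loc[] and .iloc[] update several cells in one assignment. The frame and threshold are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"value": [10, 20, 30]}, index=["a", "b", "c"])

# Conditional update: a boolean mask selects the rows, "value" selects the column
df.loc[df["value"] > 15, "value"] = 0
after_mask = df["value"].tolist()

# Positional bulk update: overwrite the first two rows of the first column
df.iloc[:2, 0] = [1, 2]
after_slice = df["value"].tolist()
```

Neither of these assignments can be expressed with .at[] or .iat[], which accept exactly one row and one column.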
Summary
- Set the index with DataFrame.set_index() or direct assignment to df.index to enable label-based lookups.
- Access cells by label using .loc[] for flexible indexing or .at[] for fast scalar access, both implemented in pandas/core/indexing.py.
- Access cells by position using .iloc[] for flexible indexing or .iat[] for fast scalar access.
- Modify cells by assigning values directly to any indexer expression; changes occur in-place without requiring reassignment.
Frequently Asked Questions
What is the difference between .loc[] and .at[] in pandas?
.loc[] is a general-purpose label-based indexer that accepts slices, lists, and boolean masks to return Series or DataFrame subsets. .at[] is optimized specifically for scalar values—accessing or setting a single cell by label. According to the implementation in pandas/core/indexing.py, .at[] bypasses the slicing overhead of .loc[] to provide faster performance when you only need one value.
How do I change a single value in a pandas DataFrame without using .loc[]?
You can use .at[] for label-based scalar assignment or .iat[] for position-based scalar assignment. For example, df.at['row_label', 'col_name'] = new_value modifies the cell directly. These accessors are defined in pandas/core/indexing.py and are the fastest way to update individual cells because they avoid the intermediate object creation that occurs with .loc[] or .iloc[].
Is it better to use integer positions or labels when modifying DataFrame cells?
Use labels (via .loc[] or .at[]) when your DataFrame has a meaningful index, such as unique IDs or timestamps, because your code remains readable and robust against row reordering. Use integer positions (via .iloc[] or .iat[]) only when you explicitly need to reference the nth row or mth column regardless of their labels. The choice depends on data stability—labels survive sorting and filtering, while positions do not.
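The difference is easy to see after a sort. In this illustrative sketch, the label "a" still points at the same row, while position 0 now points at a different one.

```python
import pandas as pd

df = pd.DataFrame({"value": [10, 20, 30]}, index=["a", "b", "c"])

# Reorder the rows; the labels travel with their rows
sorted_df = df.sort_values("value", ascending=False)

by_label = sorted_df.at["a", "value"]   # still the row originally labeled "a"
by_position = sorted_df.iat[0, 0]       # whichever row is now first
```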
Why does my code run slower when using .loc[] inside a loop?
.loc[] performs extensive validation, slice parsing, and alignment checks to handle diverse input types (slices, lists, boolean arrays). When iterating through thousands of rows, this overhead accumulates. For loop-based scalar access, switch to .at[] or .iat[], which are implemented in pandas/core/indexing.py as lightweight wrappers that directly call the underlying array's __getitem__ and __setitem__ methods without the generic indexing machinery.
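One way to see this on your own machine is a rough timing sketch like the one below. The function names and frame size are arbitrary choices, and the measured times depend on your pandas version and hardware, so treat the numbers as indicative only.

```python
import timeit

import pandas as pd

df = pd.DataFrame({"value": range(500)})

def total_with_loc():
    # Scalar reads through the general-purpose indexer
    total = 0
    for label in df.index:
        total += df.loc[label, "value"]
    return total

def total_with_at():
    # Same loop, using the scalar-only accessor
    total = 0
    for label in df.index:
        total += df.at[label, "value"]
    return total

loc_seconds = timeit.timeit(total_with_loc, number=3)
at_seconds = timeit.timeit(total_with_at, number=3)

# When the loop only aggregates, a vectorized call avoids per-cell access entirely
vectorized_total = int(df["value"].sum())

print(f".loc loop: {loc_seconds:.4f}s, .at loop: {at_seconds:.4f}s")
```

If the loop body can be expressed as a column operation, the vectorized form is usually the better fix than swapping one accessor for another.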