Difference Between loc and iloc in Pandas: Label vs Position Indexing
The primary difference between loc and iloc in pandas is that loc selects data by label (index and column names) while iloc selects data by integer position (0-based numerical indices).
Understanding this distinction is fundamental to data manipulation in Python. Both indexers retrieve subsets from DataFrames and Series, but they use entirely different referencing systems: loc uses explicit labels, while iloc uses implicit positions. In the pandas source code (pandas/core/indexing.py), these indexers are implemented as the _LocIndexer and _iLocIndexer classes, which provide the core functionality that distinguishes label-based from position-based data access.
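As a quick check of the class names mentioned above, the indexer objects can be inspected at runtime. This is a minimal sketch: _LocIndexer and _iLocIndexer are private names, so they may differ in pandas versions other than the one described here, and the small DataFrame is only illustrative.

```python
import pandas as pd

# Illustrative frame; the contents do not matter for this check
df = pd.DataFrame({"A": [10, 20, 30]}, index=["x", "y", "z"])

# The .loc and .iloc accessors return indexer objects bound to df
loc_class = type(df.loc).__name__
iloc_class = type(df.iloc).__name__

print(loc_class)   # private class name behind df.loc
print(iloc_class)  # private class name behind df.iloc
```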
Core Concepts: Label-Based vs Position-Based Indexing
How loc Works for Label-Based Selection
The loc indexer selects data based on the actual labels present in the index and columns. As implemented in the _LocIndexer class at line 1227 of pandas/core/indexing.py, loc interprets scalars, lists, and slices as labels rather than positions; it also accepts boolean arrays and callables.
Key characteristics of loc:
- Inclusive slicing: When slicing with labels, both the start and stop boundaries are included in the result
- Label alignment: During assignment, a Series or DataFrame on the right-hand side is re-indexed to match the selected labels
- Error handling: Raises KeyError when specified labels do not exist in the index
How iloc Works for Integer-Position Selection
The iloc indexer operates exclusively on integer positions, functioning similarly to standard Python list indexing but applied to DataFrame rows and columns. The _iLocIndexer class, defined at line 1920 of pandas/core/indexing.py, implements this position-based logic.
Key characteristics of iloc:
- Exclusive slicing: Integer slices follow Python convention where the stop index is excluded from results
- Direct assignment: Values are assigned by position without label alignment—data is taken exactly as-is
- Error handling: Raises IndexError when positional indices exceed the bounds of the data structure
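The two error behaviors described above can be sketched side by side. The frame and the missing label "w" are illustrative assumptions; the last line also shows that an out-of-range iloc slice is truncated rather than raising, as with Python lists.

```python
import pandas as pd

df = pd.DataFrame(
    {"A": [10, 20, 30], "B": [40, 50, 60]},
    index=["x", "y", "z"],
)

loc_error = None
try:
    df.loc["w"]            # "w" is not a label in the index
except KeyError:
    loc_error = "KeyError"

iloc_error = None
try:
    df.iloc[5]             # only positions 0, 1, and 2 exist
except IndexError:
    iloc_error = "IndexError"

# Slices that run past the end are clipped instead of raising
clipped_rows = len(df.iloc[1:10])

print(loc_error, iloc_error, clipped_rows)
```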
Practical Code Examples
Setting Up Sample Data
import pandas as pd

# Create DataFrame with custom index labels
df = pd.DataFrame(
    {"A": [10, 20, 30], "B": [40, 50, 60]},
    index=["x", "y", "z"],
)
print(df)
Selecting Data with loc
# Single label selection
value = df.loc["y", "B"] # Returns 50
# Label slicing (inclusive on both ends)
subset = df.loc["x":"y"] # Returns rows x and y
# Boolean mask using label conditions
filtered = df.loc[df["A"] > 15] # Returns rows y and z
# Scalar assignment to the cell at row label "x", column label "A"
df.loc["x", "A"] = 99
Selecting Data with iloc
# Integer position selection (0-based)
value = df.iloc[1, 1] # Returns 50 (row 1, column 1)
# Position slicing (exclusive on the end)
subset = df.iloc[0:2] # Returns rows 0 and 1 (not 2)
# Boolean mask by position
mask = [True, False, True]
filtered = df.iloc[mask] # Returns rows 0 and 2
# Direct positional assignment
df.iloc[0, 0] = 77
Critical Differences in Assignment Behavior
One of the most significant distinctions between these indexers emerges during assignment. When the right-hand side of df.loc[labels] = values is a Series or DataFrame, pandas aligns it to the labels selected on the left, reordering values as needed and filling NaN where the right-hand side has no matching label. Conversely, df.iloc[positions] = values assigns values exactly by position without any label-based alignment, treating the data as a simple positional array. Scalars, lists, and NumPy arrays carry no labels, so both indexers assign them positionally.
Summary
- loc selects data by label (index and column names), uses inclusive slicing, raises KeyError for missing labels, and aligns assignments by label
- iloc selects data by integer position (0-based), uses exclusive slicing (Python standard), raises IndexError for out-of-bounds positions, and assigns values directly without label alignment
- Both indexers are implemented in pandas/core/indexing.py as _LocIndexer (line 1227) and _iLocIndexer (line 1920), and are exposed on DataFrame and Series through IndexingMixin (line 153)
Frequently Asked Questions
Can I use integer values with loc if my DataFrame has an integer index?
Yes, but with important caveats. When your DataFrame uses an integer index (e.g., index=[10, 20, 30]), df.loc[10] selects the row where the label equals 10, not the row at position 10. If you need to select by position regardless of the index type, you must use iloc to avoid ambiguity.
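A short sketch of this caveat, using a hypothetical frame whose integer labels do not match row positions:

```python
import pandas as pd

# Integer labels deliberately out of step with positions
df_int = pd.DataFrame({"A": [1, 2, 3]}, index=[2, 0, 1])

by_label = df_int.loc[0, "A"]      # row whose label is 0 (the second row)
by_position = df_int.iloc[0, 0]    # row at position 0 (the first row)

print(by_label, by_position)
```

The same integer therefore refers to different rows depending on the indexer, which is why iloc is the unambiguous choice for positional access.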
Why does df.loc[1:3] include row 3 but df.iloc[1:3] excludes it?
This reflects the fundamental architectural difference between label-based and position-based indexing. loc treats slice boundaries as labels, and label slicing in pandas is inclusive of both endpoints to ensure no data is accidentally excluded when working with meaningful names like dates or categories. iloc follows standard Python zero-based indexing conventions where the stop index is exclusive, consistent with Python lists and NumPy arrays.
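With a default integer index, labels and positions coincide, so the endpoint rule is the only visible difference. A minimal sketch, assuming a five-row frame named df_range:

```python
import pandas as pd

# Default RangeIndex: labels 0..4 coincide with positions 0..4
df_range = pd.DataFrame({"A": [0, 10, 20, 30, 40]})

label_slice = df_range.loc[1:3]      # labels 1, 2, and 3 (stop included)
position_slice = df_range.iloc[1:3]  # positions 1 and 2 (stop excluded)

print(label_slice.index.tolist())
print(position_slice.index.tolist())
```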
Which indexer provides better performance for large datasets?
Performance is generally comparable for simple selections. iloc can be marginally faster in some scenarios because it skips the label-to-position lookup that loc must perform. However, loc is more robust when data may be reordered or filtered, because it references stable labels rather than volatile positions. Choose based on semantic correctness (labels for meaningful data references, positions for order-based operations) rather than micro-optimization, unless profiling reveals a specific bottleneck.
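If profiling is warranted, a rough comparison can be sketched with the standard-library timeit module. The frame size, labels, and repetition count below are arbitrary assumptions, and the timings depend on the machine and pandas version, so no expected numbers are shown.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical benchmark frame; size and label scheme are arbitrary
n = 10_000
df_big = pd.DataFrame(
    {"A": np.arange(n)},
    index=[f"row{i}" for i in range(n)],
)

# Time the same scalar lookup by label and by position
loc_seconds = timeit.timeit(lambda: df_big.loc["row5000", "A"], number=1_000)
iloc_seconds = timeit.timeit(lambda: df_big.iloc[5000, 0], number=1_000)

print(f"loc:  {loc_seconds:.4f} s for 1,000 lookups")
print(f"iloc: {iloc_seconds:.4f} s for 1,000 lookups")
```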
What happens if I accidentally mix labels and positions in the same selection?
You cannot mix label and position indexers within a single loc or iloc call. Attempting df.loc[0, 'column'] where 0 is a position (not a label) will raise a KeyError if 0 is not present in the index. Similarly, df.iloc['x', 0] will raise a ValueError because iloc accepts only integer positions, integer slices, and boolean arrays (a single non-integer key such as df.iloc['x'] raises a TypeError instead). To combine label and position selection, chain the operations: use df.loc['x'] to select by label first, then .iloc[0] to select by position, or vice versa.
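The options for combining a row label with a column position can be sketched as follows. Besides chaining, the sketch converts between labels and positions with Index.get_loc and df.columns, so a single loc or iloc call suffices. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {"A": [10, 20, 30], "B": [40, 50, 60]},
    index=["x", "y", "z"],
)

# Chained: select the row by label, then the value by position
chained = df.loc["y"].iloc[1]

# Position -> label: look up the column name, then use one loc call
via_loc = df.loc["y", df.columns[1]]

# Label -> position: look up the row position, then use one iloc call
via_iloc = df.iloc[df.index.get_loc("y"), 1]

print(chained, via_loc, via_iloc)
```

Chaining is fine for reading values. For assignment, the single-call forms are the safer choice, because a chained selection may operate on a temporary object rather than the original DataFrame.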