# How to Get Columns from a pandas DataFrame: Extract Header Lists

> Easily get columns from a pandas DataFrame using df.columns.tolist() to extract a list of all header strings. Learn this quick method for header extraction.

- Repository: [pandas-dev/pandas](https://github.com/pandas-dev/pandas)
- Tags: how-to-guide
- Published: 2026-02-16

---

**Use `df.columns.tolist()` to convert a DataFrame's column Index into a plain Python list of header strings.**

The pandas library (`pandas-dev/pandas`) structures DataFrame metadata through an inherited axis system in which column labels are stored in a specialized `Index` object. When you need the column names as a standard list for validation, API calls, or iteration, read that Index through `df.columns` and convert it with `tolist()`.

## Accessing Column Headers via the Info Axis

In pandas source architecture, `DataFrame` inherits from the `NDFrame` base class defined in [`pandas/core/generic.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/generic.py). This base class implements an "info axis" pattern that manages structural labels—columns for DataFrames and the index for Series.

The `DataFrame` class in [`pandas/core/frame.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/frame.py) explicitly sets the class variable `_info_axis_name = "columns"` (line 18462 at the time of writing; line numbers on `main` shift between commits). The generic `_info_axis` property in [`pandas/core/generic.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/generic.py) (lines 603-606) returns `getattr(self, self._info_axis_name)`, so for a DataFrame `df._info_axis` resolves to the same column Index that `df.columns` exposes publicly.

Because `df.columns` returns an `Index` object (from [`pandas/core/indexes/base.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/indexes/base.py)) rather than a simple list, it supports pandas-specific operations such as label lookup and set logic. Code that expects a real `list`, such as JSON serialization, needs an explicit conversion.
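As a minimal sketch of what "an Index rather than a list" means in practice, the snippet below inspects the column object of a small, made-up DataFrame. The data is illustrative only; the calls used (`isinstance`, `in`, `Index.get_loc`) are standard pandas and Python.

```python
import pandas as pd

# Illustrative data, not taken from any real dataset
df = pd.DataFrame({"city": ["Paris"], "population": [2.1]})

cols = df.columns

is_index = isinstance(cols, pd.Index)    # the columns are a pandas Index
is_list = isinstance(cols, list)         # ...and not a built-in list
has_city = "city" in cols                # membership test on labels
position = cols.get_loc("population")    # integer position of a label

print(is_index, is_list, has_city, position)
```

The label lookup in the last step is the kind of capability a plain list does not offer directly, which is one reason pandas keeps the labels in an Index.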

## Converting the Column Index to a List

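`df.columns` provides the header labels, but as an Index object whose repr looks like `Index(['A', 'B'], dtype='object')` (pandas 3.0 and later, which infer a dedicated string dtype by default, show `dtype='str'` instead). To extract a native Python list, call `tolist()`. The `Index` class lives in [`pandas/core/indexes/base.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/indexes/base.py) and inherits `tolist()` from `IndexOpsMixin` in [`pandas/core/base.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/base.py).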

This method converts the underlying array into a standard `list` of Python scalars: strings for typical headers, integers for unnamed or numeric column labels. The `axes` property in [`pandas/core/frame.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/frame.py) (lines 788-889 at the time of writing) returns `[self.index, self.columns]`, so `df.columns` is the second axis, distinct from `df.index`, which holds the row labels.
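The following sketch illustrates the conversion for two cases: named string columns and the default integer labels pandas assigns when no column names are given. Both DataFrames are made up for illustration.

```python
import pandas as pd

df_named = pd.DataFrame({"A": [1], "B": [2]})
df_default = pd.DataFrame([[1, 2, 3]])  # no names given: labels default to 0, 1, 2

named = df_named.columns.tolist()
default = df_default.columns.tolist()

print(named)    # ['A', 'B']
print(default)  # [0, 1, 2]

# The list elements are plain Python objects, not NumPy scalars
all_python_ints = all(type(label) is int for label in default)

# df.axes is [row index, column index]; the second entry matches df.columns
same_axis = df_named.axes[1].equals(df_named.columns)
print(all_python_ints, same_axis)
```

Note that the result is only a list of strings when the labels themselves are strings, so validation code should not assume `str` for every DataFrame.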

## Practical Code Examples

### Basic DataFrame Column Extraction

```python
import pandas as pd

df = pd.DataFrame({
    "city": ["Paris", "Berlin", "Tokyo"],
    "population": [2.1, 3.6, 9.3],
    "country": ["France", "Germany", "Japan"]
})

# Get list of column headers
headers = df.columns.tolist()
print(headers)
# Output: ['city', 'population', 'country']
```

### Reading CSV Files and Listing Columns

```python
import pandas as pd

# Load data from CSV
df = pd.read_csv("sales.csv")

# Extract column names immediately after loading
column_headers = df.columns.tolist()
print(f"Available columns: {column_headers}")
```

### Accessing the Underlying Info Axis (Advanced)

For introspection or debugging, you can access the internal `_info_axis` attribute directly, though `df.columns` remains the public API:

```python
import pandas as pd

df = pd.DataFrame({"x": [0], "y": [1], "z": [2]})

# Access internal info axis (returns same Index as df.columns)
info_axis = df._info_axis
print(info_axis.tolist())
# Output: ['x', 'y', 'z']
```

## How Column Storage Works in pandas Source Code

The behavior of `df.columns.tolist()` is determined by three key files in the pandas repository:

- **[`pandas/core/generic.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/generic.py)**: Defines the `NDFrame` base class containing the `_info_axis` property (lines 603-606 at the time of writing) that dynamically retrieves the axis specified by `_info_axis_name`.
- **[`pandas/core/frame.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/frame.py)**: The `DataFrame` implementation sets `_info_axis_name = "columns"` and exposes the column Index through the `axes` property (lines 788-889 at the time of writing), which returns `[self.index, self.columns]`.
- **[`pandas/core/indexes/base.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/indexes/base.py)**: Implements the `Index` base class. Its `tolist()` method, which converts index labels to a plain Python list, is inherited from `IndexOpsMixin` in `pandas/core/base.py`.

## Summary

- **`df.columns`** accesses the column Index object stored as the DataFrame's info axis.
- **`df.columns.tolist()`** converts the Index to a standard Python list of header strings.
- The column storage mechanism inherits from `NDFrame` in [`pandas/core/generic.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/generic.py), with DataFrame-specific configuration in [`pandas/core/frame.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/frame.py).
- The `tolist()` method is available on every Index instance (`Index` in [`pandas/core/indexes/base.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/indexes/base.py) inherits it from `IndexOpsMixin`), including MultiIndex columns, where it returns a list of tuples rather than strings.
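For the MultiIndex case mentioned in the last point, here is a short sketch with made-up two-level column labels. The underscore-joined flat names are one possible convention chosen for illustration, not something pandas does for you.

```python
import pandas as pd

# Hypothetical two-level header: (metric, quarter)
cols = pd.MultiIndex.from_tuples(
    [("sales", "q1"), ("sales", "q2"), ("cost", "q1")]
)
df = pd.DataFrame([[1, 2, 3]], columns=cols)

labels = df.columns.tolist()
print(labels)  # [('sales', 'q1'), ('sales', 'q2'), ('cost', 'q1')]

# Flatten each tuple into a single string if a flat header list is needed
flat = ["_".join(parts) for parts in labels]
print(flat)  # ['sales_q1', 'sales_q2', 'cost_q1']

# Or extract a single level of the hierarchy
top_level = df.columns.get_level_values(0).tolist()
print(top_level)  # ['sales', 'sales', 'cost']
```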

## Frequently Asked Questions

### What is the difference between `df.columns` and `df.columns.tolist()`?

`df.columns` returns an immutable pandas `Index` object that carries a dtype and supports label-based operations, while `df.columns.tolist()` returns a plain Python list containing only the label values. Use the former for pandas operations like label-based selection, and the latter when you need a native list for Python standard library functions or external APIs.
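The split can be sketched as follows, assuming a small illustrative DataFrame: the Index is used for a pandas-side selection, and the list is used where the standard library's `json` module needs a serializable value.

```python
import json

import pandas as pd

df = pd.DataFrame({
    "city": ["Paris", "Berlin"],
    "population": [2.1, 3.6],
    "country": ["France", "Germany"],
})

# pandas side: Index.drop returns a new Index that can select columns directly
subset = df[df.columns.drop("country")]
subset_headers = subset.columns.tolist()

# Standard-library side: json cannot serialize an Index object...
try:
    json.dumps(df.columns)
    index_serializable = True
except TypeError:
    index_serializable = False

# ...but it accepts the plain list
payload = json.dumps({"columns": df.columns.tolist()})

print(subset_headers, index_serializable)
print(payload)
```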

### Can I get column headers as a list without using `tolist()`?

Yes. `list(df.columns)` iterates through the Index and produces the same list. `df.columns.tolist()` (also spelled `to_list()`) is the idiomatic form; `Index` inherits it from `IndexOpsMixin` in `pandas/core/base.py`, where it delegates to the underlying array's own `tolist()`. That can make it slightly faster than the generic `list()` constructor, though with the handful of labels in a typical DataFrame the difference is negligible.
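A quick sketch comparing the spellings on an illustrative DataFrame. All four expressions below are standard pandas or Python behavior; iterating a DataFrame itself yields its column labels.

```python
import pandas as pd

df = pd.DataFrame({"x": [0], "y": [1], "z": [2]})

via_tolist = df.columns.tolist()
via_alias = df.columns.to_list()    # alias of tolist()
via_list = list(df.columns)         # generic iteration over the Index
via_frame = list(df)                # iterating a DataFrame yields column labels

print(via_tolist == via_alias == via_list == via_frame)
```

Since the results are identical here, the choice is mostly about readability; `tolist()` makes the intent explicit.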

### Why does `df.columns` return an Index instead of a list?

pandas uses Index objects because they support label-based alignment, boolean indexing, and hierarchical operations (MultiIndex) that plain lists cannot provide. This design choice, implemented in the `NDFrame` architecture of [`pandas/core/generic.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/generic.py), allows column headers to participate in data alignment during joins, merges, and concatenation operations.
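To make the alignment point concrete, here is a small sketch with two made-up DataFrames whose columns are in different orders. Values are matched by column label rather than by position.

```python
import pandas as pd

a = pd.DataFrame({"x": [1], "y": [2]})
b = pd.DataFrame({"y": [10], "x": [20]})   # same labels, different order

# Arithmetic aligns on column labels: x pairs with x, y with y
total = a + b
x_sum = int(total.loc[0, "x"])
y_sum = int(total.loc[0, "y"])

# Concatenation also matches columns by label
combined = pd.concat([a, b], ignore_index=True)
x_values = combined["x"].tolist()

print(x_sum, y_sum, x_values)
```

With plain lists of headers, this kind of matching would have to be implemented by hand.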

### How do I get column names when reading from a CSV file?

After calling `pd.read_csv()`, call `df.columns.tolist()` on the returned DataFrame. By default the CSV parser reads headers from the first row (or from the row given by the `header` parameter) and stores them in the column Index before returning the DataFrame, so the headers are available without additional processing.
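The sketch below uses an in-memory CSV (via `io.StringIO`) so it runs without a file on disk; the column names and values are invented for illustration. It also shows two variations: `nrows=0` to read only the header row, and `header=None` for files without a header line.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real file
csv_text = "order_id,region,amount\n1,EU,10.5\n2,US,7.0\n"

# Normal load: headers come from the first row
df = pd.read_csv(io.StringIO(csv_text))
headers = df.columns.tolist()

# Header-only load: nrows=0 parses the header and no data rows
headers_only = pd.read_csv(io.StringIO(csv_text), nrows=0).columns.tolist()

# header=None treats every row as data and assigns integer labels
unlabeled = pd.read_csv(io.StringIO(csv_text), header=None).columns.tolist()

print(headers)
print(headers_only)
print(unlabeled)
```

Reading with `nrows=0` can be useful for checking the schema of a large file before loading it in full.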