# How to Use pandas sort by column: A Complete Guide to DataFrame.sort_values

> Learn how to efficiently sort pandas DataFrames by column using DataFrame.sort_values. Master single and multiple column sorting with ascending options and more.

- Repository: [pandas-dev/pandas](https://github.com/pandas-dev/pandas)
- Tags: tutorial
- Published: 2026-02-16

---

**Use `DataFrame.sort_values()` to reorder rows by column values, specifying the `by` parameter for single or multiple columns, and control the sort order with `ascending`, `kind`, and `key` arguments.**

When you need to order tabular data, `DataFrame.sort_values()` from the **pandas-dev/pandas** project is the standard tool for a **pandas sort by column** operation. It computes a sort permutation with NumPy-backed argsort routines and then reorders the rows with a single take operation. Whether you are ranking sales figures, ordering timestamps, or prioritizing categorical labels, understanding the underlying implementation helps you choose the right parameters and avoid surprises.

## Understanding the pandas sort by column Implementation

The `DataFrame.sort_values` method is a short pipeline: it validates its arguments, extracts the requested columns, asks a low-level helper for a sort permutation, and reorders the rows with that permutation.

### The Core Architecture: From sort_values to the Sorting Helpers

In recent pandas versions, the generic **NDFrame** base class in [`pandas/core/generic.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/generic.py) (lines 4868-4990) declares `sort_values` and carries the shared documentation for `by`, `axis`, `ascending`, and `kind`, but it does not perform the sort itself. The working implementation for DataFrames lives in [`pandas/core/frame.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/frame.py).

The DataFrame-specific type signatures and overloads reside in [`pandas/core/frame.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/frame.py) (lines 7923-7950), followed by the method body. When you pass a string or list of strings to the `by` parameter, the method resolves each name to a column (or index level) and extracts its values before computing the sort order. Line numbers on `main` shift over time, so treat the ranges as approximate.

### How nargsort and lexsort_indexer Handle the Heavy Lifting

Once columns are identified, the permutation is computed by helpers in [`pandas/core/sorting.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/sorting.py): `nargsort` for a single column and `lexsort_indexer` for several. The **`safe_sort`** function in [`pandas/core/algorithms.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/algorithms.py) (lines 1431-1500) is a separate utility and is not the function `sort_values` calls. The helpers cover the following critical steps:

- **Algorithm Selection**: For a single column, `nargsort` calls `argsort` with the requested `kind`, defaulting to `quicksort` unless you specify `kind='mergesort'`, `'heapsort'`, or `'stable'`. According to the pandas documentation, `kind` is only applied when sorting on a single column or label.
- **Type Handling**: The `_sort_mixed` and `_sort_tuples` helpers belong to `safe_sort`, so `sort_values` does not route through them. A column that mixes incomparable types, such as integers and strings, raises a `TypeError` when sorted.
- **NaN Management**: The `na_position` argument is applied here, determining whether missing values float to the top or sink to the bottom.
- **Key Function Application**: If you provide a vectorized `key` function (e.g., `lambda s: s.str.lower()`), it receives each column as a Series and is applied before the sort permutation is calculated.

After the helper returns a permutation index, the DataFrame’s block manager applies this index via `self._mgr.take`, reordering all columns in one pass.
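
The argsort-then-take pattern described above can be approximated with public APIs. The following is a minimal sketch, not pandas' internal code: it assumes a single numeric column with no missing values, and the `frame`, `indexer`, `manual`, and `builtin` names are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration only
frame = pd.DataFrame({"name": ["a", "b", "c", "d"], "score": [30, 10, 20, 10]})

# Step 1: compute the permutation that would sort the column.
# kind="stable" keeps tied rows in their original relative order.
indexer = np.argsort(frame["score"].to_numpy(), kind="stable")

# Step 2: reorder the rows by position using that permutation.
manual = frame.take(indexer)

# The public method should give the same row order for this input.
builtin = frame.sort_values(by="score", kind="stable")
print(manual.equals(builtin))
```

The sketch ignores `na_position`, `key`, and multi-column sorting, all of which the real method handles before the take step.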

## Practical Examples: pandas sort by column in Action

The following examples demonstrate how to leverage `sort_values` for common data organization tasks.

### Sort by a Single Column

To perform a simple alphabetical **pandas sort by column**, pass the column name as a string to the `by` parameter:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "city": ["Paris", "Berlin", "London", "Tokyo", "New York"],
        "population": [2_200_000, 3_600_000, 8_900_000, 13_900_000, 8_300_000],
        "area_km2": [105, 891, 1572, 2194, 783],
    }
)

# Sort alphabetically by city name
sorted_by_city = df.sort_values(by="city")
print(sorted_by_city)
```

```text
       city  population  area_km2
1    Berlin     3600000       891
2    London     8900000      1572
4  New York     8300000       783
0     Paris     2200000       105
3     Tokyo    13900000      2194
```

### Sort by Multiple Columns with Different Orders

For complex ranking, supply a list to `by` and a matching list to `ascending`. This example sorts by descending population, then ascending area:

```python
# Primary sort: population (high to low)
# Secondary sort: area (low to high) breaks ties in population
# `kind` is omitted: pandas applies it only to single-column sorts
sorted_multi = df.sort_values(
    by=["population", "area_km2"],
    ascending=[False, True],
)
print(sorted_multi)
```

```text
       city  population  area_km2
3     Tokyo    13900000      2194
2    London     8900000      1572
4  New York     8300000       783
1    Berlin     3600000       891
0     Paris     2200000       105
```
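
The city data above contains no tied populations, so the secondary column never comes into play. As a small sketch with made-up data (the `towns` frame and its values are hypothetical), the following shows the second key ordering rows that tie on the first:

```python
import pandas as pd

# Hypothetical data with deliberate ties in "population"
towns = pd.DataFrame(
    {
        "town": ["A", "B", "C", "D"],
        "population": [5000, 9000, 5000, 9000],
        "area_km2": [40, 30, 10, 20],
    }
)

# population: high to low; area_km2: low to high within each tie
ranked = towns.sort_values(by=["population", "area_km2"], ascending=[False, True])
print(ranked["town"].tolist())  # ['D', 'B', 'C', 'A']
```

Towns B and D share the top population, and D comes first because its area is smaller.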

### Using a Custom Key Function

Apply vectorized transformations before sorting without modifying the original data. This example performs a case-insensitive sort:

```python
# Sort ignoring case sensitivity
sorted_key = df.sort_values(
    by="city",
    key=lambda s: s.str.lower(),
)
print(sorted_key)
```

```text
       city  population  area_km2
1    Berlin     3600000       891
2    London     8900000      1572
4  New York     8300000       783
0     Paris     2200000       105
3     Tokyo    13900000      2194
```
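
Because every city name in the example is capitalized, the `key` function does not change the result there. A sketch with hypothetical mixed-case labels (the `labels` frame is invented for illustration) makes the effect visible:

```python
import pandas as pd

# Hypothetical mixed-case data
labels = pd.DataFrame({"city": ["paris", "Berlin", "london", "Tokyo"]})

# Default ordering compares code points, so uppercase letters sort first.
default_order = labels.sort_values(by="city")["city"].tolist()

# Lowercasing inside `key` affects only the comparison, not the stored values.
folded_order = labels.sort_values(by="city", key=lambda s: s.str.lower())["city"].tolist()

print(default_order)  # ['Berlin', 'Tokyo', 'london', 'paris']
print(folded_order)   # ['Berlin', 'london', 'paris', 'Tokyo']
```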

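### In-Place Sorting

Pass `inplace=True` to update the existing DataFrame instead of binding the result to a new name. pandas still builds the sorted data internally before swapping it in, so treat this as a convenience rather than a guaranteed memory saving: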

```python
# Modify the DataFrame directly, returns None
df.sort_values(by="population", inplace=True, ascending=False)
print(df)
```

```text
       city  population  area_km2
3     Tokyo    13900000      2194
2    London     8900000      1572
4  New York     8300000       783
1    Berlin     3600000       891
0     Paris     2200000       105
```
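
For comparison, here is a short sketch of the same sort written both ways. The `scores` frame is hypothetical; the point is only that reassignment and `inplace=True` end in the same state.

```python
import pandas as pd

# Hypothetical data for illustration only
scores = pd.DataFrame({"player": ["x", "y", "z"], "points": [7, 9, 8]})

# Reassignment style: sort_values returns a new, sorted DataFrame
# and leaves `scores` untouched.
by_copy = scores.sort_values(by="points", ascending=False)

# In-place style: the call mutates `target` directly.
target = scores.copy()
target.sort_values(by="points", ascending=False, inplace=True)

print(target.equals(by_copy))
```

The reassignment form also chains cleanly with other methods, which is one reason many codebases prefer it.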

## Performance Considerations for Large DataFrames

The efficiency of **pandas sort by column** operations comes from the architecture described above. The sorting helpers in [`pandas/core/sorting.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/sorting.py) compute one integer permutation with NumPy-backed argsort routines, and the block manager then reorders every column with a single take, so no row-by-row Python comparison is involved.

Key performance characteristics include:

- **Algorithm Selection**: Choose `kind='mergesort'` or `kind='stable'` when you need to preserve the relative order of equal elements; `kind='quicksort'` (default) is usually fast on numeric data but does not guarantee that order. According to the pandas documentation, `kind` is only applied when sorting on a single column or label.
- **Memory Management**: The `inplace=True` parameter updates the existing object, but the sorted data is still built through `self._mgr.take` first, so peak memory is similar to reassigning the result.
- **Vectorized Keys**: Applying a `key` function operates on the entire Series via vectorized string methods (e.g., `.str.lower()`), which is significantly faster than row-wise Python loops.
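
To make the stability point concrete, here is a minimal sketch on a single column with repeated values. The `orders` frame and its column names are hypothetical.

```python
import pandas as pd

# Hypothetical data: several rows share the same priority
orders = pd.DataFrame(
    {
        "order_id": [101, 102, 103, 104, 105],
        "priority": [2, 1, 2, 1, 2],
    }
)

# A stable sort keeps tied rows in their original relative order.
stable = orders.sort_values(by="priority", kind="stable")
print(stable["order_id"].tolist())  # [102, 104, 101, 103, 105]
```

Within each priority level, the order IDs appear in the same sequence as in the input.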

## Summary

- **`DataFrame.sort_values`** is the primary method for **pandas sort by column** operations, declared in [`pandas/core/generic.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/generic.py) and implemented for DataFrames in [`pandas/core/frame.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/frame.py).
- The sort permutation comes from the `nargsort` and `lexsort_indexer` helpers in [`pandas/core/sorting.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/sorting.py), which build on NumPy-backed argsort routines and handle NaN positioning; **`safe_sort`** in [`pandas/core/algorithms.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/algorithms.py) is a separate utility.
- You can sort by single or multiple columns using the `by` parameter, control direction with `ascending`, and apply transformations with the `key` argument.
- `inplace=True` is a convenience that updates the existing object rather than a reliable memory saving; choose `kind` based on stability requirements, keeping in mind that it applies to single-column sorts.

## Frequently Asked Questions

### How do I sort a pandas DataFrame by column values in descending order?

Pass `ascending=False` to the `sort_values` method. If sorting by multiple columns, provide a list of booleans matching the length of your `by` parameter, such as `ascending=[False, True]` to sort the first column descending and the second ascending.

### What is the difference between sort_values and sort_index in pandas?

`sort_values` rearranges rows based on the data contained within one or more columns, while `sort_index` reorders rows or columns based on their labels: the row index when `axis=0` (the default) or the column names when `axis=1`. Use `sort_values` for value-based ranking and `sort_index` when you need to order data by its labels.
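
A small sketch of the difference, using a hypothetical `stock` frame whose index labels and values sort in different orders:

```python
import pandas as pd

# Hypothetical data: index labels and column values disagree on ordering
stock = pd.DataFrame({"qty": [1, 5, 3]}, index=["pears", "apples", "kiwis"])

by_value = stock.sort_values(by="qty")  # orders rows by the qty column
by_label = stock.sort_index()           # orders rows by the index labels

print(by_value.index.tolist())  # ['pears', 'kiwis', 'apples']
print(by_label.index.tolist())  # ['apples', 'kiwis', 'pears']
```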

### How does pandas handle missing values when sorting by column?

By default, `sort_values` places NaN values at the end of the DataFrame regardless of the sort order. You can control this behavior using the `na_position` parameter, setting it to `'first'` to float missing values to the top or `'last'` to keep them at the bottom.
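
As a brief sketch with made-up sensor readings (the `readings` frame is hypothetical), the following shows both placements on a descending sort:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value
readings = pd.DataFrame(
    {"sensor": ["a", "b", "c", "d"], "value": [2.0, np.nan, 1.0, 3.0]}
)

# Default: NaN goes last, even when sorting in descending order.
nan_last = readings.sort_values(by="value", ascending=False)

# na_position="first" moves the missing row to the top instead.
nan_first = readings.sort_values(by="value", ascending=False, na_position="first")

print(nan_last["sensor"].tolist())   # ['d', 'a', 'c', 'b']
print(nan_first["sensor"].tolist())  # ['b', 'd', 'a', 'c']
```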

### Is the pandas sort_values method stable?

It can be. When you specify `kind='stable'` or `kind='mergesort'` on a single-column sort, rows that have equal values keep their relative order; the default `kind='quicksort'` makes no such guarantee. The `kind` value is passed to the underlying `argsort` call, which for NumPy-backed columns uses NumPy's stable sorting algorithms when requested. According to the pandas documentation, `kind` is only applied when sorting on a single column or label.