# How to Sort a pandas DataFrame by Two or More Columns Using sort_values()

> Learn to sort a pandas DataFrame by multiple columns using sort_values. Easily arrange your data by two or more columns with custom sort directions for efficient analysis.

- Repository: [pandas-dev/pandas](https://github.com/pandas-dev/pandas)
- Tags: how-to-guide
- Published: 2026-02-15

---

**To sort a pandas DataFrame by multiple columns, pass a list of column names to the `by` parameter of `DataFrame.sort_values()`, optionally specifying individual sort directions via the `ascending` parameter.**

The `sort_values()` method, defined in [`pandas/core/frame.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/frame.py) in the **pandas-dev/pandas** repository, orders DataFrame rows lexicographically by one or more columns. It supports mixed ascending and descending orders, missing value placement, and custom sorting keys.

## Understanding DataFrame.sort_values() Syntax and Parameters

The `DataFrame.sort_values()` method signature in [`pandas/core/frame.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/frame.py) accepts several parameters that control multi-column sorting behavior:

```python
def sort_values(
    self,
    by: IndexLabel,
    *,
    axis: Axis = 0,
    ascending: bool | list[bool] | tuple[bool, ...] = True,
    inplace: bool = False,
    kind: SortKind = "quicksort",
    na_position: str = "last",
    ignore_index: bool = False,
    key: ValueKeyFunc | None = None,
) -> DataFrame | None:
```

### The `by` Parameter and Lexicographic Sorting

The `by` parameter accepts either a single column label or a **list-like of labels** (`IndexLabel`). When you pass a list such as `["col1", "col2"]`, pandas performs a **lexicographic sort**: it first orders rows by the first column, then uses the second column to break ties, continuing through subsequent columns as needed.

This behavior is implemented in the `sort_values` method body within [`pandas/core/frame.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/frame.py), which delegates to the sorting engine in [`pandas/core/sorting.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/sorting.py) for the actual key-based reordering.
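As a rough illustration of lexicographic ordering, the sketch below compares `sort_values()` with Python's built-in `sorted()` applied to `(team, score)` tuples. The frame and its column names are made up for this example.

```python
import pandas as pd

# Hypothetical data: "team" contains ties that "score" has to break.
scores = pd.DataFrame({
    "team": ["B", "A", "B", "A"],
    "score": [3, 7, 1, 2],
})

# pandas: order by team first, then by score within each team.
by_pandas = scores.sort_values(by=["team", "score"])
pandas_rows = list(by_pandas.itertuples(index=False, name=None))

# Plain Python: tuples compare element by element, which is the same rule.
python_rows = sorted(zip(scores["team"], scores["score"]))

print(pandas_rows)
print(python_rows)
```

Both approaches produce the same row order here, which is a convenient way to reason about what a multi-column sort will return.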

### Controlling Sort Direction with `ascending`

The `ascending` parameter supports **boolean values or sequences**. When sorting by multiple columns, you can pass a list or tuple of booleans where each element corresponds to the respective column in `by`. For example, `ascending=[True, False]` sorts the first column in ascending order and the second in descending order.

The implementation validates that the length of `ascending` matches the length of `by` when a sequence is provided, raising a `ValueError` if the lengths differ.
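The length check can be observed directly. The following sketch, using a small made-up frame, passes a one-element `ascending` list for two sort columns and captures the resulting error; it also shows that a single boolean is accepted and applies to every column in `by`.

```python
import pandas as pd

# Hypothetical frame used only for this demonstration.
df_demo = pd.DataFrame({"a": [1, 1, 2], "b": [3, 2, 1]})

try:
    # Two sort columns but only one direction flag in the list.
    df_demo.sort_values(by=["a", "b"], ascending=[True])
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# A scalar boolean needs no length match: it applies to all sort columns.
both_desc = df_demo.sort_values(by=["a", "b"], ascending=False)
print(both_desc.index.tolist())
```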

### Handling Missing Values and Stability

The `na_position` parameter controls where **NaN or None values** appear, accepting either `"first"` or `"last"` (default). When sorting multiple columns, missing values in any of the sort keys follow this positioning rule.

The `kind` parameter defaults to `"quicksort"`, which is not guaranteed to be stable; only `"mergesort"` and `"stable"` preserve the original relative order of rows with identical sort keys. According to the pandas documentation, `kind` is applied only when sorting on a single column or label, so it does not select the algorithm for multi-column sorts.
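When tie order matters for a single sort column, one option is to request a stable algorithm by name. A minimal sketch with made-up ticket data:

```python
import pandas as pd

# Hypothetical data: several rows share the same priority.
orders = pd.DataFrame({
    "priority": [2, 1, 2, 1, 2],
    "ticket": ["t1", "t2", "t3", "t4", "t5"],
})

# kind="stable" keeps tied rows in their original relative order.
stable = orders.sort_values(by="priority", kind="stable")
tickets = stable["ticket"].tolist()

print(tickets)  # ['t2', 't4', 't1', 't3', 't5']
```

Within each priority, the tickets appear in the order they had in the input.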

## Practical Examples for Multi-Column Sorting

The following examples demonstrate common multi-column sorting patterns using the `sort_values()` implementation in [`pandas/core/frame.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/frame.py).

First, create a sample DataFrame with mixed data types and missing values:

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    "col1": ["A", "A", "B", np.nan, "D", "C"],
    "col2": [2, 1, 9, 8, 7, 4],
    "col3": [0, 1, 9, 4, 2, 3],
    "col4": ["a", "B", "c", "D", "e", "F"],
})
```

### Sort by a Single Column

To sort by one column in ascending order (default):

```python
df.sort_values(by="col1")
```

### Sort by Multiple Columns Lexicographically

To sort by `col1` first, then use `col2` to break ties:

```python
df.sort_values(by=["col1", "col2"])
```

Rows are ordered by `col1`, and `col2` decides the order only among rows that share the same `col1` value (here, the two `"A"` rows).

### Sort with Mixed Ascending and Descending Orders

To sort `col1` in ascending order and `col2` in descending order:

```python
df.sort_values(by=["col1", "col2"], ascending=[True, False])
```

The implementation validates that the `ascending` list length matches the `by` list length.
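Sorting keeps each row's original index label, so the index of the result looks shuffled. The `ignore_index` parameter listed in the signature renumbers the result from 0 instead. The sketch below rebuilds the sample frame so that it runs on its own.

```python
import numpy as np
import pandas as pd

# Same sample frame as in the examples above.
df = pd.DataFrame({
    "col1": ["A", "A", "B", np.nan, "D", "C"],
    "col2": [2, 1, 9, 8, 7, 4],
    "col3": [0, 1, 9, 4, 2, 3],
    "col4": ["a", "B", "c", "D", "e", "F"],
})

# col1 ascending, col2 descending; original index labels are kept.
mixed = df.sort_values(by=["col1", "col2"], ascending=[True, False])
print(mixed.index.tolist())  # [0, 1, 2, 5, 4, 3]

# Same sort, but the result gets a fresh 0..n-1 index.
renumbered = df.sort_values(
    by=["col1", "col2"], ascending=[True, False], ignore_index=True
)
print(renumbered.index.tolist())  # [0, 1, 2, 3, 4, 5]
```

The row with the missing `col1` value (original label 3) ends up last because `na_position` defaults to `"last"`.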

### Place Missing Values First

To display rows with NaN values at the beginning of the result:

```python
df.sort_values(by="col1", na_position="first")
```

## Advanced Sorting with Custom Keys

The `key` parameter in `DataFrame.sort_values()` accepts a vectorized callable: it receives a column as a Series and should return a Series of the same shape. pandas applies it independently to each column in `by`, which enables complex sorting logic without modifying the original data.

### Case-Insensitive String Sorting

To sort strings ignoring case:

```python
df.sort_values(by="col4", key=lambda s: s.str.lower())
```

This applies the lowercase transformation only for the sort comparison, leaving the original casing intact in the result.
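Because the `key` callable runs once per column in `by`, a key used in a multi-column sort has to cope with every column's dtype. The sketch below shows one possible approach: a hypothetical helper, `fold_case`, lowercases text columns and passes numeric columns through unchanged. The data and the helper name are assumptions for illustration.

```python
import pandas as pd

# Hypothetical data mixing a text column and a numeric column.
people = pd.DataFrame({
    "city": ["paris", "Oslo", "Paris", "oslo"],
    "age": [30, 25, 22, 41],
})

def fold_case(col: pd.Series) -> pd.Series:
    """Sort key: lowercase text columns, leave numeric columns untouched."""
    if pd.api.types.is_numeric_dtype(col):
        return col
    return col.str.lower()

# fold_case is called separately for "city" and for "age".
sorted_people = people.sort_values(by=["city", "age"], key=fold_case)

print(sorted_people.index.tolist())    # [1, 3, 2, 0]
print(sorted_people["city"].tolist())  # ['Oslo', 'oslo', 'Paris', 'paris']
```

Calling `.str.lower()` unconditionally would fail on the integer `age` column, which is why the helper checks the dtype first.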

### Natural Sorting with External Libraries

For "human-friendly" sorting of alphanumeric strings (e.g., "item2" before "item10"), use the `natsort` package with the `key` parameter:

```python
# pip install natsort
import pandas as pd
from natsort import natsort_keygen

df_nat = pd.DataFrame({
    "hours": ["0hr", "128hr", "0hr", "64hr", "64hr", "128hr"],
    "mins": [5, 10, 2, 15, 1, 20],
})

# Order "64hr" before "128hr" by comparing the numeric parts as numbers.
df_nat.sort_values(by="hours", key=natsort_keygen())
```

## Summary

- **`DataFrame.sort_values()`** in [`pandas/core/frame.py`](https://github.com/pandas-dev/pandas/blob/main/pandas/core/frame.py) is the primary method for sorting by multiple columns.
- Pass a **list of column names** to the `by` parameter to execute a **lexicographic sort** that breaks ties using subsequent columns.
- Use a **list of booleans** for `ascending` to control sort direction individually per column.
- The `kind` parameter applies only to single-column sorts; pass `kind="stable"` or `kind="mergesort"` there when the relative order of duplicate keys must be preserved.
- Control **missing value placement** with `na_position` ("first" or "last").
- Apply **custom transformations** via the `key` parameter for case-insensitive or natural sorting without altering the underlying data.

## Frequently Asked Questions

### What is the difference between `sort_values` and `sort_index`?

`sort_values` orders rows based on the **values within columns**, specified by the `by` parameter, while `sort_index` orders rows based on the **DataFrame's index labels**. Use `sort_values` when you need to sort by data content, and `sort_index` when you need to reorder by row labels or index levels.
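A small side-by-side comparison may help. The frame below is made up for illustration, with fruit names as index labels and a single `qty` column.

```python
import pandas as pd

# Hypothetical frame: index labels are names, the column holds quantities.
inventory = pd.DataFrame(
    {"qty": [5, 9, 1]},
    index=["pear", "apple", "fig"],
)

# Order rows by the data in the "qty" column.
by_value = inventory.sort_values(by="qty")
print(by_value.index.tolist())  # ['fig', 'pear', 'apple']

# Order rows by their index labels.
by_label = inventory.sort_index()
print(by_label.index.tolist())  # ['apple', 'fig', 'pear']
```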

### How do I sort by multiple columns with different ascending orders?

Pass a **list of boolean values** to the `ascending` parameter where each boolean corresponds to the respective column in the `by` list. For example, `df.sort_values(by=["A", "B"], ascending=[True, False])` sorts column "A" in ascending order and column "B" in descending order. The lengths of both lists must match, or pandas raises a `ValueError`.

### Does `sort_values` modify the original DataFrame?

By default, `sort_values` returns a **new sorted DataFrame** and leaves the original unchanged. Set `inplace=True` to sort the original DataFrame in place, in which case the method returns `None`. The default behavior (`inplace=False`) is recommended for method chaining and functional programming patterns.
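The difference is easy to check on a tiny made-up frame:

```python
import pandas as pd

original = pd.DataFrame({"x": [3, 1, 2]})

# Default: a new frame comes back and the original keeps its order.
sorted_copy = original.sort_values(by="x")
before = original["x"].tolist()
print(before)                     # [3, 1, 2]
print(sorted_copy["x"].tolist())  # [1, 2, 3]

# inplace=True: the original is reordered and the call returns None.
returned = original.sort_values(by="x", inplace=True)
after = original["x"].tolist()
print(returned)  # None
print(after)     # [1, 2, 3]
```

Assigning the result of an `inplace=True` call to a variable is a common mistake, since that variable then holds `None`.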

### How are NaN values handled when sorting multiple columns?

Missing values (NaN, None, or NaT) are placed **after** all valid values by default (`na_position="last"`). You can change this to `na_position="first"` to place missing values at the beginning of the result. When sorting by multiple columns, the `na_position` setting applies to every sort key, regardless of each key's sort direction.
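The sketch below, using made-up sensor readings, shows `na_position="first"` combined with a descending sort key. Within each sensor, the missing reading is placed first even though `value` is sorted in descending order.

```python
import numpy as np
import pandas as pd

# Hypothetical readings with a missing value for each sensor.
readings = pd.DataFrame({
    "sensor": ["s1", "s2", "s1", "s2"],
    "value": [0.5, np.nan, np.nan, 0.1],
})

# sensor ascending, value descending, missing values placed first.
result = readings.sort_values(
    by=["sensor", "value"],
    ascending=[True, False],
    na_position="first",
)

print(result.index.tolist())  # [2, 0, 1, 3]
```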