How to Melt a Pandas DataFrame: A Complete Guide to Reshaping Data from Wide to Long Format

Use pd.melt() to transform wide-format data into long (tidy) format by specifying identifier columns with id_vars and measured variables with value_vars, creating a standardized structure for analysis.

The pandas.melt function converts DataFrames from wide format to long format, a common step when normalizing data or preparing it for visualization. In the pandas-dev/pandas repository, the implementation lives in pandas/core/reshape/melt.py. Whether you are restructuring survey data or normalizing time-series measurements, knowing how to melt a pandas DataFrame lets you build tidy datasets that work well with seaborn, ggplot, and other analytical tools.

How pandas.melt Works Internally

Conceptually, the melting process implemented in pandas/core/reshape/melt.py follows four steps. First, the function identifies the identifier columns (id_vars), whose values are carried through unchanged. Second, it selects the measured variables (value_vars) that will be unpivoted into rows. Third, it constructs a new DataFrame in which each pairing of an original row with a measured variable becomes its own row, storing the former column name in a variable column and the cell content in a value column. Finally, it returns that result as a new DataFrame, so the original DataFrame is left unmodified.
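The four steps above can be approximated by hand. The sketch below is only an illustration of the idea, not how pandas implements melt internally; it rebuilds the long table from NumPy arrays and compares it with the output of pd.melt. The names manual and expected are chosen for this example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "country": ["US", "CA", "MX"],
    "2019_sales": [100, 150, 130],
    "2020_sales": [110, 160, 140],
})

id_vars = ["country"]                      # step 1: identifier columns
value_vars = ["2019_sales", "2020_sales"]  # step 2: columns to unpivot
n_rows = len(df)

# Step 3: repeat the identifiers once per measured column, then stack
# the former column names and their values underneath each other.
manual = pd.DataFrame({
    "country": np.tile(df["country"].to_numpy(), len(value_vars)),
    "variable": [name for name in value_vars for _ in range(n_rows)],
    "value": np.concatenate([df[col].to_numpy() for col in value_vars]),
})

# Step 4: the result is a separate object; df still has its wide shape.
expected = pd.melt(df, id_vars=id_vars, value_vars=value_vars)

for col in ["country", "variable", "value"]:
    print(col, manual[col].tolist() == expected[col].tolist())
```

Each comparison should report that the hand-built column matches the one produced by pd.melt.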

While you can access this functionality through the top-level pd.melt() function, the method is also exposed as DataFrame.melt() in pandas/core/frame.py, providing object-oriented convenience for chaining operations.
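As a small illustration of that chaining style, the following sketch melts the sample sales data with DataFrame.melt() and then cleans up the result in the same expression. The follow-up steps (stripping the "_sales" suffix, sorting) are choices made for this example, not something melt does on its own.

```python
import pandas as pd

df = pd.DataFrame({
    "country": ["US", "CA", "MX"],
    "2019_sales": [100, 150, 130],
    "2020_sales": [110, 160, 140],
})

long_df = (
    df.melt(id_vars="country", var_name="year", value_name="sales")
      # Turn labels such as "2019_sales" into the integer 2019.
      .assign(year=lambda d: d["year"].str.replace("_sales", "", regex=False).astype(int))
      .sort_values(["country", "year"])
      .reset_index(drop=True)
)
print(long_df)
```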

Essential Parameters for Melting DataFrames

Understanding the key parameters ensures you can handle complex reshaping scenarios effectively:

  • id_vars: Column(s) to use as identifier variables. These columns are kept as columns, and their values are repeated once for each measured variable.
  • value_vars: Column(s) to unpivot. If not specified, all columns not in id_vars are melted.
  • var_name: The name for the new "variable" column that stores the former column headers (defaults to the name of the columns index if it has one, otherwise "variable").
  • value_name: The name for the new "value" column that stores the melted values (defaults to "value").
  • ignore_index: Boolean indicating whether to reset the index in the resulting DataFrame (default True).
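To see the defaults in action, here is a minimal sketch using a made-up measurements table (the column names id, height, and weight are assumptions for this example). The first call relies on the defaults; the second overrides value_vars, var_name, and value_name.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2],
    "height": [1.70, 1.82],
    "weight": [65.0, 80.5],
})

# Defaults: every non-id column is melted, and the new columns are
# called "variable" and "value".
default_long = pd.melt(df, id_vars="id")
print(default_long.columns.tolist())

# Overrides: melt only "height" and give the new columns custom names.
named_long = pd.melt(
    df,
    id_vars="id",
    value_vars=["height"],
    var_name="measure",
    value_name="reading",
)
print(named_long.columns.tolist())
```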

Practical Examples of Melting DataFrames

Converting Wide Sales Data to Long Format

The most common use case involves unpivoting yearly data columns while preserving country identifiers. This example demonstrates the basic syntax using id_vars and custom naming:

import pandas as pd

# Sample wide-format DataFrame
df = pd.DataFrame({
    "country": ["US", "CA", "MX"],
    "2019_sales": [100, 150, 130],
    "2020_sales": [110, 160, 140]
})

# Basic melt: keep 'country' as identifier, unpivot the sales columns
melted = pd.melt(df,
                 id_vars=["country"],
                 var_name="year",
                 value_name="sales")
print(melted)

Output:

  country      year  sales
0      US  2019_sales    100
1      CA  2019_sales    150
2      MX  2019_sales    130
3      US  2020_sales    110
4      CA  2020_sales    160
5      MX  2020_sales    140

Selecting Specific Columns to Melt

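When a DataFrame contains extra metadata columns that should not be unpivoted, list the target measurements explicitly in value_vars. Columns named in neither id_vars nor value_vars are dropped from the result. The sample df has no such extra columns, so the call below returns the same rows as before, with the value column renamed to revenue:

# Melt only the listed columns and use custom variable/value column names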
melted_custom = pd.melt(df,
                        id_vars="country",
                        value_vars=["2019_sales", "2020_sales"],
                        var_name="year",
                        value_name="revenue")
print(melted_custom)

Output:

  country      year  revenue
0      US  2019_sales      100
1      CA  2019_sales      150
2      MX  2019_sales      130
3      US  2020_sales      110
4      CA  2020_sales      160
5      MX  2020_sales      140
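The effect of value_vars is easier to see when an extra column is actually present. The sketch below adds a hypothetical currency column (not part of the original sample data) and compares melting with and without value_vars.

```python
import pandas as pd

df_meta = pd.DataFrame({
    "country": ["US", "CA", "MX"],
    "currency": ["USD", "CAD", "MXN"],  # hypothetical metadata column
    "2019_sales": [100, 150, 130],
    "2020_sales": [110, 160, 140],
})

# With value_vars, only the sales columns are unpivoted and
# "currency" is left out of the result.
sales_only = pd.melt(
    df_meta,
    id_vars="country",
    value_vars=["2019_sales", "2020_sales"],
    var_name="year",
    value_name="revenue",
)
print(sales_only.columns.tolist())

# Without value_vars, "currency" is melted too, so currency codes
# and sales figures end up mixed in the same value column.
everything = pd.melt(df_meta, id_vars="country", var_name="year", value_name="revenue")
print(everything["year"].unique().tolist())
```

If the metadata should travel with each row instead of being dropped, it can be added to id_vars alongside country.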

Handling DataFrame Indices During Melt

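When your data uses meaningful row indices, control whether they persist in the result using ignore_index. Setting this to True (the default) creates a fresh RangeIndex, while False preserves the original index values, repeated across the melted rows. In the example below the country labels live in the index, so with ignore_index=True they do not appear in the output at all:

# Move 'country' into the index on a copy, then melt with ignore_index=True,
# which discards that index and gives the result a fresh RangeIndex
indexed = df.set_index("country")
melted_no_index = pd.melt(indexed,
                          ignore_index=True,
                          var_name="year",
                          value_name="sales")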
print(melted_no_index)

Output:

        year  sales
0  2019_sales    100
1  2019_sales    150
2  2019_sales    130
3  2020_sales    110
4  2020_sales    160
5  2020_sales    140
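For contrast, here is a sketch of the ignore_index=False case, which the example above does not show. It builds the same sales data with country as the index and keeps those labels through the melt; the variable names kept and tidy are chosen for this example.

```python
import pandas as pd

df = pd.DataFrame(
    {"2019_sales": [100, 150, 130], "2020_sales": [110, 160, 140]},
    index=pd.Index(["US", "CA", "MX"], name="country"),
)

# ignore_index=False keeps the original labels, repeated once
# for each melted column.
kept = df.melt(ignore_index=False, var_name="year", value_name="sales")
print(kept)

# reset_index turns the preserved labels back into a regular column.
tidy = kept.reset_index()
print(tidy)
```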

Memory Management and Type Preservation

As implemented in pandas/core/reshape/melt.py, melt builds and returns a new DataFrame, so modifying the melted result does not change the original. Because every melted column lands in a single value column, that column takes a dtype able to hold all of them: columns that share a dtype keep it, integer and float columns combine to float64, and a mix of numeric and string columns falls back to object. Identifier columns keep their original dtypes. If numeric precision or categorical types matter, melt columns of the same dtype together or cast the value column afterwards.
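A quick way to check this behavior on your own data is to inspect the dtype of the value column after melting. The sketch below uses a made-up table (the column names visits, ratio, and label are assumptions for this example) and also confirms that editing the melted result leaves the source untouched.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2],
    "visits": [10, 20],     # integer column
    "ratio": [0.5, 0.25],   # float column
    "label": ["a", "b"],    # string column
})

# Integer and float columns share the value column as floats.
numeric = df.melt(id_vars="id", value_vars=["visits", "ratio"])
print(numeric["value"].dtype)

# Numeric and string columns can only share a generic object column.
mixed = df.melt(id_vars="id", value_vars=["visits", "label"])
print(mixed["value"].dtype)

# Editing the melted result does not write back to the source frame.
numeric.loc[0, "value"] = -1.0
print(df["visits"].tolist())
```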

Summary

  • pandas.melt transforms wide-format DataFrames into long (tidy) format by unpivoting columns into rows, implemented in pandas/core/reshape/melt.py.
  • Use id_vars to specify columns that should remain unchanged, and value_vars to select specific columns for melting.
  • Customize output column names using var_name and value_name to create descriptive schemas for your dataset.
  • Control index behavior with ignore_index, which defaults to True for creating clean sequential indices.
  • melt returns a new DataFrame and leaves the original unmodified; the value column uses a dtype common to all melted columns, so mixing column types can upcast the values.

Frequently Asked Questions

What happens if I don't specify id_vars when melting a DataFrame?

If you omit id_vars (and do not pass value_vars), pandas melt treats all columns as measured variables and unpivots the entire DataFrame into two columns: one for the former column names and one for the values. The result has no identifier columns to tie related measurements together, so it is rarely useful for analysis unless you simply want every value stacked into one column.
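A minimal sketch of this case, using a small two-column frame invented for the example:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# No id_vars and no value_vars: every column is unpivoted.
stacked = pd.melt(df)
print(stacked)
```

The result has only the variable and value columns, with the values of a listed first and the values of b after them.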

How do I rename the variable and value columns when using pandas melt?

Pass string values to the var_name and value_name parameters to customize the default column headers. For example, var_name="year" and value_name="revenue" transform the generic "variable" and "value" columns into domain-specific labels that improve code readability and visualization compatibility.

Does pandas melt preserve the original DataFrame's index?

By default, pandas.melt resets the index in the result because ignore_index=True, creating a new RangeIndex from 0 to N-1. If you set ignore_index=False, the function preserves the original index values, repeating them once for each melted column.

What is the difference between pd.melt() and DataFrame.melt()?

pd.melt() is the module-level function that accepts the DataFrame as its first argument, while DataFrame.melt() is an instance method available on DataFrame objects defined in pandas/core/frame.py. Both invoke the same underlying implementation in pandas/core/reshape/melt.py, but the method version enables fluent chaining with other DataFrame operations.
