How to Use pandas read_excel to Load Multiple Specific Worksheets from an Excel File
Pass a list of sheet names or indices to the sheet_name parameter in pandas.read_excel to load only specific worksheets, which returns a dictionary mapping each sheet identifier to its corresponding DataFrame.
When working with large Excel workbooks, you often need only a subset of the available sheets. In the pandas-dev/pandas repository, the read_excel implementation in pandas/io/excel/_base.py controls sheet selection through the sheet_name argument, so you can target several worksheets without parsing every sheet in the workbook into a DataFrame.
How the sheet_name Parameter Handles Multiple Sheets
Implementation Details in pandas/io/excel/_base.py
According to the source code (lines 91-128 of pandas/io/excel/_base.py), the sheet_name argument accepts four distinct input types:
- String: Load a single worksheet by exact tab name.
- Integer: Load a single worksheet by zero-based index position.
- List-like: Load multiple specific worksheets by passing a list of names or indices.
- None: Load all worksheets. Note that the default value is 0, which loads only the first sheet.
When sheet_name receives a list-like input, the function iterates over the requested identifiers, reads each sheet using the appropriate engine (such as openpyxl via pandas/io/excel/_openpyxl.py or xlrd via pandas/io/excel/_xlrd.py), and constructs a dictionary whose keys match your input identifiers and whose values are DataFrame objects.
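The loop described above can be modeled in a few lines of plain Python. This is a simplified sketch, not the pandas source: read_selected, its parameters, and the stand-in parse_sheet callback are hypothetical names introduced only for illustration, and the real reader handles more cases (engines, options, error paths).

```python
from typing import Callable, Dict, List, Union

SheetId = Union[int, str]


def read_selected(
    sheet_names: List[str],
    requested: List[SheetId],
    parse_sheet: Callable[[str], object],
) -> Dict[SheetId, object]:
    """Toy model of list-based sheet selection (hypothetical, not pandas code)."""
    output: Dict[SheetId, object] = {}
    for ident in requested:
        if ident in output:
            continue  # assumption: a repeated identifier is read only once
        if isinstance(ident, int):
            name = sheet_names[ident]  # zero-based position -> tab name
        else:
            if ident not in sheet_names:
                raise ValueError(f"Worksheet named '{ident}' not found")
            name = ident
        # The key is the identifier exactly as the caller passed it.
        output[ident] = parse_sheet(name)
    return output


workbook_tabs = ["Sales", "Inventory", "Employees"]
result = read_selected(
    workbook_tabs,
    ["Sales", 2],
    lambda name: f"<DataFrame for {name}>",  # stand-in for real parsing
)
print(result)
# {'Sales': '<DataFrame for Sales>', 2: '<DataFrame for Employees>'}
```

The point of the model is the key handling: an integer request stays an integer key even though it resolves to a named tab, which is why dfs[2] and dfs["Employees"] are not interchangeable in the returned dictionary.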
Reading Multiple Sheets by Name
To load specific worksheets using their Excel tab names, supply a Python list of strings:
import pandas as pd
excel_path = "data/company_report.xlsx"
selected_sheets = ["Sales", "Employees"]
dfs = pd.read_excel(excel_path, sheet_name=selected_sheets)
print(type(dfs)) # <class 'dict'>
sales_df = dfs["Sales"]
employees_df = dfs["Employees"]
The dfs variable contains exactly two keys, "Sales" and "Employees", with other worksheets like "Inventory" or "Summary" remaining unread.
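If a requested tab name might not exist in a given file, one option is to inspect the workbook with pd.ExcelFile first and request only the names that are present. The following is a minimal sketch under stated assumptions: the workbook is a throwaway file built in a temporary directory, the sheet contents are invented, "Forecast" is a deliberately missing name, and an Excel engine such as openpyxl is assumed to be installed.

```python
import os
import tempfile

import pandas as pd

# Build a small throwaway workbook so the example is self-contained.
tmp_dir = tempfile.mkdtemp()
excel_path = os.path.join(tmp_dir, "company_report.xlsx")
with pd.ExcelWriter(excel_path) as writer:
    pd.DataFrame({"Revenue": [100, 250]}).to_excel(writer, sheet_name="Sales", index=False)
    pd.DataFrame({"Name": ["Ada", "Lin"]}).to_excel(writer, sheet_name="Employees", index=False)
    pd.DataFrame({"Item": ["Bolt"]}).to_excel(writer, sheet_name="Inventory", index=False)

wanted = ["Sales", "Employees", "Forecast"]  # "Forecast" does not exist

# ExcelFile exposes the tab names without converting any sheet to a DataFrame.
with pd.ExcelFile(excel_path) as xls:
    available = xls.sheet_names
    present = [name for name in wanted if name in available]
    missing = [name for name in wanted if name not in available]
    dfs = pd.read_excel(xls, sheet_name=present)

print("missing:", missing)
print("loaded:", list(dfs))
```

Passing the open ExcelFile object to read_excel avoids opening the file a second time; whether to skip missing names or raise an error is a design choice for your own pipeline.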
Reading Multiple Sheets by Index
You can also target worksheets using their zero-based integer positions:
selected_indices = [0, 2] # First and third sheets
dfs = pd.read_excel(excel_path, sheet_name=selected_indices)
first_sheet = dfs[0]
third_sheet = dfs[2]
This method is particularly effective when sheet names contain special characters, vary across files, or follow a consistent positional structure.
Applying Parameters Across Selected Sheets
As implemented in the base reader, parameters such as usecols, skiprows, dtype, and parse_dates are applied uniformly to every sheet in your selection:
dfs_advanced = pd.read_excel(
    excel_path,
    sheet_name=["Sales", "Inventory"],
    parse_dates=["Date"],
    index_col="ItemID",
    usecols=["ItemID", "Date", "Revenue"],
)
sales = dfs_advanced["Sales"]
inventory = dfs_advanced["Inventory"]
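Each DataFrame in the returned dictionary contains only the specified columns, with Date parsed and ItemID set as the index, demonstrating that filtering and type conversion occur per sheet before dictionary assembly completes. Because the same arguments are used for every sheet, each selected sheet must contain the columns named in usecols; a sheet that lacks one of them causes the call to raise an error.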
Summary
- List input: Pass a list to sheet_name in pandas.read_excel to read multiple specific worksheets rather than the entire workbook.
- Dictionary output: The function returns a dictionary mapping sheet identifiers (strings or integers) to DataFrame objects.
- Engine agnostic: The selection logic resides in pandas/io/excel/_base.py and functions across backends including pandas/io/excel/_openpyxl.py (openpyxl) and pandas/io/excel/_xlrd.py (xlrd).
- Uniform parameters: Arguments like usecols, skiprows, and dtype apply to every sheet requested in the list.
Frequently Asked Questions
What is the return type when reading multiple sheets with pandas read_excel?
When sheet_name receives a list of identifiers, pandas.read_excel returns a dictionary where keys are the sheet names or integers (matching your input) and values are DataFrame objects. If you pass a single string or integer, the function returns a single DataFrame instead of a dictionary.
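The difference in return type is easy to check directly. The sketch below assumes a throwaway workbook with invented contents and an installed Excel engine such as openpyxl; it shows that even a one-element list produces a dictionary.

```python
import os
import tempfile

import pandas as pd

# Throwaway workbook with two tabs (contents are made up for the example).
excel_path = os.path.join(tempfile.mkdtemp(), "report.xlsx")
with pd.ExcelWriter(excel_path) as writer:
    pd.DataFrame({"Revenue": [100, 250]}).to_excel(writer, sheet_name="Sales", index=False)
    pd.DataFrame({"Name": ["Ada", "Lin"]}).to_excel(writer, sheet_name="Employees", index=False)

single = pd.read_excel(excel_path, sheet_name="Sales")           # one DataFrame
one_item_list = pd.read_excel(excel_path, sheet_name=["Sales"])  # dict with one key

print(type(single).__name__)         # DataFrame
print(type(one_item_list).__name__)  # dict
```

Wrapping a single name in a list is a simple way to keep downstream code uniform when the number of requested sheets varies.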
Can I mix sheet names and indices in the same list for sheet_name?
Yes. The sheet_name parameter accepts a list that mixes strings and integers; the pandas documentation gives [0, 1, "Sheet5"] as an example. The implementation in pandas/io/excel/_base.py processes each element independently, and each key in the returned dictionary matches the identifier you passed. Using consistent types (all strings or all integers) still makes the dictionary keys easier to predict and the code easier to maintain.
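A short check of the resulting keys for a mixed list, again using a throwaway workbook with invented contents and assuming an Excel engine such as openpyxl is installed:

```python
import os
import tempfile

import pandas as pd

# Throwaway workbook: tab 0 is "Sales", tab 1 is "Employees".
excel_path = os.path.join(tempfile.mkdtemp(), "report.xlsx")
with pd.ExcelWriter(excel_path) as writer:
    pd.DataFrame({"Revenue": [100, 250]}).to_excel(writer, sheet_name="Sales", index=False)
    pd.DataFrame({"Name": ["Ada", "Lin"]}).to_excel(writer, sheet_name="Employees", index=False)

# One positional identifier and one tab name in the same request.
dfs = pd.read_excel(excel_path, sheet_name=[0, "Employees"])

print(list(dfs))  # [0, 'Employees']
first_sheet = dfs[0]           # the "Sales" tab, keyed by the index that was passed
employees = dfs["Employees"]   # keyed by name
```

Note that the first sheet is stored under the key 0, not "Sales", so lookups must use the same identifier that appeared in the request.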
Do parameters like usecols and skiprows apply to all selected sheets?
Yes. Parameters such as usecols, skiprows, dtype, and parse_dates are applied to every sheet that is read. You cannot specify different column subsets for different sheets within a single call; instead, filter the resulting DataFrames afterward or read sheets in separate calls.
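One way to read sheets in separate calls without reopening the file each time is to open it once with pd.ExcelFile and call parse per sheet. This is a sketch under stated assumptions: the workbook, its column names, and the per_sheet_cols mapping are invented for illustration, and an Excel engine such as openpyxl is assumed to be installed.

```python
import os
import tempfile

import pandas as pd

# Throwaway workbook whose two tabs have different columns.
excel_path = os.path.join(tempfile.mkdtemp(), "report.xlsx")
with pd.ExcelWriter(excel_path) as writer:
    pd.DataFrame(
        {"ItemID": [1, 2], "Region": ["N", "S"], "Revenue": [100, 250]}
    ).to_excel(writer, sheet_name="Sales", index=False)
    pd.DataFrame(
        {"ItemID": [1, 2], "Stock": [7, 0], "Location": ["A1", "B4"]}
    ).to_excel(writer, sheet_name="Inventory", index=False)

# Hypothetical mapping: which columns to keep from which tab.
per_sheet_cols = {
    "Sales": ["ItemID", "Revenue"],
    "Inventory": ["ItemID", "Stock"],
}

# Open the workbook once, then parse each tab with its own options.
with pd.ExcelFile(excel_path) as xls:
    dfs = {
        name: xls.parse(name, usecols=cols)
        for name, cols in per_sheet_cols.items()
    }

print(list(dfs["Sales"].columns))
print(list(dfs["Inventory"].columns))
```

The result has the same dictionary shape as a list-based read_excel call, so code that consumes it does not need to change.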
What is the difference between passing a list and passing None to sheet_name?
Passing sheet_name=None loads all worksheets in the workbook into a dictionary, while passing a specific list limits the result to only those sheets, reducing memory usage and I/O overhead. Use None when you need the complete workbook, and use a list when you require only a targeted subset.