How to Read Excel Files by Pandas read_excel

Reading Excel files in Pandas – A Python library

Pandas is a Python library that is used:

  • For data science
  • Data analysis
  • Analyzing data
  • Cleaning data
  • Exploring and manipulating data

If we are talking about data, how come Excel is not covered?

Pandas has a method read_excel() that enables us to read the Excel files (xlsxlsxxlsmxlsbodfods and odt file extensions).

The Pandas read_excel() has plenty of parameters that you may pass to fetch the data as per need.

You may load:

  • Whole sheet data
  • Multiple sheets data
  • Merge data
  • Exclude a certain number of rows from the sheet(s)
  • Fetch only specific top rows
  • Return only a few bottom row
  • And many more

In this tutorial, we will show you examples of loading and reading Excel files with a few options (one by one), so keep reading.

An example of loading and displaying Excel sheet

For our examples, we will use the following Excel sheet with the .xlsx extension.

Pandas_sample-file

It has two sheets namely:

  • Product Information
  • Sales

The first example reads data from the “Product Information” sheet:

Note: If you have not installed the pandas package yet, just write this line in the CMD (or command line)

$ pip install pandas

Python code to read Excel file:

Output:

Two things to notice in the above output:

  • By default, read_excel only loaded the first sheet’s data.
  • It also added row numbers (0,1,2…) which is index column

So, we will address both in the examples below.

How to omit index column from the data frame

The read_excel has a parameter index_col that you may use to omit the first column that contains the row number.

Its default value is None i.e. index_col=None

By using index_col=0 we get the output as follows:

Code:

Output:

Pandas _read_excel

Using the values example

Alternatively, you may get the Sheet rows without index columns as follows:

Code:

Output:

Pandas read_values

Reading Excel file by sheet names – sheet_name parameter

The following example specifies the sheet in our Excel Workbook. We will fetch records of the “Sales” sheet by number.

For that, sheet_name parameter is used where you may specify the sheet number from 0. In our case,

  • 0 = Product Information
  • 1 = Sales

The program below loads data from Sheet 1 i.e. Sales.

Python code:

Output:

read_excel-Sheet-name

Loading data by sheet name rather than number

Similarly, you may specify the sheet name rather number of sheet in the Workbook.

The example below loads data of “Product Information” by using sheet_name and omitting the index column:

Result:

read_excel-sheet-name-

Loading two (multiple) sheets example

Result:

read_excel-multipe-she

Display only column headers

Though this will be deprecated in the next version, let us show you how to get headers only from the sheet:

Output:

Get only specific columns example

Output:

read_excel-columns