How To Read XLSB In Python

In this tutorial, we will learn how to read XLSB (Excel Binary Workbook) files in Python using the pyxlsb library.

XLSB files are typically smaller and faster to read and write than standard XLSX files, but fewer third-party libraries support them. pyxlsb is a library for reading the XLSB format in Python. Let’s dive into the steps.

Step 1: Install the required library

First, install the pyxlsb library. You can install it with pip by running the following command:

pip install pyxlsb

Step 2: Read the xlsb file in Python

Once the library is installed, you can read an xlsb file by opening it with the pyxlsb package. To read a file, follow these steps:

  1. Import the pyxlsb package.
  2. Open the xlsb file with ‘open_workbook()’ inside a ‘with’ statement.
  3. Fetch and print the sheet names using the workbook’s ‘sheets’ attribute.

Here’s an example:
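The three steps above can be sketched as follows. This is a minimal sketch, assuming pyxlsb exposes the sheet names through the workbook's `sheets` attribute; the helper name `list_sheet_names` and the file name are illustrative, not part of pyxlsb. The import sits inside the function only so the snippet can be pasted into a module before the dependency is installed.

```python
def list_sheet_names(path):
    """Return the names of all sheets in the .xlsb workbook at `path`."""
    # Step 1: import the pyxlsb package (done lazily, inside the function).
    from pyxlsb import open_workbook

    # Step 2: open the workbook; the `with` statement closes it on exit.
    with open_workbook(path) as wb:
        # Step 3: `wb.sheets` holds the sheet names in workbook order.
        return list(wb.sheets)
```

Calling `print(list_sheet_names("example.xlsb"))` would then print the list of sheet names for that file.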

Step 3: Read data from a specific sheet

To read data from a specific sheet, you can follow these steps:

  1. Get the sheet by its name or index using the ‘get_sheet()’ function.
  2. Iterate through the rows using the sheet’s ‘rows()’ method.
  3. Access each cell’s value through its ‘v’ attribute (each row is a list of cells with ‘r’, ‘c’ and ‘v’ fields for row, column and value).

Here’s an example:
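A minimal sketch of these steps, assuming each row yielded by `rows()` is a list of cell objects whose value is stored in the `v` attribute. The helper name `read_sheet_rows` is hypothetical, and ‘Sheet1’ and ‘example.xlsb’ are placeholder names.

```python
def read_sheet_rows(path, sheet="Sheet1"):
    """Return the rows of one sheet as a list of lists of cell values."""
    from pyxlsb import open_workbook

    rows = []
    with open_workbook(path) as wb:
        # `sheet` may be a sheet name or a 1-based index.
        with wb.get_sheet(sheet) as ws:
            for row in ws.rows():
                # Keep only the value of each cell, dropping row/column info.
                rows.append([cell.v for cell in row])
    return rows
```

For example, `for values in read_sheet_rows("example.xlsb", "Sheet1"): print(values)` prints one list of values per row.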

You can replace ‘Sheet1’ with your sheet name or use the index number (e.g., 1 for the first sheet, 2 for the second sheet, etc.) to access the sheet.

Step 4: Store the data in a DataFrame using pandas

Sometimes, it’s easier to manipulate the data if it’s stored in a pandas DataFrame. To store the data in a DataFrame, follow these steps:

  1. Install and import the pandas package.
  2. Create a list to store all rows from the sheet.
  3. Convert the list of rows into a pandas DataFrame.

Here’s the example code:
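The steps above might look like the following sketch. It assumes pandas is installed alongside pyxlsb; the helper name `sheet_to_dataframe` is hypothetical, and the sheet and file names are placeholders.

```python
def sheet_to_dataframe(path, sheet="Sheet1"):
    """Load one sheet of an .xlsb workbook into a pandas DataFrame."""
    import pandas as pd
    from pyxlsb import open_workbook

    # Collect every row of the sheet as a plain list of cell values.
    data = []
    with open_workbook(path) as wb:
        with wb.get_sheet(sheet) as ws:
            for row in ws.rows():
                data.append([cell.v for cell in row])

    # No header handling here: columns are labelled 0, 1, 2, ...
    return pd.DataFrame(data)
```

Calling `print(sheet_to_dataframe("example.xlsb"))` prints the resulting table. As an alternative, pandas 1.0 and later can read the file directly with `pd.read_excel(path, engine="pyxlsb")`, which also requires pyxlsb to be installed.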

Here’s an example of the content of the ‘example.xlsb’ file (four data rows, no header row):

1  John   Doe      30
2  Jane   Doe      28
3  Alice  Smith    25
4  Bob    Johnson  22

The output of the code will be:

   0      1        2   3
0  1   John      Doe  30
1  2   Jane      Doe  28
2  3  Alice    Smith  25
3  4    Bob  Johnson  22

Full Code
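The steps from this tutorial can be combined into one sketch along these lines. It assumes the pyxlsb calls used earlier (`open_workbook`, the `sheets` attribute, `get_sheet`, `rows()`, and the cell attribute `v`) and that pandas is installed; the function name `main` and its default arguments are illustrative only.

```python
def main(path="example.xlsb", sheet="Sheet1"):
    """Print the sheet names and the contents of one sheet as a DataFrame."""
    import pandas as pd
    from pyxlsb import open_workbook

    with open_workbook(path) as wb:
        # Step 2: list the sheets available in the workbook.
        print(wb.sheets)

        # Step 3: read every row of the chosen sheet as a list of values.
        with wb.get_sheet(sheet) as ws:
            data = [[cell.v for cell in row] for row in ws.rows()]

    # Step 4: build a DataFrame from the collected rows and show it.
    df = pd.DataFrame(data)
    print(df)
    return df
```

Calling `main()` runs all the steps against ‘example.xlsb’; pass a different path or sheet name as arguments to read another file.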

Remember to replace ‘example.xlsb’ in the code with the path to your xlsb file.

Conclusion

In this tutorial, we learned how to read xlsb files in Python using the pyxlsb library. We also saw how to access a specific sheet in the workbook, iterate over its rows, and store the data in a pandas DataFrame for further processing.