In this tutorial, we will learn how to convert Excel files from .xls format to .xlsx format using Python.
This can be useful when you have older Excel files that need to be updated or when you need to work with files with different formats in a consistent manner. We will use the openpyxl and xlrd libraries to achieve this conversion.
Step 1: Install Required Libraries
Firstly, let’s install the required libraries for this conversion. You can install openpyxl and xlrd using pip by running the following commands:
1 2 |
pip install openpyxl pip install xlrd |
Make sure you have these libraries successfully installed before moving forward.
Step 2: Import Required Libraries
Now, let’s import the required libraries into our Python script:
1 2 3 |
import os from openpyxl import Workbook import xlrd |
Step 3: Open the XLS File
We will now open the .xls file using the xlrd library. First, let’s specify the path to the .xls file:
1 |
xls_file = 'sample.xls' |
Next, we will open the .xls file using the xlrd.open_workbook
function:
1 |
xls_workbook = xlrd.open_workbook(xls_file) |
Now that we have opened the .xls file, we can proceed to create a new .xlsx file that will contain the data from the .xls file.
Step 4: Convert XLS to XLSX
To convert the data from the .xls file to the .xlsx file, we’ll create a new openpyxl Workbook object and copy the contents of the .xls file into it:
1 2 3 4 5 6 7 8 9 10 11 12 |
xlsx_workbook = Workbook() # Iterate through sheet names for sheet_name in xls_workbook.sheet_names(): xls_sheet = xls_workbook.sheet_by_name(sheet_name) xlsx_sheet = xlsx_workbook.active # Iterate through rows and columns of the xls sheet for row in range(xls_sheet.nrows): for col in range(xls_sheet.ncols): cell_value = xls_sheet.cell_value(row, col) xlsx_sheet.cell(row=row + 1, column=col + 1, value=cell_value) |
Here, we are iterating through all sheets in the .xls file, and for each sheet, we are iterating through all rows and columns to copy cell values from the .xls file to the .xlsx file.
Step 5: Save the XLSX File
Now that all the data from the .xls file has been copied to the .xlsx file, we can save the .xlsx file with the same name but with the .xlsx extension:
1 2 |
xlsx_file = os.path.splitext(xls_file)[0] + '.xlsx' xlsx_workbook.save(xlsx_file) |
Your .xls file should now be successfully converted into .xlsx format.
Full Code:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 |
import os from openpyxl import Workbook import xlrd xls_file = 'sample.xls' xls_workbook = xlrd.open_workbook(xls_file) xlsx_workbook = Workbook() for sheet_name in xls_workbook.sheet_names(): xls_sheet = xls_workbook.sheet_by_name(sheet_name) xlsx_sheet = xlsx_workbook.active for row in range(xls_sheet.nrows): for col in range(xls_sheet.ncols): cell_value = xls_sheet.cell_value(row, col) xlsx_sheet.cell(row=row + 1, column=col + 1, value=cell_value) xlsx_file = os.path.splitext(xls_file)[0] + '.xlsx' xlsx_workbook.save(xlsx_file) |
Conclusion
In this tutorial, we learned how to convert Excel files from .xls to .xlsx format using the openpyxl and xlrd libraries in Python. This method can be helpful when working with Excel files of different formats and updating older files to a more recent format. Happy converting!