How To Read CSV Files From Multiple Folders In Python

When working with data in Python, a common task is reading data from multiple CSV (Comma-Separated Values) files spread across several folders.

Reading and analyzing data from multiple CSV files becomes essential when you are working on projects that involve data analysis, data science, or machine learning.

This tutorial walks step by step through reading CSV files from multiple folders in Python using the pandas library and the os module.

Step 1: Install the pandas library

Before we start reading CSV files, let’s make sure that we have the necessary pandas library installed on our system. You can install pandas using pip with the following command:
pip install pandas

Step 2: Import the necessary libraries

In this step, we will import the necessary Python libraries:
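
The two imports this tutorial relies on are pandas and the standard-library os module. The alias pd is the usual convention rather than a requirement:

```python
import os            # standard library: listing folders and building file paths
import pandas as pd  # third-party: reading CSV files into DataFrames
```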

Step 3: Identify the folders containing the CSV files

Now we need to find the folders where the CSV files are located. For this tutorial, let’s assume that we have a folder named data and inside that folder, we have two folders named folder1 and folder2, and each folder contains multiple CSV files.

Example folder structure:
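
The layout described above can be recreated for experimenting. The following is a minimal sketch that builds it inside a temporary directory; the file names file1.csv and file2.csv, the column names, and the helper make_example_layout are hypothetical and only serve as sample data:

```python
import os
import tempfile


def make_example_layout(root):
    """Create data/folder1 and data/folder2 under root, each holding two small CSV files."""
    data_dir = os.path.join(root, "data")
    for folder in ("folder1", "folder2"):
        folder_path = os.path.join(data_dir, folder)
        os.makedirs(folder_path, exist_ok=True)
        for name in ("file1.csv", "file2.csv"):  # hypothetical file names
            with open(os.path.join(folder_path, name), "w", newline="") as f:
                f.write("id,value\n1,10\n2,20\n")  # hypothetical sample rows
    return data_dir


data_dir = make_example_layout(tempfile.mkdtemp())

# Print every file path relative to the parent of the data folder.
for dirpath, dirnames, filenames in os.walk(data_dir):
    dirnames.sort()  # visit folder1 before folder2
    for name in sorted(filenames):
        print(os.path.relpath(os.path.join(dirpath, name), os.path.dirname(data_dir)))
```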

Step 4: Get a list of all folders

In this step, we will get a list of all folders in the data folder using the os.listdir() function.
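
One way to do this is sketched below. Note that os.listdir() returns every entry in a directory, files included, so the sketch keeps only the entries that are folders. The helper name list_folders is an assumption made for illustration:

```python
import os


def list_folders(data_dir):
    """Return the names of the sub-folders directly inside data_dir, sorted."""
    return sorted(
        name
        for name in os.listdir(data_dir)
        if os.path.isdir(os.path.join(data_dir, name))  # skip plain files
    )


# Only list the folders if a "data" folder exists next to the script.
if os.path.isdir("data"):
    folders = list_folders("data")
    print(folders)
```

Sorting is optional; it simply makes the order predictable, since os.listdir() returns entries in arbitrary order.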

Step 5: Iterate through the folders and read CSV files

In this step, we will create a function to read a single CSV file, then iterate through the list of folders, read all the CSV files in each folder, and combine them into a single pandas DataFrame.
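
A minimal sketch of this step follows, assuming every CSV file shares the same columns. The function names read_csv_file and read_all_csvs are illustrative. The frames are collected in a list and joined once with pd.concat(), because DataFrame.append() was removed in pandas 2.0:

```python
import os
import pandas as pd


def read_csv_file(file_path):
    """Read a single CSV file into a DataFrame."""
    return pd.read_csv(file_path)


def read_all_csvs(data_dir):
    """Read every .csv file in each sub-folder of data_dir into one DataFrame."""
    frames = []
    for folder in sorted(os.listdir(data_dir)):
        folder_path = os.path.join(data_dir, folder)
        if not os.path.isdir(folder_path):
            continue  # ignore files sitting directly in data_dir
        for file_name in sorted(os.listdir(folder_path)):
            if file_name.lower().endswith(".csv"):
                frames.append(read_csv_file(os.path.join(folder_path, file_name)))
    if not frames:
        return pd.DataFrame()  # pd.concat raises on an empty list
    return pd.concat(frames, ignore_index=True)


# Only read the data if a "data" folder exists next to the script.
if os.path.isdir("data"):
    all_data = read_all_csvs("data")
    print(all_data.head())
```

Passing ignore_index=True gives the combined DataFrame a fresh 0..n-1 index instead of repeating each file's own row numbers.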

With this, we have our final DataFrame, all_data, containing all the data from the CSV files located in multiple folders.

Full Code:
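
Putting the steps together, a complete script might look like the sketch below. It assumes the data folder sits next to the script and that all CSV files share the same columns; DATA_DIR and the function names are illustrative choices, not fixed requirements:

```python
import os
import pandas as pd

DATA_DIR = "data"  # assumed location of the top-level folder


def read_csv_file(file_path):
    """Read a single CSV file into a DataFrame."""
    return pd.read_csv(file_path)


def load_all_data(data_dir):
    """Combine the CSV files from every sub-folder of data_dir into one DataFrame."""
    # Step 4: list the sub-folders, skipping any plain files.
    folders = [
        name
        for name in sorted(os.listdir(data_dir))
        if os.path.isdir(os.path.join(data_dir, name))
    ]

    # Step 5: read each CSV file and collect the resulting DataFrames.
    frames = []
    for folder in folders:
        folder_path = os.path.join(data_dir, folder)
        for file_name in sorted(os.listdir(folder_path)):
            if file_name.lower().endswith(".csv"):
                frames.append(read_csv_file(os.path.join(folder_path, file_name)))

    # Join everything once at the end; return an empty frame if nothing was found.
    return pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()


if __name__ == "__main__":
    if os.path.isdir(DATA_DIR):
        all_data = load_all_data(DATA_DIR)
        print(all_data.shape)
    else:
        print(f"Folder not found: {DATA_DIR}")
```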

Conclusion

In this tutorial, we’ve learned how to read CSV files from multiple folders in Python using the pandas library and the os module. This method is helpful when your data is split across many files or spread across different folders in your project.