In most cases, CSV data files come with headers. However, there are scenarios where you may want to drop the header row, for example before analysis or before feeding the data into a machine learning model. With Python and the Pandas library, this takes only a few lines of code.
Before we begin
First, make sure you have Python installed on your machine. Secondly, make sure that you have installed the Pandas library as well. To install Pandas, use the pip install command in your command prompt:
pip install pandas
Sample CSV File
For this tutorial, we’ll use a sample data.csv file which looks like this:
Name, Age, Occupation
John, 23, Engineer
Alice, 24, Doctor
Smith, 30, Teacher
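If you do not have such a file at hand, the following minimal sketch creates one so you can follow along. It assumes you are happy to write data.csv into the current working directory; the rows are copied from the sample above.

```python
from pathlib import Path

# Sample rows copied from the tutorial, including the space after each comma.
sample = (
    "Name, Age, Occupation\n"
    "John, 23, Engineer\n"
    "Alice, 24, Doctor\n"
    "Smith, 30, Teacher\n"
)

# Assumption: writing to the current working directory is acceptable.
csv_path = Path("data.csv")
csv_path.write_text(sample, encoding="utf-8")

# Read the file back to confirm it holds one header line plus three data lines.
lines = csv_path.read_text(encoding="utf-8").splitlines()
print(len(lines))
```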
Step 1: Import the Pandas module
import pandas as pd
Step 2: Load CSV file into DataFrame
To load a CSV file into a DataFrame, use the pandas.read_csv() function. By default, the first row of the file is treated as the header and becomes the DataFrame's column labels.
df = pd.read_csv('data.csv')
print(df)
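The output is a DataFrame with the header row as column labels and a numeric index on the left. It looks roughly like this (exact column spacing depends on the whitespace in your file):
    Name  Age Occupation
0   John   23   Engineer
1  Alice   24     Doctor
2  Smith   30    Teacher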
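One detail worth checking is what the column labels actually contain. The sample file has a space after each comma, and by default read_csv() keeps that space as part of the label. The sketch below uses an in-memory copy of the sample rows (an assumption, so that it runs without the file) and the skipinitialspace parameter of read_csv() to strip those spaces.

```python
import io

import pandas as pd

# In-memory copy of the sample rows, with a space after each comma.
csv_text = (
    "Name, Age, Occupation\n"
    "John, 23, Engineer\n"
    "Alice, 24, Doctor\n"
    "Smith, 30, Teacher\n"
)

# Default parsing: the header row becomes the column labels, spaces included.
df = pd.read_csv(io.StringIO(csv_text))
print(list(df.columns))

# skipinitialspace=True drops the space that follows each delimiter.
df_clean = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)
print(list(df_clean.columns))
print(df_clean.shape)
```

With the cleaned labels, a lookup such as df_clean["Age"] works as expected, whereas the default parse would need df[" Age"].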
Step 3: Remove the headers
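If you want to remove the header, you can skip the first row (the header row) when loading the CSV file by setting the skiprows parameter to 1 in the read_csv() function. You also need to pass header=None. Otherwise pandas treats the next line (John, 23, Engineer) as the new header, and you lose a row of data.
df_without_header = pd.read_csv('data.csv', skiprows=1, header=None)
print(df_without_header)
The columns are now labeled with integers instead of names. The output of the above code looks roughly like this:
       0   1         2
0   John  23  Engineer
1  Alice  24    Doctor
2  Smith  30   Teacher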
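To see why header=None matters alongside skiprows, it can help to compare the two calls side by side. This is a minimal sketch on an in-memory copy of the sample rows; the variable names promoted and no_header are illustrative, and skipinitialspace=True is added only to strip the spaces after the commas.

```python
import io

import pandas as pd

csv_text = (
    "Name, Age, Occupation\n"
    "John, 23, Engineer\n"
    "Alice, 24, Doctor\n"
    "Smith, 30, Teacher\n"
)

# skiprows=1 alone: the header line is skipped, but the next line
# is then read as the header, so one data row is lost.
promoted = pd.read_csv(io.StringIO(csv_text), skiprows=1, skipinitialspace=True)

# skiprows=1 with header=None: no line is treated as a header,
# so all three data rows survive and columns get integer labels.
no_header = pd.read_csv(
    io.StringIO(csv_text), skiprows=1, header=None, skipinitialspace=True
)

print(list(promoted.columns), len(promoted))
print(list(no_header.columns), len(no_header))
```

The first call ends up with John's row as its column labels and only two data rows, while the second keeps all three rows.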
Full Code:
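import pandas as pd

# Load the file normally: the first row becomes the column labels.
df = pd.read_csv('data.csv')
print(df)

# Skip the header row and tell pandas not to promote the next row to a header.
df_without_header = pd.read_csv('data.csv', skiprows=1, header=None)
print(df_without_header)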
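If the goal is a header-free file rather than a header-free DataFrame, one possible approach is to write the data back out with DataFrame.to_csv() and header=False. The sketch below works on an in-memory copy of the sample rows and returns the CSV text as a string; passing a file path as the first argument to to_csv() would write it to disk instead.

```python
import io

import pandas as pd

csv_text = (
    "Name, Age, Occupation\n"
    "John, 23, Engineer\n"
    "Alice, 24, Doctor\n"
    "Smith, 30, Teacher\n"
)

df = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)

# header=False omits the column labels; index=False omits the row index.
# With no path given, to_csv() returns the CSV content as a string.
out = df.to_csv(header=False, index=False)
rows = out.splitlines()
print(rows)
```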
Conclusion
As you can see, removing headers from a CSV file using the pandas library in Python is relatively straightforward. It’s a matter of passing skiprows=1 together with header=None to the read_csv() function, so that the header row is skipped and no data row is mistaken for a header. The advantage of using Pandas is that it allows you to do a lot more with the data, such as cleaning, transforming, and analyzing it.