How to Merge Cells in a CSV File Using Python

Merging cells in a csv file can seem like a really complex task especially when dealing with large datasets. However, with Python and its powerful libraries, it can be a simple and straightforward process. This tutorial will guide you on how to merge cells in a csv file using Python.

Creating a file

For this example, we have to create a file called “file.csv”:

Column1,Column2,Column3
A,1,Apple
A,1,Banana
A,2,Orange
B,1,Grape
B,2,Apple
C,1,Pineapple
C,2,Watermelon

Installing Pandas

Before we start, ensure that you have Python installed on your machine. If not, you can download it from the official Python website. You will also need to install the ‘pandas’ library, which is a powerful tool for data manipulation in Python. You can install it by typing the following command in your terminal:

Importing Necessary Libraries

Once you have all the necessary tools installed, the first step is to import the pandas library.

Here is how:

Loading the CSV file

After importing pandas, you need to load the csv file that you plan to merge its cells. To load the csv file, we use pd.read_csv(‘file.csv’).

Merging the Cells

Now that we have loaded our CSV file into a pandas dataframe, we can now merge cells in the CSV file.

We do this using pandas groupby and agg functions. Below is an example:

In the example above, cells in the same row under ‘Column1’ and ‘Column2’ are merged together with a comma separating the values.

Saving the Resulting CSV

After merging the cells, it is now time to save our new transformed CSV file.

The Full Code

Output:

“Banana” and “Apple” are merged into a single row.

Column1,Column2,Column3
A,1,"Banana, Apple"
A,2,Orange
B,1,Grape
B,2,Apple
C,1,Pineapple
C,2,Watermelon