Compressing large CSV files can save you a lot of storage space, decreasing the amount of time it takes to transfer files.
Python provides us with many different libraries that allow us to compress our CSV files with ease. In this tutorial, we will walk through the process of compressing a CSV file using the Python programming language.
Steps:
1. Importing necessary libraries:
We will be using the gzip and csv module, so import these modules using the following code snippet:
1 2 |
import gzip import csv |
2. Reading the CSV file:
Read the CSV file using the csv module by passing the filename and mode as arguments to the open()
function. After this, read the file using the csv.reader()
function.
1 2 3 4 5 6 |
filename = 'example.csv' mode = 'r' with open(filename, mode) as file: csv_file = csv.reader(file) # rest of the code goes here |
3. Creating a compressed file object:
Now, we will create a compressed file object using the gzip module with the filename extension .gz. The first argument to the gzip.open()
function is the filename, and the second argument is the mode.
1 |
compressed_file = gzip.open('example.csv.gz', 'wb') |
4. Writing compressed data:
Using the csv module, we can write the contents of the CSV file to the compressed file using the csv.writer()
function.
1 2 |
writer = csv.writer(gzip.open('example.csv.gz', 'wb')) writer.writerows(csv_file) |
5. Closing the files:
Since we are dealing with files, it is a good practice to close them after we are done using them. In our case, we will close both the compressed file and the original CSV file.
1 2 |
file.close() compressed_file.close() |
Conclusion:
In this tutorial, we learned how to compress a CSV file in Python using the gzip and csv modules.
Combining these two modules make the compression of CSV file very easy. This will help us save a lot of storage space and minimize transfer times. Below is the final output of the compressed file.
example.csv.gz
Full Code:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
import gzip import csv filename = 'example.csv' mode = 'r' with open(filename, mode) as file: csv_file = csv.reader(file) compressed_file = gzip.open('example.csv.gz', 'wb') writer = csv.writer(compressed_file) writer.writerows(csv_file) file.close() compressed_file.close() |