How To Read a Tar File in Python

In this tutorial, we will learn how to read .tar files in Python. A tar file bundles many files and directories into a single archive, such as the ones created with the GNU tar program. The archive itself is not compressed, but it is often combined with gzip, bzip2, or xz compression (for example .tar.gz). Reading tar files in Python can be achieved using the tarfile module, which is included in Python's standard library.

Step 1: Importing the required module

To begin, import the tarfile module, which provides the functions and classes needed to read tar files. The following line of code imports the tarfile module:
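As a minimal sketch, the import is a single statement. Nothing needs to be installed because tarfile ships with Python; the print call is only an optional sanity check and is not part of the original tutorial.

```python
import tarfile

# Optional sanity check: the module exposes open() for working with archives.
print(callable(tarfile.open))
```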

Step 2: Opening the tar file

We will use the open() function from the tarfile module to open the tar file. The syntax for opening a tar file is as follows:
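A sketch of that call is shown here. The signature in the comment is the one documented for the standard library; the file name sample.tar, the temporary directory, and the step that first creates the archive are assumptions added only so the snippet runs on its own.

```python
import io
import os
import tarfile
import tempfile

# General form, as documented for the standard library:
#   tarfile.open(name=None, mode="r", fileobj=None, bufsize=10240, **kwargs)

# Build a tiny throwaway archive so the call below has something to open.
workdir = tempfile.mkdtemp()
archive_path = os.path.join(workdir, "sample.tar")  # hypothetical file name
with tarfile.open(archive_path, "w") as writer:
    data = b"hello\n"
    info = tarfile.TarInfo(name="sample.txt")
    info.size = len(data)
    writer.addfile(info, io.BytesIO(data))

# Open the archive for reading.
tar = tarfile.open(name=archive_path, mode="r")
is_open = not tar.closed
tar.close()
```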

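Here, name is the path of the tar file, and the mode r opens it for reading. The plain r mode detects gzip, bz2, or xz compression automatically, so it works for both compressed and uncompressed archives. To state the compression explicitly, use r:gz for gzip-compressed files, r:bz2 for bz2-compressed files, or r:xz for xz-compressed files. We can also read a tar file directly from a file-like object by passing it as fileobj. The following code shows how to open a tar file: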
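The sketch below illustrates the compressed and fileobj variants together. It assumes a small gzip-compressed archive built in memory as a stand-in for a real .tar.gz file; the member name sample.txt is hypothetical.

```python
import io
import tarfile

# Build a small gzip-compressed archive in memory (stand-in for a real file).
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w:gz") as writer:
    data = b"hello\n"
    info = tarfile.TarInfo(name="sample.txt")
    info.size = len(data)
    writer.addfile(info, io.BytesIO(data))

# Read it back through fileobj; "r:gz" states the compression explicitly.
buffer.seek(0)
tar = tarfile.open(fileobj=buffer, mode="r:gz")
names = tar.getnames()
tar.close()

# Plain "r" detects the compression automatically.
buffer.seek(0)
tar = tarfile.open(fileobj=buffer, mode="r")
names_auto = tar.getnames()
tar.close()
```

For an archive on disk, the same modes apply with a path in place of fileobj, for example tarfile.open("sample.tar.gz", "r:gz").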

Step 3: Listing the content of the tar file

We can list the contents of the tar file using the getnames() method. This method returns a list of member names as strings, covering both files and directories, in the order they appear in the archive. The following code demonstrates how to list the contents of the tar file:
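A minimal sketch of listing the members follows. The helper build_sample_archive and the archive layout (sample.txt, folder, folder/file1.txt) are assumptions made so the example is self-contained; with a real archive, only the last four lines are needed.

```python
import io
import os
import tarfile
import tempfile


def build_sample_archive(path):
    """Hypothetical helper: write a small archive so the example runs on its own."""
    entries = [("sample.txt", b"hello\n"), ("folder", None), ("folder/file1.txt", b"world\n")]
    with tarfile.open(path, "w") as writer:
        for name, data in entries:
            info = tarfile.TarInfo(name=name)
            if data is None:
                # No data means this entry is a directory.
                info.type = tarfile.DIRTYPE
                info.mode = 0o755
                writer.addfile(info)
            else:
                info.size = len(data)
                writer.addfile(info, io.BytesIO(data))


archive_path = os.path.join(tempfile.mkdtemp(), "sample.tar")
build_sample_archive(archive_path)

tar = tarfile.open(archive_path, "r")
names = tar.getnames()  # list of member names as strings
print(names)
tar.close()
```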

Step 4: Extracting files from the tar file

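To extract files from the tar file, we can use the extractall() method. This method extracts all members to the directory given by its path argument. If no directory is specified, the files are extracted to the current working directory. Only extract archives you trust: a malicious archive can contain member paths that point outside the output directory. Python 3.12 added a filter argument (for example filter="data") that rejects such members. The following code shows how to extract files from the tar file: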
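The following sketch extracts everything into a temporary directory. The archive contents and directory names are hypothetical, and the filter argument is passed only when the running Python supports it (checked through tarfile.data_filter), which is an addition not present in the original tutorial.

```python
import io
import os
import tarfile
import tempfile

# Build a small archive so the example runs on its own (hypothetical contents).
workdir = tempfile.mkdtemp()
archive_path = os.path.join(workdir, "sample.tar")
with tarfile.open(archive_path, "w") as writer:
    for name, data in [("sample.txt", b"hello\n"), ("folder/file1.txt", b"world\n")]:
        info = tarfile.TarInfo(name=name)
        info.size = len(data)
        writer.addfile(info, io.BytesIO(data))

output_dir = os.path.join(workdir, "output")

# Use the safer "data" filter when this Python version provides it.
extract_kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}

tar = tarfile.open(archive_path, "r")
tar.extractall(path=output_dir, **extract_kwargs)
tar.close()

extracted = sorted(
    os.path.relpath(os.path.join(root, f), output_dir).replace(os.sep, "/")
    for root, _dirs, files in os.walk(output_dir)
    for f in files
)
print(extracted)
```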

Alternatively, we can extract a specific file from the tar file using the extract() method. This method accepts either a member name or a TarInfo object, which we can obtain using the getmember() method. getmember() raises a KeyError if the archive has no member with that name. The following code demonstrates how to extract a specific file:
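A sketch of extracting a single member is given below, under the same assumptions as before: the archive, the member name folder/file1.txt, and the output directory are hypothetical, and the filter argument is used only if the running Python supports it.

```python
import io
import os
import tarfile
import tempfile

# Build a small archive so the example runs on its own (hypothetical contents).
workdir = tempfile.mkdtemp()
archive_path = os.path.join(workdir, "sample.tar")
with tarfile.open(archive_path, "w") as writer:
    for name, data in [("sample.txt", b"hello\n"), ("folder/file1.txt", b"world\n")]:
        info = tarfile.TarInfo(name=name)
        info.size = len(data)
        writer.addfile(info, io.BytesIO(data))

output_dir = os.path.join(workdir, "output")
extract_kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}

tar = tarfile.open(archive_path, "r")
member = tar.getmember("folder/file1.txt")  # raises KeyError if the name is missing
tar.extract(member, path=output_dir, **extract_kwargs)
tar.close()

target = os.path.join(output_dir, "folder", "file1.txt")
print(os.path.isfile(target))
```

Only the requested member is written; the parent directory folder is created automatically if it does not exist.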

Step 5: Closing the tar file

After we have finished reading or extracting files from the tar file, we should close it using the close() method. This ensures that resources are properly released. The following code shows how to close a tar file:
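The sketch below shows an explicit close() call and, as an alternative, a with statement, which closes the archive automatically when the block ends. The sample archive and its file name are assumptions made so the snippet is runnable.

```python
import io
import os
import tarfile
import tempfile

# Build a small archive so the example runs on its own (hypothetical contents).
archive_path = os.path.join(tempfile.mkdtemp(), "sample.tar")
with tarfile.open(archive_path, "w") as writer:
    data = b"hello\n"
    info = tarfile.TarInfo(name="sample.txt")
    info.size = len(data)
    writer.addfile(info, io.BytesIO(data))

# Explicit close.
tar = tarfile.open(archive_path, "r")
names = tar.getnames()
tar.close()
closed_after_close = tar.closed

# Alternative: the with statement closes the archive even if an error occurs.
with tarfile.open(archive_path, "r") as tar2:
    names_again = tar2.getnames()
closed_after_with = tar2.closed
```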

Full Code

Here is the complete code for reading and extracting files from a tar file:
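The original listing is not reproduced here, so what follows is a sketch that strings the five steps together. The helper build_sample_archive, the file name sample.tar, and the directory names are hypothetical; with an existing archive, the helper can be dropped and archive_path pointed at the real file.

```python
import io
import os
import tarfile
import tempfile


def build_sample_archive(path):
    """Hypothetical helper: write a small archive so the script runs on its own."""
    entries = [("sample.txt", b"hello\n"), ("folder", None), ("folder/file1.txt", b"world\n")]
    with tarfile.open(path, "w") as writer:
        for name, data in entries:
            info = tarfile.TarInfo(name=name)
            if data is None:
                # No data means this entry is a directory.
                info.type = tarfile.DIRTYPE
                info.mode = 0o755
                writer.addfile(info)
            else:
                info.size = len(data)
                writer.addfile(info, io.BytesIO(data))


workdir = tempfile.mkdtemp()
archive_path = os.path.join(workdir, "sample.tar")
build_sample_archive(archive_path)

all_dir = os.path.join(workdir, "output")
single_dir = os.path.join(workdir, "single")

# Use the safer "data" filter when this Python version provides it.
extract_kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}

# Step 2: open the tar file for reading.
tar = tarfile.open(archive_path, "r")

# Step 3: list the contents.
names = tar.getnames()
print(names)

# Step 4: extract everything, then one specific member.
tar.extractall(path=all_dir, **extract_kwargs)
member = tar.getmember("folder/file1.txt")
tar.extract(member, path=single_dir, **extract_kwargs)

# Step 5: close the tar file.
tar.close()
```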

Output

['sample.txt', 'folder', 'folder/file1.txt']

In this tutorial, we learned how to read tar files in Python using the tarfile module. We demonstrated how to open a tar file, list its contents, extract files, and close the tar file. Knowing how to work with tar files in Python can be a valuable skill for managing archived files and directories.