In this tutorial, we will learn how to read .tar files in Python. A tar file bundles many files and directories into a single archive, such as the ones created with the GNU tar program, and is often compressed with gzip, bzip2, or xz. Reading tar files in Python can be achieved using the tarfile module, which is part of the standard library.
Step 1: Importing the required module
To begin, import the tarfile module, which provides the functionality needed to read tar files. The following line of code imports the tarfile module:
    import tarfile
Step 2: Opening the tar file
We will use the open() function from the tarfile module to open the tar file. The syntax for opening a tar file is as follows:
    tarfile.open(name=None, mode='r', fileobj=None, bufsize=10240, **kwargs)
Here, name is the path of the tar file, and the mode r opens it for reading. The default mode r detects compression automatically, so it works for plain and compressed archives alike. To require a specific format, use r:gz for gzip-compressed files, r:bz2 for bzip2-compressed files, or r:xz for xz-compressed files. We can also read a tar file directly from a file-like object by passing it as fileobj. The following code shows how to open a tar file:
    tar = tarfile.open("example.tar")
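The modes described above can be compared on a small test archive. The following is a minimal sketch: the temporary directory, the archive name example.tar.gz, and the member sample.txt are made up for illustration, and the archive is created on the spot so that the example runs on its own.

```python
import io
import os
import tarfile
import tempfile

# Build a small gzip-compressed archive to experiment with (hypothetical names).
workdir = tempfile.mkdtemp()
archive_path = os.path.join(workdir, "example.tar.gz")
payload = b"hello from sample.txt\n"

with tarfile.open(archive_path, "w:gz") as tar:
    info = tarfile.TarInfo(name="sample.txt")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

# Check that the path really is a tar archive before opening it.
is_archive = tarfile.is_tarfile(archive_path)

# Mode "r" detects the compression automatically.
with tarfile.open(archive_path, "r") as tar:
    names_auto = tar.getnames()

# Mode "r:gz" insists on gzip and raises tarfile.ReadError for anything else.
with tarfile.open(archive_path, "r:gz") as tar:
    names_gz = tar.getnames()

# fileobj reads an archive from a binary stream that is already open.
with open(archive_path, "rb") as stream:
    with tarfile.open(fileobj=stream, mode="r") as tar:
        names_stream = tar.getnames()

print(is_archive, names_auto, names_gz, names_stream)
```

All three ways of opening the archive should report the same member list, which suggests that plain r is usually enough unless the format needs to be enforced.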
Step 3: Listing the content of the tar file
We can list the contents of the tar file using the getnames() method. This method returns a list with the names of all members of the archive, directories included. The following code demonstrates how to list the contents of the tar file:
    file_list = tar.getnames()
    print(file_list)
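When more than the names is needed, getmembers() returns TarInfo objects that also carry the size and type of each member. Here is a small sketch of that approach. It builds a throwaway archive in memory, and the member names folder and folder/file1.txt are assumptions chosen to match the example in this tutorial.

```python
import io
import tarfile

# Build a tiny archive in memory so the example is self-contained.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w") as tar:
    folder = tarfile.TarInfo(name="folder")
    folder.type = tarfile.DIRTYPE
    tar.addfile(folder)

    data = b"12345"
    info = tarfile.TarInfo(name="folder/file1.txt")
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))

# Rewind the buffer and read the archive back.
buffer.seek(0)
summary = []
with tarfile.open(fileobj=buffer, mode="r") as tar:
    for member in tar.getmembers():
        if member.isdir():
            kind = "dir"
        elif member.isfile():
            kind = "file"
        else:
            kind = "other"
        summary.append((member.name, kind, member.size))
        print(f"{member.name:20} {kind:5} {member.size} bytes")
```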
Step 4: Extracting files from the tar file
To extract files from the tar file, we can use the extractall() method. This method extracts all files to the specified output directory. If the directory is not specified, the files will be extracted to the current working directory. The following code shows how to extract files from the tar file:
    tar.extractall("./output")
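Archives from untrusted sources may contain members with absolute paths or .. components that land outside the output directory. Python 3.12, along with recent patch releases of some earlier versions, accepts a filter argument for extractall() that rejects such members. The sketch below uses filter="data" when the running interpreter supports it and falls back to the plain call otherwise. The archive and directory names are hypothetical.

```python
import io
import os
import tarfile
import tempfile

# Create a throwaway archive so the example runs on its own.
workdir = tempfile.mkdtemp()
archive_path = os.path.join(workdir, "example.tar")
output_dir = os.path.join(workdir, "output")

data = b"file contents\n"
with tarfile.open(archive_path, "w") as tar:
    info = tarfile.TarInfo(name="folder/file1.txt")
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))

with tarfile.open(archive_path) as tar:
    if hasattr(tarfile, "data_filter"):
        # Extraction filters are available: refuse unsafe members.
        tar.extractall(output_dir, filter="data")
    else:
        # Older interpreter: only do this with archives you trust.
        tar.extractall(output_dir)

extracted = os.path.join(output_dir, "folder", "file1.txt")
print(os.path.exists(extracted))
```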
Alternatively, we can extract a specific file from the tar file using the extract() method. This method accepts either a member name or a TarInfo object, which we can obtain using the getmember() method. Note that getmember() raises a KeyError if the archive has no member with that name. The following code demonstrates how to extract a specific file:
    file = tar.getmember("sample.txt")
    tar.extract(file, "./output")
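If only the contents of a member are needed, extractfile() can read them without writing anything to disk. The following sketch wraps that in a helper. Both the helper name read_member and the in-memory test archive are illustrative assumptions, not part of the tarfile module.

```python
import io
import tarfile

# Build a small in-memory archive with one text file (hypothetical content).
buffer = io.BytesIO()
text = b"first line\nsecond line\n"
with tarfile.open(fileobj=buffer, mode="w") as tar:
    info = tarfile.TarInfo(name="sample.txt")
    info.size = len(text)
    tar.addfile(info, io.BytesIO(text))
buffer.seek(0)


def read_member(tar, name):
    """Return the bytes of a regular file in the archive, or None if unavailable."""
    try:
        member = tar.getmember(name)
    except KeyError:
        # No member with this name exists in the archive.
        return None
    handle = tar.extractfile(member)
    if handle is None:
        # Directories and some special members carry no data.
        return None
    with handle:
        return handle.read()


with tarfile.open(fileobj=buffer, mode="r") as tar:
    found = read_member(tar, "sample.txt")
    missing = read_member(tar, "missing.txt")

print(found)
print(missing)
```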
Step 5: Closing the tar file
After we have finished reading or extracting files from the tar file, we should close it using the close() method. This ensures that resources are properly released. The following code shows how to close a tar file:
    tar.close()
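A TarFile object can also be used as a context manager, in which case it is closed automatically when the block ends, even if an exception is raised inside it. Below is a minimal sketch of that pattern, using a temporary archive created only for this example.

```python
import io
import os
import tarfile
import tempfile

# Create a throwaway archive to open (hypothetical names).
workdir = tempfile.mkdtemp()
archive_path = os.path.join(workdir, "example.tar")
content = b"data\n"
with tarfile.open(archive_path, "w") as tar:
    info = tarfile.TarInfo(name="sample.txt")
    info.size = len(content)
    tar.addfile(info, io.BytesIO(content))

# The with statement closes the archive when the block is left,
# so no explicit tar.close() call is needed.
with tarfile.open(archive_path) as tar:
    names = tar.getnames()

print(names)
```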
Full Code
Here is the complete code for reading and extracting files from a tar file:
    import tarfile

    # Open the tar file
    tar = tarfile.open("example.tar")

    # List the contents of the tar file
    file_list = tar.getnames()
    print(file_list)

    # Extract all files to the output directory
    tar.extractall("./output")

    # Extract a specific file
    file = tar.getmember("sample.txt")
    tar.extract(file, "./output")

    # Close the tar file
    tar.close()
Output
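For an archive that contains sample.txt and a folder directory holding file1.txt, the script prints the following. Directory names are listed without a trailing slash:
    ['sample.txt', 'folder', 'folder/file1.txt']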
In this tutorial, we learned how to read tar files in Python using the tarfile module. We demonstrated how to open a tar file, list its contents, extract files, and close the tar file. Knowing how to work with tar files in Python can be a valuable skill for managing archived files and directories.