In this tutorial, we will learn how to extract a date from a filename in Python. This is useful for organizing, sorting, or filtering files by the date embedded in their names.
Step 1: Import Necessary Libraries
First, import the three standard-library modules we need: os, re, and datetime. The os (operating system) module lets us work with the file system, re (regular expressions) lets us search strings for patterns, and datetime provides classes and methods for working with dates and times.
import os
import re
import datetime
Step 2: Define a Regular Expression Pattern for the Date
Next, define a regular expression pattern that matches the date format present in your filenames. For instance, let’s consider the following filename pattern:
MyFile_2022-03-14.txt
The date format in this filename is YYYY-MM-DD. We’ll create a regular expression pattern as shown below:
date_pattern = re.compile(r"\d{4}-\d{2}-\d{2}")
This pattern matches four digits, a hyphen, two digits, another hyphen, and two more digits. It checks only the shape of the text, so a string such as 2022-13-45 also matches even though it is not a valid date.
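To see what the pattern accepts and rejects, here is a minimal sketch run against a few made-up filenames. The sample names and the helper find_date_text() are illustrative assumptions, not part of the tutorial's code.

```python
import re

date_pattern = re.compile(r"\d{4}-\d{2}-\d{2}")

# Hypothetical filenames used only for illustration.
samples = [
    "MyFile_2022-03-14.txt",   # well-formed date
    "report_20220314.txt",     # no hyphens, so the pattern does not match
    "backup_2022-13-45.log",   # matches the shape, though it is not a real date
]


def find_date_text(filename):
    """Return the first YYYY-MM-DD-shaped substring, or None if there is none."""
    match = date_pattern.search(filename)
    return match.group() if match else None


for name in samples:
    print(name, "->", find_date_text(name))
```

The third sample shows why a regex match alone should not be trusted as a date: validation happens later, when the text is parsed.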
Step 3: List Files and Extract Dates
Now that we have a pattern to look for, let’s loop through all the files in a given directory and extract dates from filenames.
directory = "path/to/your/directory"  # Replace this with your directory path

for filename in os.listdir(directory):
    date_match = date_pattern.search(filename)
    if date_match:
        date_string = datetime.datetime.strptime(date_match.group(), "%Y-%m-%d")
        print(f"Filename: {filename} - Date: {date_string}")
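This code iterates over every entry in the specified directory (os.listdir() returns the names of subdirectories as well as files) and searches each name for the pattern. When it finds a match, it takes the matched text and converts it into a datetime object using the strptime() method from the datetime module. If the matched text is not a valid calendar date, strptime() raises a ValueError.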
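If some filenames might contain date-shaped text that is not a real date, one option is to wrap the parsing step in a small function that returns None instead of raising. The sketch below assumes that behavior is what you want; the function name extract_date() is hypothetical.

```python
import datetime
import re

date_pattern = re.compile(r"\d{4}-\d{2}-\d{2}")


def extract_date(filename):
    """Return a datetime.date parsed from filename, or None if no valid date is found."""
    match = date_pattern.search(filename)
    if match is None:
        return None
    try:
        return datetime.datetime.strptime(match.group(), "%Y-%m-%d").date()
    except ValueError:
        # The text looked like a date but is not one (for example, month 13).
        return None


print(extract_date("MyFile_2022-03-14.txt"))
print(extract_date("backup_2022-13-45.log"))
```

Calling .date() drops the midnight time component, which is usually not meaningful when the filename carries only a date.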
Full Code
import os
import re
import datetime

date_pattern = re.compile(r"\d{4}-\d{2}-\d{2}")

directory = "path/to/your/directory"  # Replace this with your directory path

for filename in os.listdir(directory):
    date_match = date_pattern.search(filename)
    if date_match:
        date_string = datetime.datetime.strptime(date_match.group(), "%Y-%m-%d")
        print(f"Filename: {filename} - Date: {date_string}")
Output
Filename: MyFile_2022-03-14.txt - Date: 2022-03-14 00:00:00
Filename: AnotherFile_2021-12-25.txt - Date: 2021-12-25 00:00:00
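Once the dates are parsed, they can drive the sorting mentioned at the start of the tutorial. The following is a rough sketch, assuming you already have a list of filenames; sort_by_embedded_date() is a hypothetical helper, and names without a valid date are simply skipped.

```python
import datetime
import re

date_pattern = re.compile(r"\d{4}-\d{2}-\d{2}")


def sort_by_embedded_date(filenames):
    """Return the filenames that contain a valid date, oldest first."""
    dated = []
    for name in filenames:
        match = date_pattern.search(name)
        if not match:
            continue  # no date-shaped text in this name
        try:
            parsed = datetime.datetime.strptime(match.group(), "%Y-%m-%d")
        except ValueError:
            continue  # date-shaped text that is not a real date
        dated.append((parsed, name))
    dated.sort()  # tuples sort by date first, then by name
    return [name for _, name in dated]


names = ["MyFile_2022-03-14.txt", "AnotherFile_2021-12-25.txt", "notes.txt"]
print(sort_by_embedded_date(names))
```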
Conclusion
In this tutorial, we demonstrated how to extract date information from filenames in Python using the os, re, and datetime libraries. This technique can be adapted to various date formats and is useful for managing files with embedded date information.
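As an illustration of adapting the technique to other date formats, the sketch below pairs each regular expression with the strptime() format that parses it and tries them in order. The three formats, the KNOWN_FORMATS list, and the extract_date_any() function are assumptions chosen for the example; adjust them to the filenames you actually have.

```python
import datetime
import re

# Each entry pairs a regular expression with the strptime format that parses it.
# These formats are examples only.
KNOWN_FORMATS = [
    (re.compile(r"\d{4}-\d{2}-\d{2}"), "%Y-%m-%d"),  # 2022-03-14
    (re.compile(r"\d{2}-\d{2}-\d{4}"), "%d-%m-%Y"),  # 14-03-2022
    (re.compile(r"\d{8}"), "%Y%m%d"),                # 20220314
]


def extract_date_any(filename):
    """Try each known format in order and return the first valid date, or None."""
    for pattern, fmt in KNOWN_FORMATS:
        match = pattern.search(filename)
        if match is None:
            continue
        try:
            return datetime.datetime.strptime(match.group(), fmt).date()
        except ValueError:
            continue  # shape matched but the value is not a valid date
    return None


for name in ["MyFile_2022-03-14.txt", "scan_14-03-2022.pdf", "log_20220314.txt"]:
    print(name, "->", extract_date_any(name))
```

The order of the list matters: when a filename could fit more than one format, the first entry that parses successfully wins, so put the format you trust most first.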