How to Lemmatize a List in Python

In natural language processing, lemmatization is the process of determining the lemma, or dictionary form, of a given word.

Reducing words to their base form lets different inflections of the same word be treated as a single term, which is useful in many natural language processing tasks. In this tutorial, we will go through the process of lemmatizing a list of words in Python.

Step 1: Installation of Required Libraries

The first step is to install the Natural Language Toolkit (NLTK) library for Python if it isn’t already installed. To do this, run the command pip install nltk in the terminal.

Step 2: Importing the Necessary Libraries

Now, we need to import the nltk library into our Python code. Specifically, we need to import the WordNetLemmatizer class from nltk.stem, which will allow us to perform the lemmatization.
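
A minimal sketch of the import step follows. The nltk.download call is an addition not mentioned in the text above: WordNetLemmatizer relies on the WordNet corpus, which is downloaded separately from the library itself.

```python
import nltk
from nltk.stem import WordNetLemmatizer

# Assumption: the WordNet data may not be on this machine yet.
# If the corpus is already present and up to date, nothing is re-downloaded.
nltk.download('wordnet', quiet=True)
```

If a later call raises a LookupError that names another resource, it can be fetched the same way.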

Step 3: Initializing the WordNetLemmatizer

Then, we initialize an instance of the WordNetLemmatizer class.
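
This could look as follows; the variable name lemmatizer is our choice, and the import is repeated so the snippet runs on its own.

```python
from nltk.stem import WordNetLemmatizer

# One instance can be reused for every word we want to lemmatize.
lemmatizer = WordNetLemmatizer()
```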

Step 4: Creating a List of Words

For the purposes of this tutorial, let’s create a list of words to lemmatize.
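
The list below uses the same four words that appear in the Output section of this tutorial; the variable name words is an assumption.

```python
# A noun plural, two verb forms and an adverb.
words = ['running', 'calves', 'took', 'happily']
```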

Step 5: Lemmatizing the List of Words

Now, we are going to apply the lemmatize method to each word in our list using a list comprehension.

Step 6: Printing the Result

Then, we print the original list of words and the lemmatized list to compare how each word has been changed to its base form.
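
One way to print the two lists is sketched below. The lists are hardcoded with the values from the Output section so the snippet runs on its own; in the tutorial they come from Steps 4 and 5.

```python
# Hardcoded stand-ins for the lists built in Steps 4 and 5.
words = ['running', 'calves', 'took', 'happily']
lemmatized_words = ['running', 'calf', 'took', 'happily']

print("Original Word List:")
print(words)
print()  # blank line between the two lists
print("Lemmatized Word List:")
print(lemmatized_words)
```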

Full Code

Output

Original Word List:
['running', 'calves', 'took', 'happily']

Lemmatized Word List:
['running', 'calf', 'took', 'happily']

Conclusion

As the output shows, ‘calves’ has been lemmatized into its base form ‘calf’, while ‘running’, ‘took’ and ‘happily’ are unchanged. The lemmatize method treats every word as a noun unless told otherwise, so verb forms such as ‘took’ are left as they are. We have now lemmatized a list of words in Python using the nltk library, which can serve as a preprocessing step for further text analysis or natural language processing tasks.