In this tutorial, you will learn how to plot two box plots side by side in Python. Box plots are a great way to visualize your data distribution through quartiles, and being able to compare them side by side adds another layer of depth to your data analysis.
We will use the Python library matplotlib to draw the plots and NumPy to generate the sample data. pandas, the usual tool for data manipulation and analysis, is imported as well, although this small example does not strictly need it.
Step 1: Import necessary libraries
The first step is to import the libraries we need. NumPy is used for dealing with numerical data, Pandas for data manipulation, and matplotlib for creating plots.
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
Step 2: Generate some data
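For the sake of this tutorial, we will generate two random datasets to compare in our box plots. Each holds 200 values drawn from a normal distribution: the first has a mean of 100 and a standard deviation of 10, the second a mean of 90 and a standard deviation of 20. Fixing the random seed makes the numbers reproducible from run to run.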
    np.random.seed(10)
    data_1 = np.random.normal(100, 10, 200)
    data_2 = np.random.normal(90, 20, 200)
    data = [data_1, data_2]
    labels = ['data_1', 'data_2']
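Before plotting, it can help to look at the numbers a box plot is built from. The following is a minimal sketch, not part of the original tutorial: the helper name `five_number_summary` is made up for illustration, and it uses `np.percentile` to compute the quartiles of each dataset.

```python
import numpy as np

# Same data as in the tutorial
np.random.seed(10)
data_1 = np.random.normal(100, 10, 200)
data_2 = np.random.normal(90, 20, 200)


def five_number_summary(values):
    """Return the minimum, quartiles, and maximum of a 1-D array.

    Hypothetical helper, written only for this illustration.
    """
    q1, median, q3 = np.percentile(values, [25, 50, 75])
    return {
        "min": values.min(),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": values.max(),
    }


for name, values in [("data_1", data_1), ("data_2", data_2)]:
    summary = five_number_summary(values)
    # Round for readability when printing
    print(name, {key: round(float(value), 1) for key, value in summary.items()})
```

The box in a box plot spans Q1 to Q3, and the line inside it marks the median. By default matplotlib draws the whiskers out to the furthest points within 1.5 times the interquartile range and shows anything beyond that as individual outliers, so the whisker ends will not always match the minimum and maximum printed here.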
Step 3: Plot the data in box plots
Now that we have our datasets, we are ready to plot them side by side in box plots.
    fig = plt.figure(figsize=(10, 5))
    ax = fig.add_subplot(111)
    ax.boxplot(data, labels=labels)
    plt.show()
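This code creates a figure, adds a single subplot (an axes) to it, draws one box per dataset on that axes, and finally displays the result. Passing both arrays in one list is what places the boxes next to each other on a shared y-axis. Note that Matplotlib 3.9 renamed the labels parameter of boxplot to tick_labels; the old name still works there but triggers a deprecation warning.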
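If you would rather give each dataset its own panel, one possible variant is to create two axes with `plt.subplots` and draw one box plot in each. This is a sketch under the assumption that a shared vertical scale is wanted (`sharey=True`), so the two boxes stay directly comparable. It sets the tick labels with `set_xticklabels`, which avoids the parameter rename mentioned above.

```python
import numpy as np
import matplotlib.pyplot as plt

# Same data as in the tutorial
np.random.seed(10)
data_1 = np.random.normal(100, 10, 200)
data_2 = np.random.normal(90, 20, 200)

# One row, two columns: each dataset gets its own panel.
# sharey=True keeps both panels on the same vertical scale.
fig, axes = plt.subplots(1, 2, figsize=(10, 5), sharey=True)

for ax, values, name in zip(axes, [data_1, data_2], ["data_1", "data_2"]):
    ax.boxplot(values)          # a single box, drawn at x position 1
    ax.set_xticks([1])
    ax.set_xticklabels([name])  # label the box with the dataset name

axes[0].set_ylabel("value")
fig.tight_layout()
```

Call `plt.show()` afterwards to display the figure. Without `sharey=True` each panel would pick its own y-range, which makes the boxes look more alike than they really are.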
Here is the full code:
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt

    # generate random data
    np.random.seed(10)
    data_1 = np.random.normal(100, 10, 200)
    data_2 = np.random.normal(90, 20, 200)
    data = [data_1, data_2]
    labels = ['data_1', 'data_2']

    # create figure and axes
    fig = plt.figure(figsize=(10, 5))
    ax = fig.add_subplot(111)

    # draw one box per dataset, side by side
    ax.boxplot(data, labels=labels)

    # show plot
    plt.show()
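The code above imports pandas but never uses it. If your data already lives in a DataFrame, a rough equivalent might look like the sketch below. It assumes one column per dataset (the column names are chosen here to match the labels used above) and relies on `DataFrame.boxplot`, which draws one box per numeric column and uses the column names as tick labels.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Same data as in the tutorial, stored as DataFrame columns
np.random.seed(10)
df = pd.DataFrame({
    "data_1": np.random.normal(100, 10, 200),
    "data_2": np.random.normal(90, 20, 200),
})

fig, ax = plt.subplots(figsize=(10, 5))

# One box per column, drawn side by side on the axes we created
df.boxplot(ax=ax, grid=False)
```

As before, `plt.show()` displays the result. This form saves you from building the list of arrays and the list of labels by hand, at the cost of requiring columns of equal length.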
Conclusion
In this tutorial, we learned how to plot two box plots side by side in Python. This is a useful skill for anyone who wishes to visualize and compare their data in a simple and effective way. Remember to keep practicing your skills with different types of data and situations.