How To Draw a Regression Line on a Scatter Plot in Python

In this tutorial, we will learn how to draw a regression line on a scatter plot in Python. A scatter plot shows the relationship between two variables as individual data points. A regression line is the straight line that best fits those points in the least-squares sense, so it summarizes the overall trend in the data.

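We’ll use matplotlib, Python’s popular data visualization library, and SciPy, a widely used statistical library. The sample data is generated with NumPy, which pip installs automatically as a dependency of SciPy. If you don’t have these libraries already, you can install them with pip by running: pip install matplotlib scipy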

After installing the required libraries, let’s begin our tutorial.

Step 1: Import Libraries

First, let’s import the necessary libraries:
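
A minimal set of imports for the steps that follow might look like this. The aliases np and plt are only the common conventions, not a requirement.

```python
import numpy as np               # numerical arrays and random sample data
import matplotlib.pyplot as plt  # plotting interface (scatter, plot, legend)
from scipy import stats          # provides stats.linregress()
```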

Step 2: Create Sample Data

Next, we’ll create some sample data we can use for our scatter plot and regression line. We’ll do this using NumPy, which is a powerful library for numerical and mathematical operations in Python. In this step, we’ll generate some random data points for our two variables, x and y.
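
One possible way to generate such data is sketched below. The sample size (50), the seed (42), and the underlying relationship y = 2.5 * x + 1.0 with Gaussian noise are all assumptions chosen for illustration; any roughly linear data would work.

```python
import numpy as np

# Seeded generator so the "random" data is the same on every run.
rng = np.random.default_rng(seed=42)

# Assumed for illustration: 50 x values drawn uniformly from [0, 10).
x = rng.uniform(0, 10, size=50)

# Assumed true relationship y = 2.5 * x + 1.0, plus Gaussian noise
# (standard deviation 2.0) so the points scatter around the line.
y = 2.5 * x + 1.0 + rng.normal(0, 2.0, size=50)
```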

Step 3: Calculate Regression Line Coefficients

Now that we have our x and y variables, we can calculate the slope and intercept of our regression line. We’ll use the stats.linregress() function from the SciPy library to do this. This function returns five values: the slope, the intercept, the correlation coefficient (r), the p-value for a test of the null hypothesis that the slope is zero, and the standard error of the estimated slope. We only need the slope and intercept for drawing the regression line.

With the calculated slope and intercept, our regression line will have the equation:
y = slope * x + intercept
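
As a sketch of this step, the snippet below fits the line with stats.linregress() and prints the resulting equation. The sample data is the same illustrative data assumed earlier (seed 42, 50 points), repeated here so the snippet runs on its own.

```python
import numpy as np
from scipy import stats

# Illustrative sample data (assumed values, not from a real dataset).
rng = np.random.default_rng(seed=42)
x = rng.uniform(0, 10, size=50)
y = 2.5 * x + 1.0 + rng.normal(0, 2.0, size=50)

# linregress unpacks into five values; only the first two are needed
# to draw the line.
slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)

print(f"y = {slope:.3f} * x + {intercept:.3f}")
print(f"correlation coefficient r = {r_value:.3f}")
```

Because the data was generated around a slope of 2.5 and an intercept of 1.0, the fitted values should land close to those numbers, though not exactly on them because of the noise.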

Step 4: Draw Scatter Plot and Regression Line

Finally, we’ll plot our data points and the regression line we’ve calculated. We’ll use the plt.scatter() function for the scatter plot and the plt.plot() function for the regression line.
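
A minimal sketch of the plotting step follows, again using the illustrative sample data assumed earlier. The labels, title, and output file name (regression_scatter.png) are placeholders. The figure is saved to a file so the snippet also works without a display; in an interactive session, plt.show() can be called instead.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

# Illustrative sample data (assumed values, not from a real dataset).
rng = np.random.default_rng(seed=42)
x = rng.uniform(0, 10, size=50)
y = 2.5 * x + 1.0 + rng.normal(0, 2.0, size=50)

# Fit the line and read the two coefficients from the result object.
result = stats.linregress(x, y)
slope, intercept = result.slope, result.intercept

# A straight line is fully defined by two points, so evaluating it at
# the smallest and largest x is enough to draw it across the data.
x_line = np.array([x.min(), x.max()])
y_line = slope * x_line + intercept

plt.scatter(x, y, color="blue", label="Data points")
plt.plot(x_line, y_line, color="red", label="Regression line")
plt.xlabel("x")
plt.ylabel("y")
plt.title("Scatter plot with regression line")
plt.legend()

# Placeholder file name; use plt.show() instead for an interactive window.
plt.savefig("regression_scatter.png")
```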

The code for this step produces a plot with the data drawn as blue points and the fitted regression line drawn in red. The axis labels and legend provide context for the plot.

Full Code:
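
The four steps can be combined into one script, sketched here with two small helper functions. The function names make_sample_data and plot_regression, their default arguments, and the output file name are hypothetical choices for illustration, not part of matplotlib or SciPy.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats


def make_sample_data(n=50, seed=42):
    """Return illustrative x, y arrays scattered around y = 2.5 * x + 1.0."""
    rng = np.random.default_rng(seed=seed)
    x = rng.uniform(0, 10, size=n)
    y = 2.5 * x + 1.0 + rng.normal(0, 2.0, size=n)
    return x, y


def plot_regression(x, y, filename="regression_scatter.png"):
    """Draw a scatter plot of x, y with its least-squares line and save it."""
    result = stats.linregress(x, y)

    # Two end points are enough to draw the straight line.
    x_line = np.array([x.min(), x.max()])
    y_line = result.slope * x_line + result.intercept

    plt.figure()
    plt.scatter(x, y, color="blue", label="Data points")
    plt.plot(x_line, y_line, color="red", label="Regression line")
    plt.xlabel("x")
    plt.ylabel("y")
    plt.title("Scatter plot with regression line")
    plt.legend()
    plt.savefig(filename)  # or plt.show() in an interactive session
    return result.slope, result.intercept


x, y = make_sample_data()
slope, intercept = plot_regression(x, y)
print(f"y = {slope:.3f} * x + {intercept:.3f}")
```

Wrapping the steps in functions makes it straightforward to call plot_regression() on your own x and y arrays in place of the generated sample data.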

Output:

The output will display a scatter plot with a red regression line representing the relationship between the x and y variables.

Conclusion

In this tutorial, we have learned how to draw a regression line on a scatter plot in Python using the matplotlib and SciPy libraries. This is a useful technique for visualizing the relationship between two variables, and it can be easily extended to more complex data and models.

Practice the concepts discussed in this tutorial using different datasets to improve your understanding of drawing regression lines on scatter plots.