In this tutorial, we will learn how to visualize the decision boundaries of a k-Nearest Neighbors (kNN) classifier in Python.
kNN is a popular machine learning algorithm that can be used for both classification and regression tasks. It works by finding the k points in the training set closest to a given data point and assigning the majority class of those neighbors (for classification) or the average of their values (for regression).
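To make the voting mechanism concrete, here is a minimal from-scratch sketch of kNN classification using only NumPy. This illustrates the idea rather than the Scikit-learn implementation we use below; the function name knn_predict and the toy data are made up for this example.

import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_new, k=3):
    # Euclidean distance from the new point to every training point
    distances = np.linalg.norm(X_train - x_new, axis=1)
    # Indices of the k closest training points
    nearest = np.argsort(distances)[:k]
    # Majority vote among the k neighbors' labels
    return Counter(y_train[nearest]).most_common(1)[0][0]

# Toy example: two small clusters of 2D points
X_train = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.2, 4.8]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([1.1, 0.9]), k=3))  # -> 0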
We will use the Scikit-learn library to implement the kNN algorithm and the Matplotlib library to visualize the results. Let’s get started!
Step 1: Import Libraries
First and foremost, we need to import the required libraries:
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets, neighbors
from matplotlib.colors import ListedColormap
Step 2: Load Dataset
We’ll use the famous Iris dataset for this tutorial. This dataset contains 150 samples of iris flowers, with four different attributes (sepal length, sepal width, petal length, and petal width) and three different species (Setosa, Versicolor, and Virginica).
iris = datasets.load_iris()
X = iris.data[:, :2]  # Use only the first two features (sepal length and width) for easy 2D visualization
y = iris.target
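As a quick sanity check, you can inspect what was loaded; the values in the comments are what load_iris returns:

print(X.shape)                  # (150, 2): 150 samples, 2 features
print(iris.feature_names[:2])   # ['sepal length (cm)', 'sepal width (cm)']
print(iris.target_names)        # ['setosa' 'versicolor' 'virginica']
print(np.bincount(y))           # [50 50 50]: the three classes are balanced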
Step 3: Create kNN Classifier
Now, we’ll create the kNN classifier using the Scikit-learn library. In this step, you can choose the distance metric and the value of k (the number of nearest neighbors). Here, we use the Euclidean distance metric (equivalent to Scikit-learn’s default, Minkowski with p=2) and set k to 15:
k = 15
knn = neighbors.KNeighborsClassifier(n_neighbors=k, metric='euclidean')
knn.fit(X, y)
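The fitted classifier can now make predictions. For example, for a flower with sepal length 5.0 cm and sepal width 3.5 cm (an illustrative point, not one taken from the dataset):

sample = np.array([[5.0, 3.5]])   # [sepal length, sepal width] in cm
print(knn.predict(sample))        # predicted class index, e.g. 0 (setosa)
print(knn.predict_proba(sample))  # fraction of the 15 neighbors in each class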
Step 4: Setup Visualization
Before plotting the decision boundaries, we need to set up the visualization. For this, we’ll create a mesh grid covering the feature space and define two color maps: light colors for the predicted regions and bold colors for the data points:
h = 0.02  # Step size in the mesh

# Light colors for the decision regions, bold colors for the training points
cmap_light = ListedColormap(['orange', 'cyan', 'cornflowerblue'])
cmap_bold = ListedColormap(['darkorange', 'c', 'darkblue'])

# Calculate the mesh grid, padding the feature ranges by 1 on each side
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                     np.arange(y_min, y_max, h))
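If the mesh grid step is unclear: np.meshgrid turns the two 1D coordinate ranges into two 2D arrays of identical shape, one holding the x coordinate and one the y coordinate of every grid point (the exact shape depends on the data ranges):

print(xx.shape, yy.shape)  # identical 2D shapes, e.g. (rows_of_y_steps, cols_of_x_steps)
# Each (xx[i, j], yy[i, j]) pair is one point where we will query the classifier;
# np.c_[xx.ravel(), yy.ravel()] flattens the grid into an (n_points, 2) array.
print(np.c_[xx.ravel(), yy.ravel()].shape)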
Step 5: Visualize Decision Boundaries
Finally, let’s visualize the decision boundaries of our kNN classifier. We predict a class for every point on the mesh grid, draw those predictions as colored regions, and overlay the original data points:
# Predict the class for every point on the mesh grid
Z = knn.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)

plt.figure()
plt.pcolormesh(xx, yy, Z, cmap=cmap_light, shading='auto')

# Overlay the training points, colored by their true class
plt.scatter(X[:, 0], X[:, 1], c=y, cmap=cmap_bold, edgecolor='k', s=20)
plt.xlim(xx.min(), xx.max())
plt.ylim(yy.min(), yy.max())
plt.title("kNN classification (k = %i)" % k)
plt.show()
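As an aside, if you are on scikit-learn 1.1 or newer, the DecisionBoundaryDisplay helper can build the same kind of plot without a manual mesh grid. A sketch, assuming that version is available:

from sklearn.inspection import DecisionBoundaryDisplay

disp = DecisionBoundaryDisplay.from_estimator(knn, X, cmap=cmap_light,
                                              response_method='predict')
disp.ax_.scatter(X[:, 0], X[:, 1], c=y, cmap=cmap_bold, edgecolor='k', s=20)
plt.title("kNN classification (k = %i)" % k)
plt.show()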
Here is the full code:
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets, neighbors
from matplotlib.colors import ListedColormap

# Load the Iris dataset, keeping only the first two features
iris = datasets.load_iris()
X = iris.data[:, :2]
y = iris.target

# Fit the kNN classifier
k = 15
knn = neighbors.KNeighborsClassifier(n_neighbors=k, metric='euclidean')
knn.fit(X, y)

# Set up the mesh grid and color maps
h = 0.02
cmap_light = ListedColormap(['orange', 'cyan', 'cornflowerblue'])
cmap_bold = ListedColormap(['darkorange', 'c', 'darkblue'])
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                     np.arange(y_min, y_max, h))

# Predict on the grid and plot the decision regions with the data points
Z = knn.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)
plt.figure()
plt.pcolormesh(xx, yy, Z, cmap=cmap_light, shading='auto')
plt.scatter(X[:, 0], X[:, 1], c=y, cmap=cmap_bold, edgecolor='k', s=20)
plt.xlim(xx.min(), xx.max())
plt.ylim(yy.min(), yy.max())
plt.title("kNN classification (k = %i)" % k)
plt.show()
Output:
[Plot: the feature space divided into three colored decision regions, one per Iris species, with the training points overlaid]
Conclusion
In this tutorial, we’ve learned how to visualize the decision boundaries of the k-Nearest Neighbors algorithm using Python with the Scikit-learn and Matplotlib libraries. This visualization can help you better understand the predictions made by the kNN algorithm and can also help you fine-tune the value of k and other parameters to achieve better results, for example with a cross-validated grid search as sketched below.
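A minimal sketch of tuning k with 5-fold cross-validation via GridSearchCV; the parameter grid here is an arbitrary choice for illustration:

from sklearn.model_selection import GridSearchCV

# Try odd values of k from 1 to 29; odd values avoid ties in the majority vote
param_grid = {'n_neighbors': list(range(1, 30, 2))}
search = GridSearchCV(neighbors.KNeighborsClassifier(metric='euclidean'),
                      param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)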