How to Use the K-S Test in Python

The Kolmogorov-Smirnov Test (K-S Test) is a statistical test that can be used to compare a sample with a reference probability distribution (one-sample K-S test) or to compare two samples (two-sample K-S test).

In this tutorial, we walk through how to perform the K-S test in Python using the SciPy library.

Step 1: Installing the Required Libraries

For this guide, you’ll need the SciPy and NumPy libraries. If they are not already installed in your Python environment, install them with pip:

    pip install scipy numpy

Step 2: Importing the Libraries

We’ll begin by importing the necessary libraries and modules:
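A minimal sketch of the imports, assuming the common convention of aliasing NumPy as np and pulling the stats module from SciPy (the aliases are a convention, not a requirement):

```python
import numpy as np        # array handling and random number generation
from scipy import stats   # provides kstest and ks_2samp

# Quick sanity check that both libraries are importable
print("NumPy version:", np.__version__)
print("K-S functions available:", hasattr(stats, "kstest"), hasattr(stats, "ks_2samp"))
```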

Step 3: Generating Data

Next, we generate sample data for the K-S test using NumPy:
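One way to generate the data is sketched below. The name rvs1 comes from this tutorial; the second sample's name (rvs2), the seed, the sample sizes, and the distribution parameters are all assumptions chosen for illustration:

```python
import numpy as np

# Seeded generator so the samples are reproducible (seed value is an assumption)
rng = np.random.default_rng(seed=42)

# Sample 1: standard normal, used for the one-sample test (size is an assumption)
rvs1 = rng.normal(loc=0.0, scale=1.0, size=200)

# Sample 2: shifted and wider normal, used later for the two-sample test
# (name, size, loc, and scale are assumptions)
rvs2 = rng.normal(loc=0.5, scale=1.5, size=300)

print(rvs1.shape, rvs2.shape)
```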

Step 4: Performing the K-S Test

SciPy’s kstest function performs a Kolmogorov-Smirnov goodness-of-fit test. Here we compare our sample dataset (rvs1) to a normal distribution:
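A minimal sketch of the one-sample test, assuming rvs1 is generated as a seeded standard normal sample (seed and size are assumptions). Passing "norm" without extra arguments compares the sample against the standard normal distribution with mean 0 and standard deviation 1:

```python
import numpy as np
from scipy import stats

# Reproducible standard normal sample (seed and size are assumptions)
rng = np.random.default_rng(seed=42)
rvs1 = rng.normal(loc=0.0, scale=1.0, size=200)

# One-sample K-S test against the standard normal distribution N(0, 1)
result = stats.kstest(rvs1, "norm")

print("K-S test result is:", result)
print("statistic:", result.statistic)
print("p-value:", result.pvalue)
```

To test against a normal distribution with a different mean or standard deviation, kstest accepts the distribution parameters through its args argument, for example args=(loc, scale).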

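This returns a test statistic and a p-value. The statistic is the largest absolute difference between the sample’s empirical cumulative distribution function and the reference distribution’s cumulative distribution function. The p-value is the probability, assuming the null hypothesis is true, of observing a statistic at least as extreme as the one computed. If the p-value is below 0.05, we reject the null hypothesis that the sample comes from a normal distribution.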

Step 5: Comparing Two Samples

We can also perform a two-sample K-S test to check whether two independent samples are drawn from the same distribution:
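The two-sample test can be sketched with SciPy's ks_2samp function. The second sample (rvs2) and all generation parameters here are assumptions for illustration:

```python
import numpy as np
from scipy import stats

# Two reproducible samples from different normal distributions
# (seed, sizes, loc, and scale are assumptions)
rng = np.random.default_rng(seed=42)
rvs1 = rng.normal(loc=0.0, scale=1.0, size=200)
rvs2 = rng.normal(loc=0.5, scale=1.5, size=300)

# Two-sample K-S test: the null hypothesis is that both samples
# come from the same continuous distribution
result_2samp = stats.ks_2samp(rvs1, rvs2)

print("K-S test result for two samples is:", result_2samp)
```

Calling stats.kstest(rvs1, rvs2) with a second array in place of a distribution name should give the same result, since kstest then runs the two-sample test.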

Full Code
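The steps above can be combined into one script, sketched here. The seed, sample sizes, and distribution parameters are assumptions, so the numbers it prints will generally differ from those shown in the Output section:

```python
import numpy as np
from scipy import stats

# Step 3: generate the data (seed, sizes, loc, and scale are assumptions)
rng = np.random.default_rng(seed=42)
rvs1 = rng.normal(loc=0.0, scale=1.0, size=200)
rvs2 = rng.normal(loc=0.5, scale=1.5, size=300)

# Step 4: one-sample K-S test of rvs1 against the standard normal
one_sample = stats.kstest(rvs1, "norm")
print("K-S test result is:", one_sample)

# Step 5: two-sample K-S test comparing rvs1 and rvs2
two_sample = stats.ks_2samp(rvs1, rvs2)
print("K-S test result for two samples is:", two_sample)
```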

Output

K-S test result is: KstestResult(statistic=0.0691059988535429, pvalue=0.2819866816254518, statistic_location=-0.24273331471460374, statistic_sign=-1)
K-S test result for two samples is: KstestResult(statistic=0.20833333333333334, pvalue=5.1292795977908046e-05, statistic_location=1.079426875125683, statistic_sign=1)
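To read these results, compare each p-value with the 0.05 threshold from Step 4. The short sketch below applies that rule to the two p-values printed above; the decision helper is a hypothetical function written for this illustration, not part of SciPy:

```python
ALPHA = 0.05  # significance level used in Step 4


def decision(pvalue, alpha=ALPHA):
    """Hypothetical helper: return the test decision for a p-value."""
    return "reject H0" if pvalue < alpha else "fail to reject H0"


# p-values copied from the output above
one_sample_p = 0.2819866816254518
two_sample_p = 5.1292795977908046e-05

print("One-sample:", decision(one_sample_p))  # prints "One-sample: fail to reject H0"
print("Two-sample:", decision(two_sample_p))  # prints "Two-sample: reject H0"
```

In other words, the one-sample result gives no evidence against normality, while the two-sample result indicates that the two samples are unlikely to come from the same distribution.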

Conclusion

Python, with its wide range of statistical libraries and functions, allows for a concise way to perform complex statistical tests, such as the Kolmogorov-Smirnov test. Utilizing the SciPy library, we’ve detailed how to set up and perform the one-sample and two-sample K-S tests. This should provide a solid foundation for your explorations of statistical testing in Python.