How to Generate Correlated Random Variables in Python

In statistics and data analysis, you often need to generate correlated random variables that follow a specific distribution. The need arises in Monte Carlo simulations, computational statistics, and numerical methods in general. In this tutorial, you will learn how to generate correlated random variables in Python using the NumPy and SciPy libraries.

Step 1: Import Necessary Libraries

First, we import the modules we need: NumPy and SciPy, two core libraries for scientific computing in Python. If they are not already installed, you can install them with pip by running: pip install numpy scipy

Then import them in our script:
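The imports might look like the following sketch. The alias np and the use of scipy.linalg.cholesky (rather than numpy.linalg.cholesky) are assumptions, since the original listing is not shown.

```python
# Assumed imports for this tutorial.
import numpy as np                 # arrays and random number generation
from scipy.linalg import cholesky  # Cholesky factorization routine
```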

Step 2: Define Variables

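For this tutorial, our task is to generate two standard normal random variables with a given correlation coefficient ρ, here ρ = 0.6. Because both variables have unit variance, the target covariance matrix equals the correlation matrix, with 1 on the diagonal and ρ off the diagonal.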
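One way to express this setup in code is sketched below. The names rho and corr_matrix are illustrative choices, not taken from the original listing.

```python
import numpy as np

rho = 0.6  # target correlation coefficient from the text

# For two standard normal variables (unit variance), the covariance
# matrix and the correlation matrix are the same 2x2 matrix.
corr_matrix = np.array([[1.0, rho],
                        [rho, 1.0]])
```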

Step 3: Generate Uncorrelated Random Variables

Here, we draw two sets of independent (and therefore uncorrelated) standard normal samples and store them in a two-dimensional array, one row per variable.
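A minimal sketch of this step follows. The sample size and the seed are hypothetical values picked for illustration; the text does not specify either.

```python
import numpy as np

n_samples = 100_000                   # hypothetical sample size
rng = np.random.default_rng(seed=42)  # arbitrary seed, for repeatability only

# Shape (2, n_samples): each row is one variable, each column is one draw.
uncorrelated = rng.standard_normal(size=(2, n_samples))

# The off-diagonal entries of the sample correlation matrix
# should be close to zero.
print(np.corrcoef(uncorrelated))
```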

Step 4: Cholesky Decomposition

We then use the Cholesky decomposition, which factors a Hermitian, positive-definite matrix into the product of a lower triangular matrix and its conjugate transpose. For a real correlation matrix C this means C = L L^T. Multiplying the uncorrelated variables Z by L gives X = L Z, and because the covariance of Z is the identity matrix, the covariance of X is L L^T = C. The transformed variables therefore have the desired correlation, up to sampling error.
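The transformation can be sketched as follows, assuming scipy.linalg.cholesky is used to compute the factor. The variable names, sample size, and seed are illustrative.

```python
import numpy as np
from scipy.linalg import cholesky

rho = 0.6
corr_matrix = np.array([[1.0, rho],
                        [rho, 1.0]])

# lower=True returns L with corr_matrix == L @ L.T
# (SciPy returns the upper triangular factor by default).
L = cholesky(corr_matrix, lower=True)

rng = np.random.default_rng(seed=42)  # arbitrary seed
uncorrelated = rng.standard_normal(size=(2, 100_000))

# Each column z is mapped to L @ z, whose covariance is L @ L.T.
correlated = L @ uncorrelated
```

Using the lower factor with row-oriented data (one variable per row) keeps the code aligned with the formula X = L Z. With samples stored one per row instead, the equivalent operation would be a right-multiplication by the transpose of L.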

Your complete Python code is:
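The original listing is not reproduced here, so the script below is a reconstruction that combines the steps under stated assumptions: the sample size, the seed, and all variable names are hypothetical, and scipy.linalg.cholesky is assumed as the factorization routine.

```python
import numpy as np
from scipy.linalg import cholesky

# Step 2: target correlation and the corresponding correlation matrix.
rho = 0.6
corr_matrix = np.array([[1.0, rho],
                        [rho, 1.0]])

# Step 3: independent standard normal samples, one variable per row.
n_samples = 100_000                   # hypothetical sample size
rng = np.random.default_rng(seed=0)   # arbitrary seed
original = rng.standard_normal(size=(2, n_samples))

# Step 4: lower triangular Cholesky factor, then the linear transform.
L = cholesky(corr_matrix, lower=True)
transformed = L @ original

# Compare the sample correlation matrices before and after.
print("Correlation coefficient of original data: ", np.corrcoef(original))
print("Correlation coefficient of transformed data: ", np.corrcoef(transformed))
```

The printed values depend on the seed and the sample size, so they will vary slightly between runs. The off-diagonal entries should stay close to 0 for the original data and close to 0.6 for the transformed data.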

Output

Correlation coefficient of original data:  [[ 1.         -0.00479441]
 [-0.00479441  1.        ]]
Correlation coefficient of transformed data:  [[1.         0.60020262]
 [0.60020262 1.        ]]

Conclusion

In summary, generating correlated random variables in Python takes only a few lines of code with the NumPy and SciPy libraries: build the target correlation matrix, compute its Cholesky factor, and apply that factor to independent standard normal samples.

Knowing how to generate correlated variables is useful in Monte Carlo simulations, uncertainty analysis, and machine learning problems, where variables often depend on one another.