It is easy to get lost in the world of correlations!
But make no mistake, you gain valuable information about your data by looking at the correlations between the variables.
And if you’re planning to build a linear model using supervised machine learning, the presence of strongly correlated predictor variables can hurt the model by making its coefficients unstable and hard to interpret.
Hence, looking at the correlations between continuous variables is almost a non-negotiable task, and one of the ways to do it is by calculating the Pearson correlation coefficient for these variables.
The Pearson coefficient is the default choice for many, including me, for testing correlations.
But the Pearson coefficient has its kryptonite: non-linear relationships between the variables.
It measures only the strength of a linear association, so it makes sense when the relationship between the variables is linear and can be misleading when the relationship is non-linear.
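To make this concrete, here is a minimal sketch using NumPy. The data is synthetic and chosen purely for illustration: `y_linear` depends linearly on `x`, while `y_quadratic` is completely determined by `x` but through a symmetric, non-linear relationship. The helper name `pearson` is my own and is just a thin wrapper around `np.corrcoef`.

```python
import numpy as np


def pearson(x, y):
    """Return the Pearson correlation coefficient between two 1-D arrays."""
    # np.corrcoef returns a 2x2 correlation matrix; the off-diagonal
    # entry is the coefficient between x and y.
    return np.corrcoef(x, y)[0, 1]


# Synthetic data (an assumption for illustration): x is symmetric around zero.
x = np.linspace(-1.0, 1.0, 201)
y_linear = 2.0 * x + 1.0   # perfectly linear in x
y_quadratic = x ** 2       # fully determined by x, but not linearly

r_linear = pearson(x, y_linear)
r_quadratic = pearson(x, y_quadratic)

print(f"linear relationship:    r = {r_linear:.3f}")
print(f"quadratic relationship: r = {r_quadratic:.3f}")
```

The linear case gives a coefficient of essentially 1, while the quadratic case gives a coefficient of essentially 0, even though `y_quadratic` can be predicted exactly from `x`. A near-zero Pearson coefficient therefore does not mean the variables are unrelated.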
So it is high time we adopted other measures of correlation alongside the Pearson coefficient, to make sure we are not missing valuable relationships in the data.