Correlations can be used in Hypothesis Testing just like the Arithmetic Mean. For the Pearson Correlation we can use a t-Test (as explained in Section 71.4) while the Spearman Rank Order Correlation can be tested with a t-statistic (with a t-Distribution) or z-statistic (using the Normal Distribution) as described in Section 72.4 and Section 72.5.
Similarly, Hypothesis Testing can be used for (almost) any type of correlation, including Kendall’s \(\tau\) Rank Correlations, Partial Pearson Correlations, and Autocorrelations.
131.1 Hypotheses
In most practical cases, Hypothesis Tests about correlations are two-sided with a Null value of zero:
\[
\text{H}_0: \rho = 0 \quad \text{versus} \quad \text{H}_A: \rho \neq 0
\]
131.2 Analysis based on p-values and confidence intervals
Let us reconsider the example of the Pearson Correlation coefficient between US retail prices and Arabica import prices. The p-value \(p \simeq 0\), shown in the lower left panel of the Pairwise Scatterplots, leads us to reject the Null Hypothesis. If a rank correlation (Spearman or Kendall) is selected instead of Pearson, the p-value is recomputed automatically (in this case it makes no difference because the p-values are extremely small).
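The Arabica data are not reproduced here, so the following sketch uses R's built-in `cars` dataset as a stand-in (an assumption made purely for illustration). It shows how the estimate and p-value change when the correlation method is switched. The names `methods`, `results`, and `summary_table` are hypothetical helpers, and `exact = FALSE` is chosen because these data contain ties.

```r
# Stand-in data: R's built-in `cars` dataset (not the Arabica example from the text)
x <- cars$speed
y <- cars$dist

# Same two-sided test of H0: correlation = 0, using three different coefficients
methods <- c("pearson", "spearman", "kendall")
results <- lapply(methods, function(m) {
  # exact = FALSE requests the asymptotic p-value for the rank-based tests,
  # since exact p-values are not available when the data contain ties
  cor.test(x, y, method = m, exact = FALSE)
})

# Collect the estimate and p-value for each method in one table
summary_table <- data.frame(
  method   = methods,
  estimate = sapply(results, function(r) unname(r$estimate)),
  p_value  = sapply(results, function(r) r$p.value)
)
print(summary_table)
```

The three estimates differ because they measure association on different scales, but with an association this strong all three p-values should lead to the same decision.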
The Pearson Correlation coefficient can be computed and used as a purely descriptive tool (even when using two binary variables). However, when we wish to use Pearson correlation in the context of Hypothesis Testing, the observations should be independent and the relationship should be approximately linear. For small samples, the usual tests rely on stronger distributional assumptions (typically approximate bivariate normality).
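As a minimal sketch of the test itself, the usual t-statistic for the Null Hypothesis \(\rho = 0\) is \(t = r\sqrt{n-2}/\sqrt{1-r^2}\) with \(n-2\) degrees of freedom. The code below computes it by hand and compares it with `cor.test()`. The built-in `cars` dataset is assumed as example data, and the variable names are illustrative only.

```r
# Illustration on R's built-in `cars` dataset (an assumed example dataset)
x <- cars$speed
y <- cars$dist

n <- length(x)
r <- cor(x, y)                      # sample Pearson correlation

# t-statistic for H0: rho = 0, with n - 2 degrees of freedom
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)

# Two-sided p-value from the t-distribution
p_val <- 2 * pt(-abs(t_stat), df = n - 2)

# Cross-check against the built-in test
ref <- cor.test(x, y)
print(c(t_by_hand = t_stat, t_cor_test = unname(ref$statistic)))
print(c(p_by_hand = p_val, p_cor_test = ref$p.value))
```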
For interval estimation and normal approximation, the Fisher-\(z\) transform (Fisher 1915) is preferred:
\[
z = \frac{1}{2}\ln\left(\frac{1+r}{1-r}\right) \overset{\text{approx.}}{\sim} \text{N}\left(\frac{1}{2}\ln\left(\frac{1+\rho}{1-\rho}\right), \frac{1}{n-3}\right)
\]
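The transform can be turned into a confidence interval by building a normal interval on the \(z\) scale and transforming its limits back with the inverse function (the hyperbolic tangent). The sketch below does this for a 95% interval, again assuming the built-in `cars` dataset as example data. The names `z`, `se`, `crit`, and `ci` are illustrative.

```r
# Illustration on R's built-in `cars` dataset (an assumed example dataset)
x <- cars$speed
y <- cars$dist
n <- length(x)
r <- cor(x, y)

z    <- atanh(r)           # Fisher z: 0.5 * log((1 + r) / (1 - r))
se   <- 1 / sqrt(n - 3)    # approximate standard error of z
crit <- qnorm(0.975)       # two-sided 95% critical value of the Normal Distribution

# Interval on the z scale, back-transformed to the correlation scale
ci <- tanh(z + c(-1, 1) * crit * se)
print(ci)
```

For these data the result should agree with the confidence interval reported by `cor.test()`, which is also based on the Fisher-\(z\) transform.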
The Pearson Correlation coefficient can only describe linear relationships, whereas rank correlations, e.g. Spearman’s Rank Order correlation and Kendall’s \(\tau\), measure the strength of any monotonic relationship, whether linear or not.
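The difference can be illustrated with a small synthetic example (the data below are made up purely for illustration): `y` increases strictly with `x`, but along an exponential curve rather than a straight line.

```r
# Synthetic data (purely illustrative): y increases with x, but not linearly
x <- 1:20
y <- exp(x / 2)

pearson  <- cor(x, y, method = "pearson")   # below 1: the relationship is not linear
spearman <- cor(x, y, method = "spearman")  # equals 1: the ranks agree perfectly
kendall  <- cor(x, y, method = "kendall")   # equals 1: every pair is concordant

print(c(pearson = pearson, spearman = spearman, kendall = kendall))
```

Both rank correlations reach their maximum because the ordering of `y` matches the ordering of `x` exactly, while the Pearson coefficient stays below 1 because the points do not lie on a straight line.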
R output of the Pearson correlation test for cars$speed and cars$dist:

	Pearson's product-moment correlation

data:  cars$speed and cars$dist
t = 9.464, df = 48, p-value = 1.49e-12
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.6816422 0.8862036
sample estimates:
      cor 
0.8068949 

The same 95 percent confidence limits, as a separate result:

      cor       cor 
0.6816422 0.8862036 
Fisher, Ronald A. 1915. “Frequency Distribution of the Values of the Correlation Coefficient in Samples from an Indefinitely Large Population.” Biometrika 10 (4): 507–21. https://doi.org/10.1093/biomet/10.4.507.