78 Probability Plot Correlation Coefficient Plot (PPCC Plot)

78.1 Definition

The PPCC Plot (Filliben 1975) is generated by an iterative process:

1. a parameter which defines the shape of a user-specified distribution is set to an initial value
2. the Probability Plot (i.e. QQ Plot against the specified distribution) is computed for the current value of the shape parameter
3. the Pearson Correlation coefficient (of the Probability Plot) is computed and stored together with the value of the shape parameter
4. the shape parameter is increased by a fixed step, and steps 2 and 3 are repeated until a (pre-specified) final value is reached for the shape parameter
5. a plot is generated which shows all Pearson Correlation coefficients against their respective shape parameter values
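The procedure above can be sketched in R for an arbitrary one-parameter shape family. This is a minimal illustration and not the implementation behind the R module: the function name ppcc_curve, its qfun argument, and the choice of Weibull test data with a true shape of 2 are all assumptions made for this example.

```r
# Sketch of the PPCC procedure for an arbitrary shape family.
# ppcc_curve and qfun are hypothetical names, not part of any package.
ppcc_curve <- function(x, qfun, shapes) {
  sorted_x <- sort(x)        # ordered observations
  p <- ppoints(length(x))    # plotting positions in (0, 1)
  # one Pearson correlation of the probability plot per shape value
  r <- sapply(shapes, function(s) cor(qfun(p, s), sorted_x))
  data.frame(shape = shapes, correlation = r)
}

# Illustration data (assumed): Weibull sample with true shape 2
set.seed(1)
x <- rweibull(300, shape = 2)
shapes <- seq(0.5, 5, by = 0.1)
res <- ppcc_curve(x, function(p, s) qweibull(p, shape = s), shapes)

# final step: plot the correlations against the shape values
plot(res$shape, res$correlation, type = "b",
     xlab = "shape", ylab = "correlation")

# shape value with the highest correlation
best_shape <- res$shape[which.max(res$correlation)]
```

Because the Pearson correlation is invariant to location and scale, only the shape parameter has to be varied; the scale of the theoretical quantiles does not affect the result.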

78.2 Horizontal axis

The horizontal axis displays the values of the shape parameter, which varies between a pre-specified minimum and maximum value.

78.3 Vertical axis

The vertical axis displays the Pearson Correlation coefficients.

78.4 R Module

78.4.1 Public website

The Tukey-Lambda PPCC Plot can be found on the public website:

https://compute.wessa.net/rwasp_tukeylambda.wasp

78.4.2 RFC

The Tukey-Lambda PPCC Plot is available in RFC under the menu item “Distributions / Tukey lambda PPCC Plot”.

To compute the Tukey-Lambda PPCC Plot on your local machine, the following script can be used in the R console:

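x <- rnorm(500)   # normal data: the PPCC should peak near lambda = 0.14
#x <- runif(500)  # uniform data: the PPCC should peak near lambda = 1

# quantile function of the Tukey-Lambda distribution (undefined at lambda = 0)
gp <- function(lambda, p) {
  (p^lambda - (1 - p)^lambda) / lambda
}
sortx <- sort(x)

# correlations for lambda = -1, -0.99, ..., 1
# element 101 (lambda = 0) is skipped and stays NA
ppcc <- array(NA, dim = c(201))
for (i in 1:201) {
  if (i != 101) ppcc[i] <- cor(gp(ppoints(x), lambda = (i - 101) / 100), sortx)
}
plot((-100:100) / 100, ppcc[1:201], xlab = 'lambda', ylab = 'correlation', main = 'PPCC Plot - Tukey lambda')
grid()

# the logistic case (lambda = 0) is approximated by the average of its two neighbours
print('Tukey Lambda - Key Values')
cat(paste('\tDistribution (lambda)', 'Correlation\n',
'Approx. Cauchy (lambda=-1)', ppcc[1], '\n',
'Exact Logistic (lambda=0)', (ppcc[100] + ppcc[102]) / 2, '\n',
'Approx. Normal (lambda=0.14)', ppcc[115], '\n',
'U-shaped Dist. (lambda=0.5)', ppcc[151], '\n',
'Exactly Uniform (lambda=1)', ppcc[201], '\n', sep = '\t'))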

[1] "Tukey Lambda - Key Values"
    Distribution (lambda)   Correlation
    Approx. Cauchy (lambda=-1)  0.428325450389276   
    Exact Logistic (lambda=0)   0.994896246857765   
    Approx. Normal (lambda=0.14)    0.998464173639482   
    U-shaped Dist. (lambda=0.5) 0.988786596424759   
    Exactly Uniform (lambda=1)  0.975574929158612

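The R code uses a custom function called gp, which computes the quantile function of the Tukey-Lambda distribution. A loop iterates over lambda values between -1 and 1 with a step size of 0.01. The value lambda = 0 is skipped because the formula divides by lambda, so the correlation for the logistic case is approximated by the average of its two neighbours. If the data are drawn from a Uniform distribution instead of a Normal distribution, the optimal value of lambda changes from approximately 0.14 to 1.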
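The optimal lambda can also be extracted programmatically instead of being read from the plot. The sketch below is one possible way to do this and is not part of the original script: the helper name tukey_q and the seed are assumptions, and the case lambda = 0 is handled with the logistic quantile function qlogis, which is the limit of the Tukey-Lambda quantile function as lambda approaches 0.

```r
# Tukey-Lambda quantile function with the logistic limit at lambda = 0.
# tukey_q is a hypothetical helper name.
tukey_q <- function(p, lambda) {
  if (lambda == 0) qlogis(p) else (p^lambda - (1 - p)^lambda) / lambda
}

set.seed(42)                 # assumed seed, only for reproducibility
x <- runif(500)              # uniform data: the optimum is expected near 1
sortx <- sort(x)
p <- ppoints(length(x))

# same grid as the script above: -1, -0.99, ..., 1 (0 is represented exactly)
lambdas <- (-100:100) / 100
r <- sapply(lambdas, function(l) cor(tukey_q(p, l), sortx))

# lambda with the highest probability plot correlation
best_lambda <- lambdas[which.max(r)]
```

Because the grid ends at lambda = 1, an optimum at the upper boundary only indicates that the best value is 1 or larger within the range that was searched.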

78.5 Purpose

The PPCC Plot is used to find the shape parameter value which produces the best fit (i.e. highest correlation). If the Tukey-Lambda PPCC Plot is computed, the value of \(\lambda\) may provide information about the symmetric distribution which fits the data best. Table 78.1 shows which symmetric distributions correspond to selected values of \(\lambda\).

Warning

The Tukey-Lambda PPCC interpretation table is intended for symmetric distributions only. If the data are clearly skewed, the fitted \(\lambda\) value should not be interpreted as evidence for one of the symmetric families in Table 78.1.
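A quick way to check for skewness before interpreting \(\lambda\) is to compute the sample skewness. The sketch below uses the simple moment-based estimator. The helper name sample_skewness and the simulated data are assumptions for illustration, and no fixed cut-off for "clearly skewed" is implied.

```r
# Moment-based sample skewness; sample_skewness is a hypothetical helper name.
sample_skewness <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

set.seed(7)                                # assumed seed, for reproducibility
skew_sym <- sample_skewness(rnorm(500))    # symmetric data: close to 0
skew_exp <- sample_skewness(rexp(500))     # right-skewed data: clearly positive
```

A value far from zero suggests that the data are asymmetric, in which case the Tukey-Lambda interpretation should be avoided.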

Table 78.1: Tukey-Lambda PPCC Plot for symmetric distributions (source: NIST/SEMATECH (n.d.))

\(\lambda\)	Meaning
\(\lambda = -1\)	distribution is approximately Cauchy
\(\lambda = 0\)	distribution is exactly logistic
\(\lambda = 0.14\)	distribution is approximately normal
\(\lambda = 0.5\)	distribution is reversed U-shaped (i.e. concave)
\(\lambda = 1\)	distribution is exactly uniform
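Some of the correspondences in the table can be checked numerically with the gp function from the R module section. The following sketch compares the Tukey-Lambda quantile function with the quantile functions of the Uniform, Logistic, and Normal distributions. The variable names and the grid of 1000 probabilities are assumptions made for this illustration.

```r
# Tukey-Lambda quantile function (same definition as gp in the script above)
gp <- function(lambda, p) (p^lambda - (1 - p)^lambda) / lambda
p <- ppoints(1000)

# lambda = 1: the quantile function is 2p - 1, i.e. exactly Uniform(-1, 1)
err_unif <- max(abs(gp(1, p) - qunif(p, min = -1, max = 1)))

# lambda close to 0: the quantile function approaches the logistic quantile
err_logis <- max(abs(gp(1e-6, p) - qlogis(p)))

# lambda = 0.14: nearly (but not exactly) a linear function of the normal quantile
cor_norm <- cor(gp(0.14, p), qnorm(p))
```

The first two differences are numerically negligible, whereas the correlation for \(\lambda = 0.14\) is very close to, but not exactly, 1. This is consistent with the words "exactly" and "approximately" in the table.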

78.6 Pros & Cons

78.6.1 Pros

The Tukey-Lambda PPCC Plot has the following advantages:

it provides useful information about the distributional shape of the data under investigation
it is easy to interpret

78.6.2 Cons

The Tukey-Lambda PPCC Plot has the following disadvantages:

only a few software packages can generate this plot
most readers are not familiar with this plot
the plot is not suited for asymmetric distributions

78.7 Example

The following analysis shows the Tukey-Lambda PPCC Plot for the monthly marriages time series in Belgium. From this analysis it can be concluded that the Uniform Distribution has the best fit for the data (see also Table 78.1).

Interactive Shiny app (click to load).


78.8 Task

Compute the Tukey-Lambda PPCC Plot for the monthly divorces time series and interpret the results. Why does the divorces time series exhibit a distribution which is completely different from that of the marriages time series?