In a first step, we need to induce stationarity in the mean and the variance of the time series under investigation. In practice, most time series do not satisfy the stationarity conditions required to apply univariate forecasting models. Such time series are called non-stationary and should be differenced and transformed until they become stationary with respect to the mean and the variance.
150.1 Stationarity of the mean
With the use of the Autocorrelation Function (ACF) it is possible to detect non-stationarity of the time series with respect to the mean level. As an alternative, it is possible to use the Cumulative Periodogram (CP) to identify non-stationarity. A third diagnostic tool is the so-called Variance Reduction Matrix (VRM) which lists the variances of the time series after several combinations of non-seasonal and seasonal differencing have been applied. As a general rule, the optimal degree of differencing induces stationarity in the mean with a minimum variance.
Stationarity of the mean can be induced by using the backshift and differencing operators. The backshift operator introduces time lags as is illustrated in the following examples:
\(B Y_t = Y_{t-1}\)
\(B^k Y_t = Y_{t-k}\)
\(B_s Y_t = Y_{t-s}\)
\(B^k_s Y_t = Y_{t-k s}\)
The differencing operator is called nabla and expresses the time series in terms of past changes:
\(\nabla Y_t = (1 - B) Y_t = Y_t - Y_{t-1}\)
\(\nabla_s Y_t = (1 - B_s) Y_t = Y_t - Y_{t-s}\)
In practice, we apply the following differencing operators to induce stationarity in the mean:
\(\nabla^d \nabla^D_s Y_t = W_t\)
where
\(Y_t\) is the original (raw) time series
\(d\) is the degree or order of non-seasonal differencing
\(D\) is the degree or order of seasonal differencing
\(s\) is the seasonal period
\(W_t\) represents the stationary (working) time series
Differencing can be applied to remove non-seasonal and/or seasonal trends. This procedure works reasonably well for a wide range of time series that are naturally observed. Hence, there are only two parameters that need to be determined to induce stationarity in the mean in most commonly encountered time series: the degree of non-seasonal differencing \(d\) and the degree of seasonal differencing \(D\).
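The differencing recipe above can be sketched in a few lines of code. This is a minimal, self-contained Python illustration (the function names are ours, not part of the book's R module):

```python
# Minimal sketch of the nabla operators: W_t = nabla^d nabla_s^D Y_t.
def difference(y, lag=1):
    """One round of differencing at the given lag: Y_t - Y_{t-lag}."""
    return [y[t] - y[t - lag] for t in range(lag, len(y))]

def make_stationary(y, d=0, D=0, s=12):
    """Apply d rounds of non-seasonal and D rounds of seasonal differencing."""
    w = list(y)
    for _ in range(d):
        w = difference(w, lag=1)   # nabla
    for _ in range(D):
        w = difference(w, lag=s)   # nabla_s
    return w

# A deterministic linear trend vanishes after one non-seasonal difference:
trend = [2 * t for t in range(10)]
print(make_stationary(trend, d=1))  # [2, 2, 2, 2, 2, 2, 2, 2, 2]
```

Note that each round of differencing shortens the series: one non-seasonal and one seasonal difference together remove \(1 + s\) observations.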
As a general rule, we can detect a non-seasonal stochastic trend (unit-root-type behavior) in the ACF if the sequence of the first \(s/2\) autocorrelations exhibits a slowly decreasing pattern. The degree of non-seasonal differencing \(d\) is increased by one unit as long as this pattern is observed. In most cases, the pattern will disappear after a single round of non-seasonal differencing has been applied (i.e. \(d=1\)).
We can detect a seasonal trend in the ACF if the sequence of autocorrelations at lags \(s, 2s, 3s, \ldots\) exhibits a slowly decreasing pattern. The degree of seasonal differencing \(D\) is increased by one unit as long as this pattern is observed. In most cases, the pattern will disappear after a single round of seasonal differencing has been applied (i.e. \(D=1\)).
In practice, we will often encounter time series that can be made stationary in the mean through differencing with \(d \in \{0, 1, 2\}\) and \(D \in \{0, 1\}\).
In addition, we can use the CP to identify seasonal and non-seasonal trend components whenever the CP line exhibits large (step-wise) increases which correspond to the long term or seasonal periods.
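The slow-decay rule can be made concrete with a small sample-ACF computation. The following Python sketch is illustrative only (the book's diagnostics are produced by an R module):

```python
# Sample autocorrelation function (biased estimator, as commonly plotted).
from statistics import mean

def acf(y, max_lag):
    m = mean(y)
    c0 = sum((v - m) ** 2 for v in y)  # sum of squared deviations
    return [sum((y[t] - m) * (y[t + k] - m) for t in range(len(y) - k)) / c0
            for k in range(1, max_lag + 1)]

# A trending series produces autocorrelations near 1 that decay only slowly,
# which is the signal to increase the degree of non-seasonal differencing d.
r = acf(list(range(100)), 5)
print([round(v, 3) for v in r])
```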
150.1.1 Example: Unemployment
The Unemployment time series should be differenced with the non-seasonal differencing operator because the ACF is slowly decreasing, as is shown in the output:
Hence, we set the slider to \(d=1\) (degree of non-seasonal differencing = 1) and recompute the ACF. The result indicates that the time series also contains a seasonal trend (observe how the ACF at the seasonal lags is slowly decreasing).
Therefore, we must set \(d=D=1\) (\(D\) is the degree of seasonal differencing) by moving the seasonal differencing slider and recompute the ACF.
The ACF (with \(d=D=1\)) suggests that the time series \(\nabla \nabla_{12} Y_t\) is stationary in the mean. This will be verified through spectral analysis in order to gain more confidence in the degrees of non-seasonal and seasonal differencing.
The output shows the CP of the original time series and indicates a non-seasonal stochastic trend because the CP value increases sharply at the frequency of the longest period (i.e. on the left side of the chart).
If we apply non-seasonal differencing (\(d=1\)) then the CP indicates the presence of a seasonal trend. The CP (with \(d=D=1\) and \(s=12\)) suggests that the time series \(\nabla \nabla_{12} Y_t\) is stationary in the mean.
Finally, we examine the VRM (Variance Reduction Matrix) with seasonal period \(s = 12\). The analysis confirms the above findings because the lowest (trimmed) variance can be found for \(d=D=1\):
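A Variance Reduction Matrix can be tabulated directly from its definition: difference the series for every combination of \(d\) and \(D\) and record the variance of the result. A hedged Python sketch with our own helper names (the book's VRM uses a trimmed variance, which we omit here):

```python
# Variance of W_t = nabla^d nabla_s^D Y_t for each (d, D) combination.
import math
from statistics import pvariance

def difference(y, lag=1):
    return [y[t] - y[t - lag] for t in range(lag, len(y))]

def vrm(y, s=12, max_d=2, max_D=1):
    table = {}
    for d in range(max_d + 1):
        for D in range(max_D + 1):
            w = list(y)
            for _ in range(d):
                w = difference(w, 1)
            for _ in range(D):
                w = difference(w, s)
            table[(d, D)] = pvariance(w)
    return table

# Trend plus monthly seasonality: differencing sharply reduces the variance.
y = [0.05 * t + math.sin(2 * math.pi * t / 12) for t in range(120)]
for (d, D), v in sorted(vrm(y).items()):
    print(f"d={d} D={D} var={v:.4f}")
```

The \((d, D)\) cell with the lowest variance is the candidate differencing order, exactly as in the rule of thumb above.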
150.1.2 Example: Births
The ACF and CP seem to suggest \(D=0\) and \(d=0\) (even though there could be some doubt about \(d=1\)). The VRM, however, suggests \(D=1\) and \(d=0\). There seems to be a discrepancy between the three diagnostic tools. Therefore, we formulate three alternative models:
Model 1: \(Y_t\)
Model 2: \(\nabla Y_t\)
Model 3: \(\nabla_{12} Y_t\)
150.1.3 Example: Soldiers
We examine the following computations to find the appropriate values for \(d\) and \(D\):
All three diagnostics suggest that \(D=0\) and \(d=1\). Therefore we can conclude that \(\nabla Y_t\) is stationary in the mean.
150.1.4 Example: Traffic
The Traffic data can be examined in a similar way. Hint: this time series has no seasonality. Therefore, it is not possible to find any seasonal effects.
The ACF and VRM suggest that \(D=0\) and \(d=1\). Based on the CP we may doubt whether \(d=1\) is appropriate or not. Therefore, we formulate two alternative models:
Model 1: \(Y_t\)
Model 2: \(\nabla Y_t\)
150.1.5 Example: Pageviews
The Pageviews data can be examined in a similar way. Note: the Pageviews time series has a daily sampling frequency. Therefore, we should use \(s=7\) instead of \(s=12\).
150.2 Stationarity of the variance
150.2.1 Transformation of time series
If we write a time series \(Y_t\) as the sum of a deterministic mean and a disturbance term
\[
Y_t = \mu_t + e_t
\]
then the relationship between \(\text{V}(Y_t)\) and \(\mu_t\) may be of the form
\[
\text{V}(Y_t) = \sigma^2 h^2(\mu_t)
\]
where \(h\) is an arbitrary function.
The time series \(Y_t\) must therefore be transformed in order to stabilize the variance. Denote the transformed series by \(g(Y_t)\) and expand it in a first-order Taylor series around \(\mu_t\)
\[
g(Y_t) \approx g(\mu_t) + (Y_t - \mu_t) g'(\mu_t)
\]
so that
\[
\text{V}(g(Y_t)) \approx [g'(\mu_t)]^2 \sigma^2 h^2(\mu_t).
\]
The variance is (approximately) stabilized by choosing \(g\) such that \(g'(\mu_t)\) is proportional to \(1 / h(\mu_t)\).
In the Standard Deviation-Mean Plot, the functional relationship is assumed to be as follows
\[
\sigma_{Y_t} = \alpha \mu_{Y_t}^{1-\lambda}
\]
The value of \(\lambda\) is the parameter of the so-called “simple” Box-Cox transformation (Box and Cox 1964)
\[
\begin{cases} Y_t^\lambda & \text{for } \lambda \neq 0 \\ \ln Y_t & \text{for } \lambda = 0 \end{cases}
\]
Depending on the type of relationship between the Arithmetic Mean and the Standard Deviation, a different value of \(\lambda\) will be chosen, as shown in Figure 150.1.
Figure 150.1: Theoretical SMP patterns
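The simple transformation is straightforward to apply in code; a minimal Python sketch (the \(\lambda = 0\) branch requires strictly positive observations):

```python
# "Simple" Box-Cox transform: Y^lambda for lambda != 0, ln(Y) for lambda = 0.
import math

def boxcox_simple(y, lam):
    if lam == 0:
        return [math.log(v) for v in y]
    return [v ** lam for v in y]

print(boxcox_simple([1.0, 4.0, 9.0], 0.5))  # square roots: [1.0, 2.0, 3.0]
print(boxcox_simple([1.0, math.e], 0))      # natural logs, approx. [0.0, 1.0]
```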
150.2.2 Standard Deviation-Mean Plot (revisited)
We examine the relationship between the mean and the standard deviation of the time series in order to detect a common form of heteroskedasticity which can be easily removed by the use of a (simplified) Box–Cox transform that is defined as follows: \(Y_t^{\lambda}\) for \(\lambda \neq 0\) and \(\ln Y_t\) for \(\lambda = 0\).
The Standard Deviation-Mean Plot allows us to identify whether a transformation is necessary and it also provides an estimate for \(\lambda\). Note: it is not always possible to find appropriate values for \(\lambda\). In addition, there is no guarantee that the Box-Cox transform allows us to induce stationarity of the Variance. There are many other types of transformation and analysis that might be useful in this respect -- these, however, are beyond the scope of this book.
Formally, the SMP computes two Simple Linear Regression Models based on the Standard Deviation and Arithmetic Mean of sequential blocks. The first model is
\[
\sigma_i = \alpha + \beta \mu_i + \epsilon_i
\]
for \(i = 1, 2, …, k\), where \(\sigma_i\) is the Standard Deviation, \(\mu_i\) is the Arithmetic Mean, and \(k\) is the number of sequential blocks. In most cases, we use a “block width” which is equal to the seasonal period \(s\) (e.g. \(s=12\) for monthly time series).
The Hypothesis Test which is used to decide whether a Box-Cox transformation is required is formulated as follows
\[
H_0: \beta = 0 \quad \text{versus} \quad H_1: \beta \neq 0
\]
unless we have prior knowledge about the relationship between \(\sigma_i\) and \(\mu_i\).
If we reject the Null Hypothesis then we decide that a Box-Cox transformation is required, i.e. \(\lambda \neq 1\).
The second Simple Linear Regression Model is only required if the Null Hypothesis H\(_0: \beta = 0\) is rejected. It can be shown that the (quasi-)optimal value for \(\lambda\) can be obtained from the log-log regression
\[
\ln \sigma_i = \alpha + \beta \ln \mu_i + \epsilon_i
\]
as \(\hat{\lambda} = 1 - \hat{\beta}\) (compare the assumed relationship \(\sigma_{Y_t} = \alpha \mu_{Y_t}^{1-\lambda}\)),
which (implicitly) assumes that \(\forall i = 1, 2, …,k: \mu_i > 0\). If any local mean \(\mu_i \leq 0\) then we simply add a constant \(c\) to all observations such that no negative local means remain.
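The two-regression recipe can be sketched as follows. This is an illustrative Python version under the stated assumptions (block width \(s\), all local means positive), with our own function names:

```python
# SMP sketch: block-wise means and standard deviations, then the log-log
# regression ln(sigma_i) = alpha + beta * ln(mu_i); lambda-hat = 1 - beta-hat.
import math
from statistics import mean, stdev

def ols_slope(x, y):
    """Least-squares slope of y on x."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / \
           sum((a - mx) ** 2 for a in x)

def smp_lambda(y, s=12):
    blocks = [y[i:i + s] for i in range(0, len(y) - s + 1, s)]
    mus = [mean(b) for b in blocks]     # assumes all mu_i > 0
    sds = [stdev(b) for b in blocks]
    beta = ols_slope([math.log(m) for m in mus], [math.log(v) for v in sds])
    return 1.0 - beta

# If the local standard deviation grows proportionally to the local mean,
# the estimate is lambda ~ 0, i.e. a log transform:
pattern = [0.8, 0.9, 1.0, 1.1, 1.2, 1.0, 0.8, 0.9, 1.1, 1.2, 1.0, 1.0]
y = [scale * v for scale in (1, 2, 4, 8) for v in pattern]
print(round(smp_lambda(y, s=12), 6))  # ~ 0.0
```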
150.2.3 Box-Cox Normality Plot
150.2.3.1 Definition
As an alternative to the SMP, it is sometimes useful to employ the Box-Cox Normality Plot, which attempts to estimate the (quasi-)optimal value of \(\lambda\) based on the so-called Maximum Likelihood Estimation procedure.
The Box-Cox Normality Plot features two flavors of the Box-Cox transformation: the so-called “full” version and the “simplified” version.
150.2.3.1.1 Full Box-Cox Transformation
The full Box-Cox transformation is defined as follows
\[
\begin{cases} \dfrac{\operatorname{sign}(Y_t)\,|Y_t|^\lambda - 1}{\lambda} & \text{for } \lambda \neq 0 \\ \ln Y_t & \text{for } \lambda = 0 \end{cases}
\]
which is the default setting for the Box-Cox Normality Plot.
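For completeness, the full version in the same illustrative Python style (note that the \(\lambda = 0\) branch still requires positive data):

```python
# "Full" Box-Cox: (sign(Y) * |Y|^lambda - 1) / lambda, or ln(Y) at lambda = 0.
import math

def boxcox_full(y, lam):
    if lam == 0:
        return [math.log(v) for v in y]
    return [(math.copysign(abs(v) ** lam, v) - 1.0) / lam for v in y]

print(boxcox_full([2.0, 5.0], 1))  # shifts by -1: [1.0, 4.0]
```

Unlike the simple version, this form is continuous in \(\lambda\) at 0 (for positive data), which makes it convenient for the likelihood scan used by the normality plot.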
150.2.3.1.2 Simple Box-Cox Transformation
The simplified Box-Cox transformation has already been defined in the SMP procedure.
150.2.3.2 Horizontal axis
The horizontal axis of the Box-Cox Normality Plot shows the values of \(\lambda\).
150.2.3.3 Vertical axis
The vertical axis of the Box-Cox Normality Plot shows the correlation of the Normal QQ Plot.
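The vertical-axis quantity can be reproduced with a small amount of code: transform the series for a grid of \(\lambda\) values and correlate the sorted result with standard-normal quantiles. A Python sketch under our own naming (the actual plot is produced by the R module):

```python
# QQ-plot correlation per lambda; the lambda with the highest correlation
# is the Box-Cox Normality Plot's suggested value.
import math
from statistics import NormalDist, mean

def qq_correlation(y):
    n = len(y)
    ys = sorted(y)
    q = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    my, mq = mean(ys), mean(q)
    num = sum((a - my) * (b - mq) for a, b in zip(ys, q))
    den = math.sqrt(sum((a - my) ** 2 for a in ys) *
                    sum((b - mq) ** 2 for b in q))
    return num / den

def best_lambda(y, grid=None):
    grid = grid or [i / 20 for i in range(-20, 41)]  # lambda in [-1, 2]
    def simple_boxcox(lam):
        return [math.log(v) for v in y] if lam == 0 else [v ** lam for v in y]
    return max(grid, key=lambda lam: qq_correlation(simple_boxcox(lam)))
```

For example, a log-normally shaped series yields a maximum near \(\lambda = 0\), i.e. the log transform.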
150.2.3.4 R Module
The Box-Cox Normality Plot is available on the public website:
Consider the Airline time series, which clearly exhibits a typical form of heteroskedasticity that can be removed through a Box-Cox transformation. The Box-Cox Normality Plot is shown in the following analysis and suggests that \(\lambda \simeq 0.22\) should be appropriate to induce normality:
The output of the ML estimation shows that \(\lambda\) is (approximately) 0.148 with a 95% confidence interval of [-0.2374, 0.5335] which implies that \(\lambda\) is significantly different from 1 (\(p \simeq 1.59e-05\)) but not significantly different from 0 (\(p \simeq 0.45\)).
From the information shown above it is clear that the Box-Cox transform is able to induce approximate normality in the time series.
150.2.4 Example: Unemployment
Based on the SMP we identify the value for \(\lambda\) of the Box-Cox transformation.
The SMP shows that there is a significant relationship between the standard deviation and the mean of each year. The slope (\(\beta\)) of the regression line is significantly different from zero (\(p \simeq 0.0038\)), i.e. the Null Hypothesis H\(_0: \beta = 0\) is rejected. Hence it is necessary to apply a transformation which induces stationarity of the variance. The quasi-optimal \(\lambda\) value is computed in the second regression model and is (approximately) equal to 0.47, which may be rounded to 0.5 (this corresponds to \(\sqrt{Y_t}\)). This rounding is admissible because the standard error in the second model is (approximately) 0.115, which implies that 0.5 is contained in the 2-\(\sigma\) confidence interval around \(\hat{\lambda}\): \(0.5 \in [0.467 - 2 \times 0.115, 0.467 + 2 \times 0.115]\), i.e. \(\lambda\) is not significantly different from 0.5.
We conclude that the Unemployment time series can be transformed in order to induce stationarity of the variance (\(\lambda = 0.5\)).
150.2.5 Example: Births
Based on the SMP we identify the value for \(\lambda\) of the Box-Cox transformation.
The SMP shows that there is no relationship between the standard deviation and the mean of subsequent years. Therefore, there is no need to apply the Box–Cox transformation (\(\lambda = 1\)).
150.2.6 Example: Soldiers
Based on the SMP we identify the value for \(\lambda\) of the Box-Cox transformation.
The SMP shows that there is no relationship between the standard deviation and the mean of sequential years. Therefore, there seems to be no need to apply the Box–Cox transformation (\(\lambda = 1\)).
Unfortunately, the SMP analysis for this time series might be somewhat misleading because the relationship between \(\sigma_i\) and \(\mu_i\) (for \(i = 1, 2, …, k\)) is probably not linear. The analysis clearly shows that the last year of the time series is drastically different from the previous years: both \(\sigma_k\) and \(\mu_k\) are very small compared to the other years (the decision to withdraw troops from Iraq clearly resulted in fewer casualties). This can also be observed in the scatter plots of the analysis, where the last year appears in the bottom-left area of the graph. If we eliminated the last year from the computation, we would probably see a completely different regression line (one with a slightly negative slope).
We conclude that the SMP analysis is not always suited as a tool to find (quasi-)optimal values for \(\lambda\). The Simple Linear Regression Model can be (very) sensitive to outliers which may lead to wrong conclusions. We should always keep in mind that statistical methods make assumptions (which are not always satisfied).
150.2.7 Example: Traffic
Based on the SMP (with seasonality set to 7), we identify the value for \(\lambda\) of the Box-Cox transformation.
The SMP shows that there is a relationship between the standard deviation and the mean of sequential blocks. The slope (\(\beta\)) of the regression line is significantly different from zero (\(p \simeq 2e-16\)). The quasi-optimal \(\lambda\) value in the second regression equals -0.28, which cannot be rounded to 0 because the standard error is only 0.08. In other words, the Traffic time series can be transformed in order to induce stationarity of the variance.
150.2.8 Example: Pageviews
To compute the SMP for the Pageviews time series we have to use a seasonality parameter of 7 instead of 12.
The SMP indicates a strong relationship between the standard deviation and the mean of sequential blocks. The slope (\(\beta\)) of the regression line is significantly different from zero (\(p < 2e-16\)). The quasi-optimal \(\lambda\) value is computed in the second regression (0.11) and cannot be rounded to 0 because the standard error is only 0.036. In other words, the Pageviews time series should be transformed with \(\lambda = 0.11\) (i.e. \(Y_t^{0.11}\)) in order to induce stationarity of the variance.
150.3 Why do we need stationarity?
The most fundamental justification for time series analysis (as described in this chapter) is due to Wold’s decomposition theorem (Wold 1938). In practical terms, it states that a stationary time series can be decomposed into two parts: (1) a predictable component that is fully determined by past information, and (2) an innovation component driven by current and past shocks. The innovation component is generally represented as a one-sided moving-average expansion (not necessarily a finite-order MA model).
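In symbols, Wold's theorem states that any (covariance-)stationary process \(Y_t\) can be written as
\[
Y_t = V_t + \sum_{j=0}^{\infty} \psi_j \varepsilon_{t-j}, \qquad \psi_0 = 1, \qquad \sum_{j=0}^{\infty} \psi_j^2 < \infty
\]
where \(V_t\) is the (linearly) deterministic, predictable component and \(\varepsilon_t\) is white noise; the infinite sum is the one-sided moving-average expansion of the innovation component.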
What does this all mean? In simple words, once stationarity has been induced, AR/MA/ARMA models provide useful finite approximations to the underlying dynamics. There is no general theorem guaranteeing that every arbitrary time series can be transformed into a stationary one. In applied work, however (especially for many economic time series), approximate stationarity can often be induced through transformations and differencing, but this should always be verified with diagnostics.
Box, George E. P., and David R. Cox. 1964. “An Analysis of Transformations.” Journal of the Royal Statistical Society, Series B (Methodological) 26 (2): 211–52. https://doi.org/10.1111/j.2517-6161.1964.tb00553.x.
Wold, Herman. 1938. A Study in the Analysis of Stationary Time Series. Uppsala: Almqvist & Wiksell.
Actually, the time series could have multiple levels of seasonality (weekly, monthly, and annually). In this case, however, we limit ourselves to the investigation of weekly seasonality only.↩︎
It is often the case that transforming data towards normality also helps to induce stationarity of the variance. Strictly speaking, however, normality and stationarity (of the variance) are not the same.↩︎