152 Estimating ARMA Parameters and Residual Diagnostics

This chapter implements the practical workflow introduced in Section 151.7: start from a general ARIMA specification, simplify it, and retain only models that pass residual diagnostics.

152.1 Example: Unemployment

In the ARIMA Backward Selection R module that can be found on the public website (https://compute.wessa.net/rwasp_arimabackwardselection.wasp) we use the following values:

\(\lambda = 0.5\)
d = 1
D = 1
p = 3 (maximum value)
q = 1 (maximum value)
P = 2 (maximum value)
Q = 1 (maximum value)

The values of p, q, P, and Q are unknown. In practice, we start with plausible maxima and estimate a general model first. Then we simplify iteratively by removing weak terms. In this handbook, two criteria are used jointly:

statistical significance of individual parameters (typical threshold: \(p < 0.05\)),
information criteria (AIC/BIC; Akaike (1974); Schwarz (1978)) to avoid overfitting.

The process stops when no further simplification improves the fit-complexity trade-off and residual diagnostics remain acceptable.
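A single step of this simplification can be sketched in a few lines of R. This is a minimal illustration rather than the module's actual code: it uses log(AirPassengers) as a placeholder series (an assumption, not the Unemployment data) and a normal approximation for the p-values, and the names `full`, `reduced`, and `weakest` are chosen here for illustration only.

```r
# Placeholder series (assumption): log(AirPassengers), not the Unemployment data.
y <- log(AirPassengers)

# Step 0: general model with the maximum orders p=3, q=1, P=2, Q=1 and d=D=1.
full <- arima(y, order = c(3, 1, 1),
              seasonal = list(order = c(2, 1, 1), period = 12),
              method = "ML")

# Approximate two-sided p-values from estimate / standard error
# (normal approximation; an assumption made for brevity).
se   <- sqrt(diag(full$var.coef))
pval <- 2 * pnorm(-abs(full$coef / se))

# The weakest term is the one with the largest p-value.
weakest <- which.max(pval)

# Step 1: refit with that coefficient fixed at zero (NA = still estimated).
fixed <- rep(NA_real_, length(full$coef))
fixed[weakest] <- 0
reduced <- arima(y, order = c(3, 1, 1),
                 seasonal = list(order = c(2, 1, 1), period = 12),
                 method = "ML", fixed = fixed, transform.pars = FALSE)

print(round(cbind(estimate = full$coef, p.value = pval), 4))
cat("Removed term:", names(full$coef)[weakest], "\n")
print(c(AIC_full = full$aic, AIC_reduced = reduced$aic))
```

Repeating the step until every remaining p-value is below the threshold gives the backward selection path. Comparing the AIC values at each step shows whether dropping a term actually improved the fit-complexity trade-off.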

The interactive Shiny app integrated into this handbook generates the same output, provided that you set the appropriate values for \(\lambda\), d, and D.

According to the ARIMA Backward Selection procedure, the Unemployment time series contains an AR(2), MA(1), and SMA(1) process (p=2, q=1, P=0, Q=1). This implies that the complete ARIMA model can be written as follows:

\[ (1-\phi_1 B -\phi_2 B^2) \nabla \nabla_{12} \sqrt{Y_t} = (1 - \theta_1 B) (1 - \Theta_1 B^{12}) e_t \]
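To see what the operator notation implies for individual observations, the model can be expanded term by term. The shorthand \(W_t = \nabla \nabla_{12} \sqrt{Y_t}\) is introduced here only for illustration. The differenced series is:

\[ W_t = \sqrt{Y_t} - \sqrt{Y_{t-1}} - \sqrt{Y_{t-12}} + \sqrt{Y_{t-13}} \]

Multiplying out both sides of the model then gives:

\[ W_t = \phi_1 W_{t-1} + \phi_2 W_{t-2} + e_t - \theta_1 e_{t-1} - \Theta_1 e_{t-12} + \theta_1 \Theta_1 e_{t-13} \]

When comparing with software output, note that R's arima() function writes the moving average part with plus signs. A coefficient reported as ma1 or sma1 therefore corresponds to \(-\theta_1\) or \(-\Theta_1\) in the notation used here.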

The parameter estimates can be observed in the coefficient chart of the app. In the Diagnostics tab, the key decision checks are:

residual ACF/PACF: no systematic significant spikes,
Ljung-Box p-value (Ljung and Box 1978): not significant at standard levels,
residual distribution plots (QQ, histogram): approximate symmetry and no severe tail problems.

If these checks fail, revisit the model orders and re-estimate.
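The Ljung-Box check can also be computed directly in R. The sketch below is illustrative only: it fits the model that the local template further below ends up with for its placeholder series, log(AirPassengers), and not the Unemployment model. The choice of lags 12, 24, and 36 is an assumption made for this example.

```r
# Placeholder series and model (assumptions): log(AirPassengers) with
# MA(1) and SMA(1) terms after regular and seasonal differencing.
y   <- log(AirPassengers)
fit <- arima(y, order = c(0, 1, 1),
             seasonal = list(order = c(0, 1, 1), period = 12),
             method = "ML")

k    <- length(fit$coef)   # number of estimated ARMA coefficients
lags <- c(12, 24, 36)      # illustrative lag choices

# Ljung-Box p-values; fitdf reduces the degrees of freedom by the
# number of estimated coefficients.
lb_p <- sapply(lags, function(L)
  Box.test(fit$residuals, lag = L, type = "Ljung-Box", fitdf = k)$p.value)
names(lb_p) <- paste0("lag", lags)
print(round(lb_p, 4))
```

P-values above the chosen significance level (for example 0.05) at all lags indicate that the residuals show no detectable autocorrelation.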

152.1.1 Model Selection Criteria

For ARIMA estimation in this handbook, we use a combined decision rule:

Keep parameters that are practically meaningful and statistically supported (typical cutoff \(p < 0.05\)).
Prefer lower AIC (Akaike 1974) / BIC (Schwarz 1978) among diagnostically acceptable models.
Reject any model with clearly non-white residuals, even if AIC is slightly lower.

This avoids a common mistake: selecting a model only by significance or only by AIC, without checking residual adequacy.
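The combined rule can be sketched as a small comparison table. The candidate set, the placeholder series log(AirPassengers), the Ljung-Box lag of 24, and the helper name `compare_one` are all assumptions made for this illustration. BIC is computed by hand from the log-likelihood so that the formula is visible.

```r
# Placeholder series (assumption): log(AirPassengers).
y <- log(AirPassengers)

# Hypothetical candidate set: non-seasonal (p, d, q) orders, each combined
# with the same seasonal part (0, 1, 1) at period 12.
candidates <- list(c(0, 1, 1), c(1, 1, 0), c(1, 1, 1))

compare_one <- function(ord) {
  fit <- arima(y, order = ord,
               seasonal = list(order = c(0, 1, 1), period = 12),
               method = "ML")
  k  <- length(fit$coef)  # estimated ARMA coefficients
  lb <- Box.test(fit$residuals, lag = 24, type = "Ljung-Box", fitdf = k)
  data.frame(model = paste0("(", paste(ord, collapse = ","), ")(0,1,1)[12]"),
             AIC   = fit$aic,
             BIC   = -2 * fit$loglik + log(fit$nobs) * (k + 1),  # +1 for sigma^2
             LB_p  = lb$p.value)
}

tab <- do.call(rbind, lapply(candidates, compare_one))
tab$acceptable <- tab$LB_p > 0.05  # the diagnostic gate comes first

# Among diagnostically acceptable models, prefer the lowest BIC.
ok   <- tab[tab$acceptable, ]
best <- if (nrow(ok) > 0) ok$model[which.min(ok$BIC)] else NA_character_
print(tab)
print(best)
```

Filtering on the diagnostic first and ranking by the information criterion second mirrors the order of the rule: a model with non-white residuals is never selected, whatever its AIC or BIC.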

152.1.2 Generic Local Template (AirPassengers placeholder)

To run ARIMA Backward Selection on your local machine, use the following script, which is intentionally generic. It uses AirPassengers as a placeholder series so you can test the workflow quickly.

To replicate this chapter’s Unemployment example, replace the data line (x <- AirPassengers) with the Unemployment series and keep the same parameter settings.

library(lattice)
x <- AirPassengers  # placeholder dataset for the generic template
par1 = FALSE #Include mean?
par2 = 0.0 #Box-Cox lambda transformation parameter
par3 = 1 #degree of non-seasonal differencing
par4 = 1 #degree of seasonal differencing
par5 = 12 #seasonal period
par6 = 3 #degree (p) of the non-seasonal AR(p) polynomial
par7 = 1 #degree (q) of the non-seasonal MA(q) polynomial
par8 = 2 #degree (P) of the seasonal AR(P) polynomial
par9 = 1 #degree (Q) of the seasonal MA(Q) polynomial
armaGR <- function(arima.out, names, n){
  try1 <- arima.out$coef
  try2 <- sqrt(diag(arima.out$var.coef))
  try.data.frame  <- data.frame(matrix(NA,ncol=4,nrow=length(names)))
  dimnames(try.data.frame) <- list(names,c('coef','std','tstat','pv'))
  try.data.frame[,1] <- try1
  for(i in 1:length(try2)) try.data.frame[which(rownames(try.data.frame)==names(try2)[i]),2] <- try2[i]
  try.data.frame[,3] <- try.data.frame[,1] / try.data.frame[,2]
  try.data.frame[,4] <- round((1-pt(abs(try.data.frame[,3]),df=n-(length(try2)+1)))*2,5)
  vector <- rep(NA,length(names))
  vector[is.na(try.data.frame[,4])] <- 0
  maxi <- which.max(try.data.frame[,4])
  continue <- max(try.data.frame[,4],na.rm=TRUE) > .05
  vector[maxi] <- 0
  list(summary=try.data.frame,next.vector=vector,continue=continue)
}
arimaSelect <- function(series, order=c(13,0,0), seasonal=list(order=c(2,0,0),period=12), include.mean=F){
  nrc <- order[1]+order[3]+seasonal$order[1]+seasonal$order[3]
  coeff <- matrix(NA, nrow=nrc*2, ncol=nrc)
  pval <- matrix(NA, nrow=nrc*2, ncol=nrc)
  mylist <- rep(list(NULL), nrc)
  names  <- NULL
  if(order[1] > 0) names <- paste('ar',1:order[1],sep='')
  if(order[3] > 0) names <- c( names , paste('ma',1:order[3],sep='') )
  if(seasonal$order[1] > 0) names <- c(names, paste('sar',1:seasonal$order[1],sep=''))
  if(seasonal$order[3] > 0) names <- c(names, paste('sma',1:seasonal$order[3],sep=''))
  arima.out <- arima(series, order=order, 
  seasonal=seasonal, include.mean=include.mean, method='ML')
  mylist[[1]] <- arima.out
  last.arma <- armaGR(arima.out, names, length(series))
  mystop <- FALSE
  i <- 1
  coeff[i,] <- last.arma[[1]][,1]
  pval [i,] <- last.arma[[1]][,4]
  i <- 2
  aic <- arima.out$aic
  while(!mystop){
    mylist[[i]] <- arima.out
    arima.out <- arima(series, order=order, seasonal=seasonal, 
    include.mean=include.mean, method='ML', 
    fixed=last.arma$next.vector)
    aic <- c(aic, arima.out$aic)
    last.arma <- armaGR(arima.out, names, length(series))
    mystop <- !last.arma$continue
    coeff[i,] <- last.arma[[1]][,1]
    pval [i,] <- last.arma[[1]][,4]
    i <- i+1
  }
  list(coeff, pval, mylist, aic=aic)
}
arimaSelectplot <- function(arimaSelect.out,noms,choix){
  noms <- names(arimaSelect.out[[3]][[1]]$coef)
  coeff <- arimaSelect.out[[1]]
  k <- min(which(is.na(coeff[,1])))-1
  coeff <- coeff[1:k,]
  pval  <- arimaSelect.out[[2]][1:k,]
  aic   <- arimaSelect.out$aic[1:k]
  coeff[coeff==0] <- NA
  n <- ncol(coeff)
  if(missing(choix)) choix <- k
  layout(matrix(c(1,1,1,2,
                  3,3,3,2,
                  3,3,3,4,
                  5,6,7,7),nr=4),
         widths=c(10,35,45,15),
         heights=c(30,30,15,15))
  couleurs <- rainbow(75)[1:50]#(50)
  ticks <- pretty(coeff)
  op <- par(mar=c(1,1,3,1))
  plot(aic,k:1-.5,type='o',pch=21,bg='blue',cex=2,axes=F,lty=2,xpd=NA)
  points(aic[choix],k-choix+.5,pch=21,cex=4,bg=2,xpd=NA)
  title('aic',line=2)
  par(mar=c(3,0,0,0))
  plot(0,axes=F,xlab='',ylab='',xlim=range(ticks),ylim=c(.1,1))
  rect(xleft  = min(ticks) + (0:49)/50*(max(ticks)-min(ticks)),
       xright = min(ticks) + (1:50)/50*(max(ticks)-min(ticks)),
       ytop   = rep(1,50),
       ybottom= rep(0,50),col=couleurs,border=NA)
  axis(1,ticks)
  rect(xleft=min(ticks),xright=max(ticks),ytop=1,ybottom=0)
  text(mean(coeff,na.rm=T),.5,'coefficients',cex=2,font=2)
  par(mar=c(1,1,3,1))
  image(1:n,1:k,t(coeff[k:1,]),axes=F,col=couleurs,zlim=range(ticks))
  for(i in 1:n) for(j in 1:k) if(!is.na(coeff[j,i])) {
    if(pval[j,i]<.01)                            symb = 'green'
    else if( (pval[j,i]<.05) & (pval[j,i]>=.01)) symb = 'orange'
    else if( (pval[j,i]<.1)  & (pval[j,i]>=.05)) symb = 'red'
    else                                         symb = 'black'
    polygon(c(i+.5   ,i+.2   ,i+.5   ,i+.5),
            c(k-j+0.5,k-j+0.5,k-j+0.8,k-j+0.5),
            col=symb)
    if(j==choix)  {
      rect(xleft=i-.5,
           xright=i+.5,
           ybottom=k-j+1.5,
           ytop=k-j+.5,
           lwd=4)
      text(i,
           k-j+1,
           round(coeff[j,i],2),
           cex=1.2,
           font=2)
    }
    else{
      rect(xleft=i-.5,xright=i+.5,ybottom=k-j+1.5,ytop=k-j+.5)
      text(i,k-j+1,round(coeff[j,i],2),cex=1.2,font=1)
    }
  }
  axis(3,1:n,noms)
  par(mar=c(0.5,0,0,0.5))
  plot(0,axes=F,xlab='',ylab='',type='n',xlim=c(0,8),ylim=c(-.2,.8))
  cols <- c('green','orange','red','black')
  niv  <- c('0','0.01','0.05','0.1')
  for(i in 0:3){
    polygon(c(1+2*i   ,1+2*i   ,1+2*i-.5   ,1+2*i),
            c(.4      ,.7      , .4        , .4),
            col=cols[i+1])
    text(2*i,0.5,niv[i+1],cex=1.5)
  }
  text(8,.5,1,cex=1.5)
  text(4,0,'p-value',cex=2)
  box()
  residus <- arimaSelect.out[[3]][[choix]]$res
  par(mar=c(1,2,4,1))
  acf(residus,main='')
  title('acf',line=.5)
  par(mar=c(1,2,4,1))
  pacf(residus,main='')
  title('pacf',line=.5)
  par(mar=c(2,2,4,1))
  qqnorm(residus,main='')
  title('qq-norm',line=.5)
  qqline(residus)
  residus
}
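# Box-Cox-style transformation: log for lambda = 0, power transform otherwise
if (par2 == 0) x <- log(x) else x <- x^par2
# Backward selection starting from the maximum orders set above
selection <- arimaSelect(x, order=c(par6,par3,par7), seasonal=list(order=c(par8,par4,par9), period=par5), include.mean=par1)
selection[[1]] # print parameter values
selection[[2]] # print p-values
op <- par(no.readonly=TRUE) # save only the settable graphical parameters
resid <- arimaSelectplot(selection)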

par(op)
acf(resid,length(resid)/2, main='Residual Autocorrelation Function')

pacf(resid,length(resid)/2, main='Residual Partial Autocorrelation Function')

cpgram(resid, main='Residual Cumulative Periodogram')

hist(resid, main='Residual Histogram', xlab='values of Residuals')

plot(density(resid),col='black',main='Residual Density Plot', xlab='values of Residuals')

qqnorm(resid, main='Residual Normal Q-Q Plot')
qqline(resid)

Coefficient estimates per selection step (output of selection[[1]]). Columns 1 to 7 correspond to ar1, ar2, ar3, ma1, sar1, sar2, and sma1; a value of 0 marks an eliminated term. Rows 7 to 14 are unused and contain only NA, so they are omitted here.

          [,1]      [,2]        [,3]       [,4]        [,5]        [,6]       [,7]
[1,] 0.2445976 0.1087535 -0.08699510 -0.6399705 -0.07894941 -0.01776128 -0.5091495
[2,] 0.2399690 0.1078988 -0.08810641 -0.6358814 -0.06138126  0.00000000 -0.5257727
[3,] 0.2446140 0.1085345 -0.09603517 -0.6342239  0.00000000  0.00000000 -0.5666181
[4,] 0.0553323 0.0000000 -0.11985784 -0.4380604  0.00000000  0.00000000 -0.5555164
[5,] 0.0000000 0.0000000 -0.12424125 -0.3908358  0.00000000  0.00000000 -0.5525522
[6,] 0.0000000 0.0000000  0.00000000 -0.4018280  0.00000000  0.00000000 -0.5569448

Corresponding p-values (output of selection[[2]]), with the same column order. NA marks a term that has been eliminated. In the final step (row 6) only ma1 and sma1 remain.

        [,1]    [,2]    [,3]    [,4]    [,5]    [,6]    [,7]
[1,] 0.45360 0.47900 0.42343 0.04469 0.71814 0.90545 0.01169
[2,] 0.45495 0.48042 0.41314 0.04303 0.69664      NA 0.00018
[3,] 0.44971 0.47614 0.36383 0.04626      NA      NA 0.00000
[4,] 0.80797      NA 0.18545 0.03580      NA      NA 0.00000
[5,]      NA      NA 0.15402 0.00000      NA      NA 0.00000
[6,]      NA      NA      NA 0.00002      NA      NA 0.00000

Interpretation checklist for the local diagnostics:

ACF/PACF of residuals: no systematic significant spikes should remain.
Cumulative periodogram: residual spectrum should stay close to the reference band (no dominant unexplained frequencies).
Histogram + density: residuals should look roughly symmetric without extreme tail concentration.
QQ plot: points should stay close to the line except for minor tail deviations.

If these checks fail, revisit transformation and differencing first, then re-evaluate ARIMA orders.
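Some of the visual checks can be backed by simple numbers. The sketch below is only a rough aid under stated assumptions: it refits the final model of the placeholder run (MA(1) and SMA(1) on log(AirPassengers)), uses the usual approximate 95% band \(\pm 1.96/\sqrt{n}\) for the residual ACF, and computes moment-based skewness and excess kurtosis by hand.

```r
# Placeholder model (assumption): final model of the AirPassengers run.
y   <- log(AirPassengers)
fit <- arima(y, order = c(0, 1, 1),
             seasonal = list(order = c(0, 1, 1), period = 12),
             method = "ML")
res <- as.numeric(fit$residuals)
n   <- length(res)

# Residual ACF: which lags fall outside the approximate 95% band?
ac     <- acf(res, lag.max = 36, plot = FALSE)
bound  <- qnorm(0.975) / sqrt(n)
spikes <- which(abs(ac$acf[-1]) > bound)  # drop lag 0; indices are lags in months

# Shape of the residual distribution (moment-based, computed by hand).
z       <- (res - mean(res)) / sd(res)
skew    <- mean(z^3)      # near 0 for a symmetric distribution
ex_kurt <- mean(z^4) - 3  # near 0 for normal-like tails

cat("Lags outside the band:", if (length(spikes)) spikes else "none", "\n")
cat("Skewness:", round(skew, 3), " Excess kurtosis:", round(ex_kurt, 3), "\n")
```

With 36 lags, one or two isolated values just outside the band can occur by chance alone. Spikes at low lags or at multiples of the seasonal period are the ones that point to a misspecified model.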

152.2 Example: Births

The Births series illustrates why estimation should be repeated under alternative stationarity settings (different transformation and differencing choices):

Model 1 and Model 2: no stable ARIMA structure retained after simplification.
Model 3: AR(1), MA(1), SAR(1), and SMA(1) are retained.

For Model 3 the complete ARIMA specification is:

\[ (1-\phi_1 B) (1-\Phi_1 B^{12}) \nabla_{12} Y_t = (1 - \theta_1 B) (1 - \Theta_1 B^{12}) e_t \]

The practical interpretation is that both short-run and seasonal dependence remain relevant once the seasonal differencing step has been chosen correctly.
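Expanding the Model 3 equation makes both kinds of dependence explicit. The shorthand \(W_t = \nabla_{12} Y_t = Y_t - Y_{t-12}\) is used here only for illustration. Multiplying out the AR and MA polynomials gives:

\[ W_t = \phi_1 W_{t-1} + \Phi_1 W_{t-12} - \phi_1 \Phi_1 W_{t-13} + e_t - \theta_1 e_{t-1} - \Theta_1 e_{t-12} + \theta_1 \Theta_1 e_{t-13} \]

The terms at lag 1 carry the short-run dependence, the terms at lag 12 carry the seasonal dependence, and the terms at lag 13 arise from the multiplicative interaction between the two.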

152.3 Example: Soldiers

For Soldiers, the selected model retains only an MA(1) component (p=0, q=1, P=0, Q=0), so:

\[ \nabla Y_t = (1 - \theta_1 B) e_t \]

This is consistent with a series where first differencing removes most persistence and only a short-memory shock component remains.
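The short memory of an MA(1) process shows up in its theoretical autocorrelation function, which is zero beyond lag 1. The following sketch uses a hypothetical value \(\theta_1 = 0.4\) (not the estimate for Soldiers). Because R writes the MA part with a plus sign, the coefficient is passed as `-theta`.

```r
theta <- 0.4  # hypothetical value, for illustration only

# Theoretical ACF of the differenced series (lags 0 to 3).
rho <- ARMAacf(ma = -theta, lag.max = 3)
print(round(rho, 4))

# Closed form for the lag-1 autocorrelation of an MA(1) process.
rho1_formula <- -theta / (1 + theta^2)
print(round(rho1_formula, 4))  # about -0.3448
```

After first differencing, a shock therefore affects only the current and the next observation of the differenced series.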

152.4 Example: Traffic

For Traffic, model comparison gives:

Model 1: no stable ARIMA terms retained; the series remains effectively non-stationary under that setup.
Model 2: AR(1) and MA(1) retained (p=1, q=1, P=0, Q=0).

This contrast shows why stationarity induction and transformation choices must be validated before interpreting ARIMA coefficients.