One Way Analysis of Variance is a natural extension of the Unpaired Two Sample t-Test and is used when there are more than two independent samples. Each sample can be thought of as corresponding to a different treatment group. Formally, we can write the Hypothesis Test as follows:
\[
H_0: \mu_1 = \mu_2 = \dots = \mu_k \qquad \text{versus} \qquad H_A: \mu_i \neq \mu_j \text{ for at least one pair } i \neq j
\]
where \(k\) is the number of independent samples or treatment groups.
In other words, we test whether there are at least two groups with different means. Imagine that three or more medical treatments are available: one could use the One Way ANOVA test to quickly determine whether there is a treatment effect or not. The test by itself does not tell us which treatment might be beneficial.
If we need detailed information about the effects of individual groups or treatments, it is necessary to compute an Unpaired Two Sample t-Test for each pair of treatments or groups. Since this is a rather tedious and repetitive undertaking, we will use software which does this automatically. In other words, we will (even though this is not part of the ANOVA procedure) also be testing the following Hypothesis Tests:
\[
H_0: \mu_i = \mu_j \qquad \text{versus} \qquad H_A: \mu_i \neq \mu_j
\]
for all combinations \(i,j = 1, 2, …, k\) where \(i \neq j\) and \(k\) is equal to the number of independent samples.
The problem with performing a sequence of statistical Hypothesis Tests, however, is that the probability of making at least one type I error increases with the number of tests. Indeed, each test carries a type I error probability of \(\alpha\), and when we use \(\frac{k^2 -k}{2}\) t-Tests in sequence (this is the number of pairs that can be formed with \(k\) samples) we end up with a cumulative type I error which is much higher than the originally anticipated \(\alpha\).
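To get a feeling for how quickly the cumulative type I error grows, the following sketch computes \(1 - (1 - \alpha)^m\) for \(m = \frac{k^2 - k}{2}\) comparisons. This formula assumes that the tests are independent, which is not exactly true for pairwise comparisons that share groups, so the numbers should be read as an illustration rather than an exact result. The variable names are chosen for this example only.

```r
# Illustration only: assumes the m pairwise tests are independent,
# which does not hold exactly when comparisons share a group.
alpha <- 0.05                 # per-test type I error
k <- 3:6                      # number of groups
m <- choose(k, 2)             # number of pairs, equal to (k^2 - k) / 2
fwer <- 1 - (1 - alpha)^m     # probability of at least one false rejection

print(data.frame(k = k, pairs = m, familywise_error = round(fwer, 4)))
```

Under this assumption, three groups (three pairwise tests) already lead to a cumulative error of roughly 14% instead of 5%.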
For this reason we must take into account the total number of pairwise comparisons (\(\frac{k^2 -k}{2}\)) that are computed. The p-values should be adjusted upward to make sure that the overall type I error reflects the chosen \(\alpha\) level. The method employed in the software that we use (see next section) is called “Tukey’s Honestly Significant Differences Test” (Tukey 1949); it reports two-sided 95% confidence intervals and adjusted p-values for each pair of samples or treatments.
The ANOVA test statistic is
\[
F = \frac{MS_{between}}{MS_{within}} = \frac{SS_{between}/(k-1)}{SS_{within}/(N-k)},
\]
which is compared with an \(F\) distribution with \((k-1, N-k)\) degrees of freedom, where \(N\) is the total number of observations across all groups.
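The test statistic can be computed by hand to see where the numbers come from. The sketch below uses the anorexia data from the MASS package (the dataset used later in this chapter) and assumes the usual definitions of the sums of squares: the between-group sum of squares weights the squared deviations of the group means from the grand mean by the group sizes, and the within-group sum of squares adds up the squared deviations of the observations from their own group mean. The variable names are ours.

```r
library(MASS)                        # provides the anorexia dataset

y <- anorexia$Postwt                 # response: post-therapy weight
g <- anorexia$Treat                  # factor: treatment group
N <- length(y)                       # total number of observations
k <- nlevels(g)                      # number of groups

grand_mean  <- mean(y)
group_means <- tapply(y, g, mean)
group_sizes <- tapply(y, g, length)

# Sums of squares (assumed standard definitions)
ss_between <- sum(group_sizes * (group_means - grand_mean)^2)
ss_within  <- sum((y - ave(y, g))^2)  # ave() returns each observation's group mean

F_stat  <- (ss_between / (k - 1)) / (ss_within / (N - k))
p_value <- pf(F_stat, df1 = k - 1, df2 = N - k, lower.tail = FALSE)

print(c(SS_between = ss_between, SS_within = ss_within, F = F_stat, p = p_value))
```

The result should agree with the Analysis of Variance Table produced by anova() for the same model.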
126.2 Analysis based on p-values and confidence intervals
126.2.1 Software
The One Way ANOVA R Module can be found on the publicly available website:
The same R Module is also available in RFC under the “Hypotheses / Empirical Tests” menu item.
126.2.2 Data & Parameters
This R module contains the following fields:
Data X: a multivariate dataset containing quantitative data
Names of X columns: a space delimited list of names (one name for each column)
Response Variable: the column number (a positive integer) in the multivariate dataset that contains the response/endogenous variable (i.e. the variable we wish to explain or predict)
Factor Variable: the column number (a positive integer) in the multivariate dataset that contains the explanatory variable (i.e. a qualitative variable containing the single-quoted group labels)
Include Intercept Term. This parameter can be set to the following values:
FALSE
TRUE
126.2.3 Output
Consider the problem of measuring the effect of three therapies (“Family Therapy”, “Cognitive Behavior Therapy”, and “Control”) on the post-therapy weight of anorexia patients within an experimental (medical) setting. We do not only wish to determine whether there is a treatment effect or not: if there is a significant effect we also need to know which treatment is best.
The results from the One Way ANOVA analysis are shown below.
The “coefficients” show the effects of each treatment group. The first number (\(\bar{x}_1 = 85.697\) pounds) is the mean of the CBT group (this is used as a “baseline”). The second and third numbers (\(\bar{x}_2 - \bar{x}_1 = -4.589\) and \(\bar{x}_3 - \bar{x}_1 = 4.798\)) show the effects of the Control and FT treatments relative to CBT. Note that the order in which these groups are listed is alphabetical. Hence, it would have been better to name the group of reference (in this case it is the placebo group) in such a way that it precedes the other groups alphabetically (e.g. A instead of Control).
In any case, it seems like Treatment FT has the most beneficial effect because the post-treatment weight is (on average) 4.798 pounds higher than for CBT.
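When working in the R console rather than in the R module, renaming the groups is not the only way to change the baseline. The sketch below uses relevel() from base R to make the Control group the reference level; this is an alternative we suggest here, not something the R module does. It also verifies that the intercept equals the mean of the reference group.

```r
library(MASS)                        # provides the anorexia dataset

# Group means of the post-therapy weight
group_means <- tapply(anorexia$Postwt, anorexia$Treat, mean)

# Default fit: the first level in alphabetical order (CBT) is the baseline
fit_default <- lm(Postwt ~ Treat, data = anorexia)

# Alternative: make the Control group the baseline without renaming it
anorexia_ref <- transform(anorexia, Treat = relevel(Treat, ref = "Cont"))
fit_cont <- lm(Postwt ~ Treat, data = anorexia_ref)

print(round(group_means, 3))
print(coef(fit_default))             # intercept = mean of CBT
print(coef(fit_cont))                # intercept = mean of Cont
```

With Control as the baseline, the remaining coefficients are the differences of CBT and FT with respect to the Control group.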
The Analysis of Variance Table is used to assess the Null and Alternative Hypothesis and is based on the ratio of two Variances (i.e. the “explained” Variance divided by the “unexplained” Variance). Under the Null Hypothesis this ratio follows an F-Distribution, so we use an F-Test (F value \(= 459.49/53.12 = 8.6506\)). The corresponding p-value is \(p \simeq 0.0004443\) which is certainly small enough for most researchers to reject the Null Hypothesis. We conclude that the hypothesis \(\mu_1 = \mu_2 = \mu_3\) must be rejected which means that there is a significant treatment effect.
For reporting, include an ANOVA effect size such as:
\[
\eta^2 = \frac{SS_{between}}{SS_{total}}.
\]
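As a small illustration, \(\eta^2\) can be obtained directly from the sums of squares in the Analysis of Variance Table. The sketch below refits the chapter example with the anorexia data from the MASS package; the object names are ours.

```r
library(MASS)                        # provides the anorexia dataset

tab <- anova(lm(Postwt ~ Treat, data = anorexia))

ss_between <- tab["Treat", "Sum Sq"] # explained sum of squares
ss_total   <- sum(tab[["Sum Sq"]])   # explained + residual sum of squares
eta_sq     <- ss_between / ss_total

print(round(eta_sq, 4))
```

For this example the treatment factor accounts for about 20% of the total variation in post-therapy weight (918.987 / (918.987 + 3665.058)).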
Even though the Null Hypothesis was rejected, we still need to examine the table containing the so-called “Tukey’s Honestly Significant Differences” for each pair of treatments. The difference between Control and CBT, for instance, is -4.588859 which implies that \(\bar{x}_2 < \bar{x}_1\). The corresponding 95% confidence interval is [-9.303772, 0.1260527], which contains zero. We conclude that the difference between Control and CBT is not significantly different from zero. The p-value (\(\simeq 0.0581141\)) is also too high to allow us to reject H\(_0\) at the 5% level. Only the difference between FT and Control is significantly different from zero. This alone does not establish that FT is the best treatment overall; it only shows a significant difference for that pair.
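To make the adjusted p-value less of a black box, the following sketch recomputes it for the Control versus CBT comparison. It assumes the Tukey-Kramer form of the statistic for unequal group sizes, i.e. the absolute difference of the two means divided by \(\sqrt{\frac{MS_{within}}{2}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}\), and evaluates it against the studentized range distribution with ptukey() from base R. The variable names are ours.

```r
library(MASS)                        # provides the anorexia dataset

fit   <- aov(Postwt ~ Treat, data = anorexia)
means <- sapply(split(anorexia$Postwt, anorexia$Treat), mean)
n     <- sapply(split(anorexia$Postwt, anorexia$Treat), length)
mse   <- deviance(fit) / df.residual(fit)   # MS_within

# Control versus CBT (Tukey-Kramer standard error, assumed form)
diff_cc <- means[["Cont"]] - means[["CBT"]]
se      <- sqrt(mse / 2 * (1 / n[["Cont"]] + 1 / n[["CBT"]]))
q_stat  <- abs(diff_cc) / se

# Adjusted p-value from the studentized range distribution
p_adj <- ptukey(q_stat, nmeans = length(means),
                df = df.residual(fit), lower.tail = FALSE)

print(c(diff = diff_cc, q = q_stat, p_adj = p_adj))
```

The value of p_adj should match the “p adj” entry for Cont-CBT reported by TukeyHSD().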
Just as was the case for the Unpaired Two Sample t-Test, the One Way ANOVA test makes the assumption of equal Variances for each group. This can be assessed by the diagnostic Hypothesis Test called “Levene’s Test for Homogeneity of Variance” (Levene 1960) which is shown near the bottom of the output. The results show that the Null Hypothesis (i.e. Homogeneity of Variance) cannot be rejected (\(p \simeq 0.1785\)). Hence, there is no evidence that the equal-variance assumption of the One Way ANOVA test is violated.
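Levene's Test with median centering (the variant named in the output header, “center = median”) can be understood as a One Way ANOVA applied to the absolute deviations of each observation from its group median. The sketch below reproduces this idea with base R only, so that the car package is not required; it is meant as an illustration of the principle and the variable names are ours.

```r
library(MASS)                        # provides the anorexia dataset

y <- anorexia$Postwt
g <- anorexia$Treat

# Absolute deviation of each observation from its group median
abs_dev <- abs(y - ave(y, g, FUN = median))

# One Way ANOVA on the absolute deviations
levene_tab <- anova(lm(abs_dev ~ g))
levene_F   <- levene_tab[["F value"]][1]
levene_p   <- levene_tab[["Pr(>F)"]][1]

print(levene_tab)
```

A large p-value means that the spread around the group medians does not differ detectably between the groups.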
To compute the One Way Analysis of Variance (1-way ANOVA) on your local machine, the following script can be used in the R console.
Note: this script reproduces the chapter example with anorexia data (same dataset as the embedded app).
library(car)
library(MASS)
x <- anorexia
par3 = TRUE # include constant term
xdf <- na.omit(data.frame(Response = x$Postwt, Treatment = as.factor(x$Treat)))
myformula <- Response ~ Treatment
myformulam1 <- Response ~ Treatment - 1
if (par3 == FALSE) (lmxdf <- lm(myformulam1, data = xdf)) else (lmxdf <- lm(myformula, data = xdf))
(aov.xdf <- aov(lmxdf))
(anova.xdf <- anova(lmxdf))
if (par3 == TRUE) {
  thsd <- TukeyHSD(aov.xdf)
  print(thsd)
} else {
  print('Must Include Intercept to use Tukey Test')
}
(lt.lmxdf <- leveneTest(lmxdf))
Call:
lm(formula = myformula, data = xdf)

Coefficients:
  (Intercept)  TreatmentCont    TreatmentFT
       85.697         -4.589          4.798

Call:
   aov(formula = lmxdf)

Terms:
                Treatment Residuals
Sum of Squares    918.987  3665.058
Deg. of Freedom         2        69

Residual standard error: 7.288126
Estimated effects may be unbalanced

Analysis of Variance Table

Response: Response
          Df Sum Sq Mean Sq F value    Pr(>F)
Treatment  2  919.0  459.49  8.6506 0.0004443 ***
Residuals 69 3665.1   53.12
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = lmxdf)

$Treatment
              diff       lwr        upr     p adj
Cont-CBT -4.588859 -9.303772  0.1260527 0.0581141
FT-CBT    4.797566 -0.534965 10.1300969 0.0864030
FT-Cont   9.386425  3.941386 14.8314647 0.0002930

Levene's Test for Homogeneity of Variance (center = median)
      Df F value Pr(>F)
group  2  1.7671 0.1785
      69
126.3 Assumptions
The One Way ANOVA test makes the following assumptions:
The residuals are (approximately) normally distributed.
The samples are independent.
The Variances of the populations are equal.
The responses within each group are independent of one another.
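The first and third assumptions can be explored informally in the R console. The sketch below is one possible approach and is not part of the R module: it applies the Shapiro-Wilk test (shapiro.test() from base R) to the residuals of the fitted model and lists the standard deviation of each group. The choice of the Shapiro-Wilk test is ours; a normal QQ plot of the residuals would serve the same purpose.

```r
library(MASS)                        # provides the anorexia dataset

fit <- lm(Postwt ~ Treat, data = anorexia)
res <- residuals(fit)

# Normality of the residuals (Shapiro-Wilk test, our choice of diagnostic)
sw <- shapiro.test(res)

# Spread within each group (informal check of equal Variances)
group_sd <- tapply(anorexia$Postwt, anorexia$Treat, sd)

print(sw)
print(round(group_sd, 3))
```

Independence, in contrast, cannot be verified from the data alone; it follows from the design of the experiment.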
126.4 Alternatives
In theory it is possible to apply the alternatives of the Unpaired Two Sample t-Test to all pairwise combinations of groups. Of course, one should beware of the problems associated with applying multiple Hypothesis Tests, each of which carries its own type I error.
The Kruskal-Wallis test (Chapter 127) is a non-parametric alternative to the One Way ANOVA. It is based on ranks rather than raw values and does not require the assumptions of normality or equal variances. The R module that is available through the menu “Hypotheses / Multivariate (pair-wise) Testing” (use the Boxplot tab) also features various types of One Way ANOVA methods (including the Kruskal-Wallis approach). An example can be found here (click on violin plot (between variance)).
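For readers who prefer the R console, a minimal sketch of the Kruskal-Wallis test on the chapter example is shown below. It uses kruskal.test() from base R with the anorexia data from the MASS package; the object name is ours.

```r
library(MASS)                        # provides the anorexia dataset

# Rank-based comparison of post-therapy weight across the three treatments
kw <- kruskal.test(Postwt ~ Treat, data = anorexia)
print(kw)
```

The test statistic is compared with a chi-squared distribution with \(k - 1\) degrees of freedom, which is 2 for the three treatment groups.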
Levene, Howard. 1960. “Robust Tests for Equality of Variances.” In Contributions to Probability and Statistics: Essays in Honor of Harold Hotelling, edited by Ingram Olkin, S. G. Ghurye, Wassily Hoeffding, William G. Madow, and Henry B. Mann, 278–92. Stanford, CA: Stanford University Press.
Tukey, John W. 1949. “Comparing Individual Means in the Analysis of Variance.” Biometrics 5 (2): 99–114. https://doi.org/10.2307/3001913.