103 Statistical Test of the Population Mean with known Variance
103.1 Theory
103.1.1 Statistical Hypothesis: Testing the Mean with known Variance - Population
The population distribution of the random variable \(X\) is written as \(X \sim \text{N} \left( \mu, \sigma^2 \right)\) where \(\mu\) and \(\sigma^2\) represent the mean and variance of the normal distribution. In this representation it is assumed that \(\mu\) is unknown and \(\sigma^2\) is known.
By introducing the (unrealistic) assumption of a known variance, the mathematical complexity of the statistical hypothesis test is reduced. This analysis may therefore serve as an initial introduction to hypothesis testing about the mean.
103.1.2 Statistical Hypothesis: Testing the Mean with known Variance - Sample
The statistic for the sample mean is \(\bar{x} = \frac{1}{n} \sum_{i=1}^n x_i\), where \(n\) is the number of observations in the sample. The sampling distribution of the mean can be written in terms of the hypotheses \(\text{H}_0\) and \(\text{H}_A\): under \(\text{H}_0\), \(\bar{x} \sim \text{N}\left(\mu_0, \frac{\sigma^2}{n}\right)\), whereas under \(\text{H}_A\), \(\bar{x} \sim \text{N}\left(\mu_A, \frac{\sigma^2}{n}\right)\).
Observe how the sample mean has a variance which is much smaller than the variance of the original distribution. Under the i.i.d. normality assumption, this is an exact result since \(\mathrm{Var}(\bar{x}) = \sigma^2/n\) (the Central Limit Theorem is not needed here).
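This can be checked numerically. The following Python sketch (the values \(\mu = 5.2\), \(\sigma^2 = 0.36\), \(n = 9\) are illustrative assumptions) simulates many samples and compares the empirical variance of \(\bar{x}\) with the exact value \(\sigma^2/n\):

```python
import random
from statistics import mean, variance

mu, sigma2, n = 5.2, 0.36, 9   # illustrative values (assumptions, not from the text)
sigma = sigma2 ** 0.5

random.seed(1)
# Draw 10,000 samples of size n and record each sample mean
means = [mean(random.gauss(mu, sigma) for _ in range(n)) for _ in range(10_000)]

# Empirical variance of the sample mean vs. the exact value sigma^2 / n
print(variance(means), sigma2 / n)   # both close to 0.04
```

The empirical variance of the 10,000 sample means sits close to \(\sigma^2/n = 0.04\), an order of magnitude below the population variance \(\sigma^2 = 0.36\).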
103.1.3 Statistical Hypothesis: Testing the Mean with known Variance - Critical Region
Table 103.2 shows four situations that may occur when applying statistical hypothesis testing. If the Null Hypothesis H\(_0\) is not rejected by the scientist while it is actually true (first column, first row) then the decision of the hypothesis test is correct. We define the probability of this occurrence as \(p = 1 - \alpha\). If the Null Hypothesis H\(_0\) is rejected (while it is, in fact, true) then the scientist’s conclusion is wrong (first column, second row). The probability of this error is defined as \(p = \alpha\) (also called the “type I error”).
If the Null Hypothesis H\(_0\) is not rejected while the Alternative Hypothesis H\(_A\) is true (second column, first row) then a so-called “type II error” is made. The probability of this occurrence is defined as \(p=\beta\). Finally, if the Null Hypothesis H\(_0\) is rejected (i.e. H\(_A\) is supported) while H\(_A\) is actually true (second column, second row) then the scientist’s conclusion is correct. The probability of this happening is \(p = 1 - \beta\).
Table 103.2: Type I and II Errors

Decision                 | H\(_0\) is true                          | H\(_A\) is true
Fail to Reject H\(_0\)   | Correct (probability \(p = 1 - \alpha\)) | Type II Error (probability \(p = \beta\))
Reject H\(_0\)           | Type I Error (probability \(p = \alpha\)) | Correct (probability \(p = 1 - \beta\))
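The four cells can be illustrated numerically. In the Python sketch below, the values \(\mu_0 = 5.2\), \(\mu_A = 5.4\), \(\sigma^2 = 0.36\), \(n = 97\) and the cutoff \(c = 5.3\) are illustrative assumptions for a right-sided test:

```python
from statistics import NormalDist

mu0, muA = 5.2, 5.4      # null and alternative means (illustrative assumptions)
sigma2, n = 0.36, 97     # known variance and sample size (assumptions)
c = 5.3                  # critical value: reject H0 when xbar > c

se = (sigma2 / n) ** 0.5           # standard error of the sample mean
under_H0 = NormalDist(mu0, se)     # sampling distribution of xbar under H0
under_HA = NormalDist(muA, se)     # sampling distribution of xbar under HA

alpha = 1 - under_H0.cdf(c)   # P(reject H0      | H0 true) -> Type I error
beta  = under_HA.cdf(c)       # P(fail to reject | HA true) -> Type II error
print(alpha, beta)            # both close to 0.05 for these values
```

The remaining two cells are the complements: \(1 - \alpha\) (correctly failing to reject) and \(1 - \beta\) (correctly rejecting, i.e. the power of the test).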
Table 103.3: Integrals of Type I and II Errors

Fail to Reject H\(_0\):
- Under H\(_0: \mu \leq \mu_0\): \(P\left(\bar{x} \leq c \mid H_0: \mu \leq \mu_0\right)=\int_{-\infty}^{c} f(\bar{x})\,d\bar{x}=1-\alpha\), where \(f\) is the density of \(N\left(\mu_0,\frac{\sigma^2}{n}\right)\)
- Under H\(_A: \mu > \mu_0\): \(P\left(\bar{x} \leq c \mid H_A: \mu > \mu_0\right)=\int_{-\infty}^{c} f(\bar{x})\,d\bar{x}=\beta\), where \(f\) is the density of \(N\left(\mu_A,\frac{\sigma^2}{n}\right)\)

Reject H\(_0\):
- Under H\(_0: \mu \leq \mu_0\): \(P\left(\bar{x} > c \mid H_0: \mu \leq \mu_0\right)=\int_{c}^{\infty} f(\bar{x})\,d\bar{x}=\alpha\), where \(f\) is the density of \(N\left(\mu_0,\frac{\sigma^2}{n}\right)\)
- Under H\(_A: \mu > \mu_0\): \(P\left(\bar{x} > c \mid H_A: \mu > \mu_0\right)=\int_{c}^{\infty} f(\bar{x})\,d\bar{x}=1-\beta\), where \(f\) is the density of \(N\left(\mu_A,\frac{\sigma^2}{n}\right)\)
For composite hypotheses such as H\(_0: \mu \leq \mu_0\) versus H\(_A: \mu > \mu_0\), the Type I error rate \(\alpha\) is controlled at the boundary case \(\mu = \mu_0\), whereas the Type II error \(\beta\) is computed for a specific alternative value \(\mu_A\).
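The boundary behaviour can be seen by evaluating the rejection probability \(P(\bar{x} > c \mid \mu)\) at several values of \(\mu\) inside H\(_0\): it increases towards \(\mu_0\), so the worst case \(\alpha\) is attained at the boundary. A Python sketch (the values \(\mu_0 = 5.2\), \(\sigma^2 = 0.36\), \(n = 97\), \(c = 5.3\) are illustrative assumptions):

```python
from statistics import NormalDist

mu0, sigma2, n, c = 5.2, 0.36, 97, 5.3   # illustrative assumptions
se = (sigma2 / n) ** 0.5                 # standard error of the sample mean

# Rejection probability P(xbar > c | mu) for several mu inside H0 (mu <= mu0)
mus = (5.0, 5.1, 5.2)
probs = [1 - NormalDist(mu, se).cdf(c) for mu in mus]
for mu, p in zip(mus, probs):
    print(mu, p)   # increases with mu; largest at the boundary mu = mu0
```

For \(\mu\) strictly below \(\mu_0\) the Type I error probability is smaller than \(\alpha\); controlling the boundary case therefore controls the whole composite null.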
To compute this hypothesis test on your local machine, the following script can be run in the R console:

    # Input parameters
    par1 <- 0.36  # population variance (sigma^2)
    par2 <- 5.2   # mean under the null hypothesis (mu_0)
    par3 <- 5.4   # mean under the alternative hypothesis (mu_A)
    par4 <- 0.05  # Type I error (alpha)
    par5 <- 0.05  # Type II error (beta)

    if (par2 == par3) {
      stop('The null hypothesis and alternative hypothesis must not be equal.')
    }

    ua <- abs(qnorm(par4))  # z-value for alpha
    ub <- qnorm(par5)       # z-value for beta (negative)

    # Critical value c, where the boundaries implied by alpha and beta coincide
    c <- (par2 + ua / ub * (-par3)) / (1 - ua / ub)

    # Required sample size: sqrt(n) = z_alpha * sigma / (c - mu_0)
    sqrtn <- ua * sqrt(par1) / (c - par2)
    samplesize <- sqrtn * sqrtn

    print(ua)          # 1.644854
    print(ub)          # -1.644854
    print(c)           # 5.3
    print(sqrtn)       # 9.869122
    print(samplesize)  # about 97.4, so round up to n = 98
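The same quantities follow from the closed-form expressions \(n = \left((z_\alpha + z_\beta)\,\sigma/(\mu_A - \mu_0)\right)^2\) and \(c = \mu_0 + z_\alpha\,\sigma/\sqrt{n}\). A Python sketch (standard library only) that cross-checks the script's output:

```python
from statistics import NormalDist

sigma2 = 0.36          # known population variance
mu0, muA = 5.2, 5.4    # null and alternative means
alpha = beta = 0.05    # target Type I and II error rates

z = NormalDist().inv_cdf
za, zb = z(1 - alpha), z(1 - beta)   # both 1.6449 here since alpha == beta

sigma = sigma2 ** 0.5
n = ((za + zb) * sigma / (muA - mu0)) ** 2   # required sample size
c = mu0 + za * sigma / n ** 0.5              # critical value

print(c, n)   # c = 5.3, n about 97.4 (round up to 98)
```

Because \(\alpha = \beta\) in this example, the critical value \(c\) lands exactly halfway between \(\mu_0\) and \(\mu_A\).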
The two-sided 95% confidence interval for \(\mu\) is \([4.958, 5.742]\). The right one-sided 95% confidence interval is \((-\infty, 5.679]\), whereas the left one-sided interval is \([5.021, \infty)\). In repeated sampling, these procedures achieve 95% coverage.
The two-sided acceptance region can be written as P\(\left( 4.808 \leq \bar{x} \leq 5.592 \right) = 0.95\). The right one-sided acceptance region is P\(\left( \bar{x} \leq 5.529 \right) = 0.95\), whereas the left one-sided acceptance region can be written as P\(\left( \bar{x} \geq 4.871 \right) = 0.95\).
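These bounds are consistent with a standard error of \(\sigma/\sqrt{n} = 0.2\) (for example \(\sigma^2 = 0.36\) with \(n = 9\), an assumed combination used here purely for illustration). A Python sketch reproducing them:

```python
from statistics import NormalDist

mu0 = 5.2   # null hypothesis mean
se = 0.2    # sigma / sqrt(n); assumed value consistent with the quoted bounds

z = NormalDist().inv_cdf
two_sided = (mu0 + z(0.025) * se, mu0 + z(0.975) * se)  # central 95% region
right_cut = mu0 + z(0.95) * se   # reject H0: mu <= mu0 when xbar exceeds this
left_cut  = mu0 + z(0.05) * se   # reject H0: mu >= mu0 when xbar falls below this

print([round(v, 3) for v in two_sided], round(right_cut, 3), round(left_cut, 3))
```

Each bound matches the acceptance regions quoted above to three decimal places.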