107 Statistical Test of the Standard Deviation \(\sigma\)
107.1 Theory
107.1.1 Statistical Hypothesis: Testing the Standard Deviation - Population
The population distribution of the random variable \(X\) is written as \(X \sim \text{N} \left( \mu, \sigma^2 \right)\) where \(\mu\) and \(\sigma^2\) represent the mean and variance of the normal distribution. In this representation it is assumed that \(\sigma^2\) is unknown. The parameter \(\mu\) can be either known or unknown.
107.1.2 Statistical Hypothesis: Testing the Standard Deviation - Sample
The statistic for the sample mean is \(\bar{x} = \frac{1}{n} \sum_{i=1}^n x_i\) where \(n\) is the number of observations in the sample. The sample statistic for the variance can be written in terms of \(\mu\) (if this population parameter is known) or in terms of the sample mean \(\bar{x}\):
\[ \begin{aligned} s^2 &= \frac{\sum_{i=1}^{n}\left( x_i - \mu \right)^2}{n} && \text{(if } \mu \text{ is known)} \\ s^2 &= \frac{\sum_{i=1}^{n}\left( x_i - \bar{x} \right)^2}{n-1} && \text{(if } \mu \text{ is unknown)} \end{aligned} \]
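The two variance statistics can be compared numerically. The following is a minimal Python sketch that uses only the standard library. The function names `variance_known_mean` and `variance_unknown_mean`, as well as the data values, are illustrative assumptions and do not come from the text above.

```python
from statistics import fmean


def variance_known_mean(xs, mu):
    """Sample variance when the population mean mu is known (divisor n)."""
    return sum((x - mu) ** 2 for x in xs) / len(xs)


def variance_unknown_mean(xs):
    """Sample variance around the sample mean x_bar (divisor n - 1)."""
    x_bar = fmean(xs)
    return sum((x - x_bar) ** 2 for x in xs) / (len(xs) - 1)


# Hypothetical data, chosen only for illustration.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

print(variance_known_mean(data, mu=5.0))
print(variance_unknown_mean(data))
```

In this example the assumed \(\mu\) happens to equal the sample mean, so the two results differ only through the divisor (\(n\) versus \(n-1\)).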
For large samples, the sample Standard Deviation is approximately normally distributed. The first line below applies when \(\mu\) is known (divisor \(n\)), the second when the sample mean \(\bar{x}\) is used (divisor \(n-1\)):
\[ \begin{aligned} & s \approx \text{N} \left( \sigma, \frac{\sigma^2}{2n} \right) & \text{ and } & \frac{s-\sigma}{\frac{\sigma}{\sqrt{2n}}} \approx \text{N}\left( 0, 1 \right) \\ & s \approx \text{N} \left( \sigma, \frac{\sigma^2}{2(n-1)} \right) & \text{ and } & \frac{s-\sigma}{\frac{\sigma}{\sqrt{2(n-1)}}} \approx \text{N}\left( 0, 1 \right) \end{aligned} \]
For finite samples, exact inference is based on the \(\chi^2\) distribution of \(s^2\) (see the derivation below).
107.1.3 Statistical Hypothesis: Testing the Standard Deviation - Critical Region
| Null Hypothesis | Alternative Hypothesis | Critical Region |
|---|---|---|
| \(\sigma \leq \sigma_0\) | \(\sigma > \sigma_0\) | \(\begin{cases} s \geq \sigma_0 + z_{\alpha} \times \frac{\sigma_0}{\sqrt{2 n}} \\ s \geq \sigma_0 + z_{\alpha} \times \frac{\sigma_0}{\sqrt{2(n-1)}} \end{cases}\) |
| \(\sigma \geq \sigma_0\) | \(\sigma < \sigma_0\) | \(\begin{cases} s \leq \sigma_0 - z_{\alpha} \times \frac{\sigma_0}{\sqrt{2 n}} \\ s \leq \sigma_0 - z_{\alpha} \times \frac{\sigma_0}{\sqrt{2(n-1)}} \end{cases}\) |
| \(\sigma = \sigma_0\) | \(\sigma \neq \sigma_0\) | \(\begin{cases} \begin{cases} s \geq \sigma_0 + z_{\frac{\alpha}{2}} \times \frac{\sigma_0}{\sqrt{2 n}} \\ s \geq \sigma_0 + z_{\frac{\alpha}{2}} \times \frac{\sigma_0}{\sqrt{2(n-1)}} \end{cases} \\ \begin{cases} s \leq \sigma_0 - z_{\frac{\alpha}{2}} \times \frac{\sigma_0}{\sqrt{2 n}} \\ s \leq \sigma_0 - z_{\frac{\alpha}{2}} \times \frac{\sigma_0}{\sqrt{2(n-1)}} \end{cases} \end{cases}\) |
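The table can be turned into a small helper that returns the bounds of the critical region. This is a sketch under the large-sample normal approximation. The function name `critical_values`, the labels `"greater"`, `"less"` and `"two-sided"`, and the flag `mu_known` are hypothetical names introduced here for illustration.

```python
from math import sqrt
from statistics import NormalDist


def critical_values(sigma0, n, alpha, alternative, mu_known=False):
    """Return (lower, upper); reject H0 when s <= lower or s >= upper.

    A bound of None means that side is not part of the critical region.
    """
    # Divisor n applies when mu is known, n - 1 when the sample mean is used.
    divisor = n if mu_known else n - 1
    se = sigma0 / sqrt(2 * divisor)
    if alternative == "two-sided":
        z = NormalDist().inv_cdf(1 - alpha / 2)
        return sigma0 - z * se, sigma0 + z * se
    z = NormalDist().inv_cdf(1 - alpha)
    if alternative == "greater":
        return None, sigma0 + z * se
    if alternative == "less":
        return sigma0 - z * se, None
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")


# Hypothetical inputs, chosen only for illustration.
print(critical_values(2.0, 101, 0.05, "greater"))
print(critical_values(2.0, 101, 0.05, "two-sided"))
```

The one-sided cases use \(z_{\alpha}\) while the two-sided case splits \(\alpha\) over both tails and uses \(z_{\alpha/2}\), mirroring the rows of the table.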
107.1.4 Distribution of Sample Standard Deviation
It can be shown that, for large samples (with \(n > 30\)), the square root of two times a \(\chi^2\)-distributed variable with \(\nu\) degrees of freedom is approximately normally distributed, i.e.
\[ \sqrt{2 \chi_\nu^2} \approx \text{N} \left( \sqrt{2\nu-1}, 1 \right) \]
so that, after subtracting the mean, the result is approximately Standard Normal:
\[ \left( \sqrt{2 \chi_\nu^2} - \sqrt{2\nu-1} \right) \approx \text{N}(0,1) \]
Since
\[ \frac{(n-1) s^2}{\sigma^2} \sim \chi_{n-1}^2 \]
where
\[ s^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( x_i - \bar{x} \right)^2 \]
it follows from this approximation (with \(\nu = n-1\), so that \(2\nu - 1 = 2n - 3\)) that
\[ \sqrt{2 \times \frac{(n-1)s^2}{\sigma^2}} = \frac{s}{\sigma} \sqrt{2(n-1)} \approx \text{N} \left( \sqrt{2n-3}, 1 \right) \]
From this last expression the expected value and the variance of the sample Standard Deviation can be derived as
\[ \begin{aligned}\text{E} \left( \frac{s}{\sigma} \sqrt{2(n-1)} \right) = & \text{E}(s) \times \frac{\sqrt{2(n-1)}}{\sigma} = \sqrt{2n-3} \\& \text{E}(s) = \frac{\sqrt{2n-3}}{\sqrt{2(n-1)}} \times \sigma \\& \text{E}(s) \simeq \sigma\end{aligned} \]
and
\[ \begin{aligned}\text{V} \left( \frac{s}{\sigma} \sqrt{2(n-1)} \right) = &\text{V}(s) \frac{2(n-1)}{\sigma^2} = 1\\&\text{V}(s) = \frac{\sigma^2}{2(n-1)}\end{aligned} \]
The last results can be summarized approximately as
\[ s \approx \text{N}\left( \sigma, \frac{\sigma^2}{2(n-1)} \right) \]
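The quality of this approximation can be checked with a small Monte Carlo experiment. The sketch below is illustrative only: the true \(\sigma = 1\), the sample size \(n = 51\), the number of replications and the seed are all assumed values, and `sample_sd` is a helper name introduced here.

```python
import random
from math import sqrt

random.seed(12345)  # fixed seed so the run is repeatable

sigma = 1.0   # assumed true standard deviation
n = 51        # assumed sample size
reps = 5000   # assumed number of simulated samples


def sample_sd(xs):
    """Sample standard deviation with divisor n - 1."""
    x_bar = sum(xs) / len(xs)
    return sqrt(sum((x - x_bar) ** 2 for x in xs) / (len(xs) - 1))


# Draw many normal samples and record the standard deviation of each one.
sds = [sample_sd([random.gauss(0.0, sigma) for _ in range(n)])
       for _ in range(reps)]

mean_s = sum(sds) / reps
var_s = sum((s - mean_s) ** 2 for s in sds) / (reps - 1)

print("simulated E(s):", mean_s, "approximation:", sigma)
print("simulated V(s):", var_s, "approximation:", sigma ** 2 / (2 * (n - 1)))
```

The simulated mean is expected to lie slightly below \(\sigma\), in line with the factor \(\sqrt{2n-3} / \sqrt{2(n-1)}\) derived above, and the simulated variance should be close to \(\sigma^2 / (2(n-1))\).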
107.2 Examples
107.2.1 Statistical Hypothesis: Testing Standard Deviation -- Example 1: Critical Value (Region)
107.2.1.1 Problem
Find the critical value \(c\) for the sample Standard Deviation if the following information is available:
- Population Variance \(\sigma^2\): unknown
- Population Mean \(\mu\): unknown
- Sample Size \(n = 51\)
- Sample Variance \(s^2 = 0.64\)
- Sample Standard Deviation \(s = 0.80\)
- Null Hypothesis for \(\sigma\): \(\sigma_0 = 0.60\)
- Type I Error \(\alpha = 0.05\)
107.2.1.2 Solution
Since H\(_0: \sigma \leq \sigma_0\) and H\(_A: \sigma > \sigma_0\) we search for the value \(c\) that satisfies P\((s \geq c) = \alpha = 0.05\) when \(\sigma = \sigma_0 = 0.60\).
\[ \begin{aligned}\text{P} \left( \frac{s - \sigma_0}{\frac{\sigma_0}{\sqrt{2(n-1)}}} \geq \frac{c - \sigma_0}{\frac{\sigma_0}{\sqrt{2(n-1)}}} \right) &= 0.05 \\\frac{c- \sigma_0}{\frac{\sigma_0}{\sqrt{2(n-1)}}} &= 1.645 \\c &= \sigma_0 + 1.645 \times \frac{\sigma_0}{\sqrt{2(n-1)}} \\c &= 0.60 + \left( 1.645 \times\frac{0.60}{10} \right) \\c &= 0.60 + 0.0987 = 0.6987\end{aligned} \]
Hence P\((s \geq 0.6987) = 0.05\) under the Null Hypothesis. The sample Standard Deviation (=0.80) is larger than the critical value (=0.6987), so it falls in the critical region and we reject the Null Hypothesis.
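The arithmetic of Example 1 can be reproduced with Python's standard library. This is a minimal sketch: the variable names are chosen here for illustration, and the quantile is computed by `statistics.NormalDist` instead of being read from a table.

```python
from math import sqrt
from statistics import NormalDist

n = 51          # sample size
s = 0.80        # sample standard deviation
sigma0 = 0.60   # value of sigma under the null hypothesis
alpha = 0.05    # type I error

# Upper-tail quantile of the standard normal distribution (about 1.645).
z_alpha = NormalDist().inv_cdf(1 - alpha)

# Critical value from the large-sample approximation with divisor n - 1.
c = sigma0 + z_alpha * sigma0 / sqrt(2 * (n - 1))

print(round(c, 4))
print("reject H0:", s >= c)
```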
107.2.2 Statistical Hypothesis: Testing Standard Deviation -- Example 2: p-value
107.2.2.1 Problem
Find the p-value if the following information is available:
- Population Variance \(\sigma^2\): unknown
- Population Mean \(\mu\): unknown
- Sample Size \(n = 51\)
- Sample Variance \(s^2 = 0.64\)
- Sample Standard Deviation \(s = 0.80\)
- Null Hypothesis for \(\sigma\): \(\sigma_0 = 0.60\)
- Type I Error \(\alpha = 0.05\)
107.2.2.2 Solution
Since H\(_0: \sigma \leq \sigma_0\) and H\(_A: \sigma > \sigma_0\) we can write
\[ \frac{s - \sigma_0}{\frac{\sigma_0}{\sqrt{2(n-1)}}} = \frac{0.80 - 0.60}{\frac{0.60}{\sqrt{2(51-1)}}} = \frac{0.20}{0.60} \times 10 = 3.3333 \]
which leads to P\((z \geq 3.3333) = 0.000429\).
Since the probability 0.000429 is smaller than \(\alpha = 0.05\) we have to reject the Null Hypothesis.
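The p-value of Example 2 can be computed in the same way. The following sketch again relies only on the standard library, and the variable names are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

n = 51          # sample size
s = 0.80        # sample standard deviation
sigma0 = 0.60   # value of sigma under the null hypothesis
alpha = 0.05    # type I error

# Standardized test statistic under the large-sample approximation.
z = (s - sigma0) / (sigma0 / sqrt(2 * (n - 1)))

# One-sided (upper-tail) p-value for H_A: sigma > sigma0.
p_value = 1 - NormalDist().cdf(z)

print(round(z, 4))
print(p_value)
print("reject H0:", p_value < alpha)
```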