10 Law of Large Numbers
10.1 Golden Theorem
Jacob Bernoulli developed the first version of the Law of Large Numbers (the so-called Golden Theorem, or Bernoulli’s Theorem), which was published in Ars Conjectandi (1713). The theorem was later refined by several well-known statisticians and stated in more general terms. Today it is usually presented in two forms: the “Weak” and the “Strong” Law of Large Numbers.
10.2 Weak Law of Large Numbers
Let \(X_1, X_2, X_3, …\) be an infinite sequence of independent and identically distributed (i.i.d.) random variables with a finite expected value, i.e. \(\text{E}(X_1) = \text{E}(X_2) = \text{E}(X_3) = … = \mu \in \mathbb{R}\). Then the average of the first \(n\) variables
\[ \bar{X}_n = \frac{1}{n} \left( X_1 + X_2 + X_3 + … + X_n \right) \]
converges in probability to \(\mu\) as \(n\) goes to infinity. In other words, for any margin of error \(\epsilon > 0\), the probability that \(\bar{X}_n\) deviates from the expected value by more than \(\epsilon\) approaches zero as \(n\) grows:
\[ \lim\limits_{n\rightarrow\infty} \text{P} \left( \left| \bar{X}_n - \mu \right| > \epsilon \right) = 0 \]
where \(\epsilon > 0\) is any positive number, however small.
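As a sketch of why this limit holds, assume in addition that the variables have a finite variance \(\sigma^2\). This extra assumption is made here only for illustration; the Weak Law itself does not require it. Then \(\text{E}(\bar{X}_n) = \mu\) and \(\text{Var}(\bar{X}_n) = \sigma^2 / n\), so Chebyshev’s inequality gives
\[ \text{P} \left( \left| \bar{X}_n - \mu \right| > \epsilon \right) \leq \frac{\sigma^2}{n \epsilon^2} \longrightarrow 0 \quad \text{as } n \rightarrow \infty \]
For example, a fair coin coded as 0 and 1 has \(\sigma^2 = 0.25\), so with \(\epsilon = 0.05\) and \(n = 10000\) the bound is \(0.25 / (10000 \cdot 0.0025) = 0.01\).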
10.2.1 Intuition (coin flips)
If we repeatedly flip a fair coin and code heads as 1 and tails as 0, then \(\mu = 0.5\). The Weak Law says that the sample average (the observed proportion of heads) becomes increasingly likely to lie close to 0.5 as the number of flips grows. The proportion does not have to move closer to 0.5 with every single flip; only the probability of a large deviation shrinks. A practical simulation of this convergence is provided in Tasks 5–8 of the problems chapter (see Section 11.2.1).
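The coin-flip intuition can also be checked with a few lines of code. The following is a minimal sketch, not the solution to the tasks referenced above; it uses only Python's standard library, and the function name `running_proportion_of_heads`, the seed, and the number of flips are illustrative choices.

```python
import random
from itertools import accumulate


def running_proportion_of_heads(n_flips, seed=0):
    """Return the running average of simulated fair-coin flips (heads = 1, tails = 0)."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    flips = [rng.randint(0, 1) for _ in range(n_flips)]
    cumulative_heads = accumulate(flips)  # number of heads after 1, 2, ..., n_flips flips
    return [heads / k for k, heads in enumerate(cumulative_heads, start=1)]


averages = running_proportion_of_heads(100_000)

# Print the observed proportion of heads at a few sample sizes.
for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"n = {n:>6}: proportion of heads = {averages[n - 1]:.4f}")
```

The printed proportions typically fluctuate noticeably for small \(n\) and settle near 0.5 for large \(n\). Changing `seed` gives a different sequence of flips with the same qualitative behaviour.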
10.3 Strong Law of Large Numbers
The strong form of the Law of Large Numbers states that the sample average \(\bar{X}_n\) defined above converges almost surely to \(\mu\), i.e.
\[ \text{P} \left( \lim\limits_{n\rightarrow\infty} \bar{X}_n = \mu \right) = 1 \]
which means that, with probability one, the sequence of averages \(\bar{X}_1, \bar{X}_2, \bar{X}_3, …\) converges to the expected value. It does not say that \(\bar{X}_n\) equals \(\mu\) exactly for any finite \(n\).
The strong form of this law automatically implies the weak form, because almost sure convergence implies convergence in probability. In the i.i.d. setting stated above (with a finite expected value), the Strong Law applies, so almost sure convergence does hold. Cases where convergence holds in probability but not almost surely arise under different assumptions (for example, non-i.i.d. or dependent sequences).
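The Strong Law is a statement about individual sequences of outcomes: along almost every sequence, the average eventually stays close to \(\mu\). A finite simulation cannot prove this, but it can illustrate it. The sketch below follows a few simulated fair-coin paths and reports the largest deviation of the running average from \(\mu = 0.5\) over all sample sizes from a starting point up to a fixed horizon. The function name `tail_max_deviation`, the horizon `N_TOTAL`, and the seeds are hypothetical choices made for this illustration.

```python
import random


def tail_max_deviation(n_start, n_total, mu=0.5, seed=0):
    """Largest |running average - mu| over n_start <= n <= n_total for one coin-flip path."""
    rng = random.Random(seed)  # the seed determines the simulated path
    total = 0
    worst = 0.0
    for n in range(1, n_total + 1):
        total += rng.randint(0, 1)  # heads = 1, tails = 0
        if n >= n_start:
            worst = max(worst, abs(total / n - mu))
    return worst


N_TOTAL = 50_000  # finite horizon standing in for "n goes to infinity"

for seed in range(3):
    row = [tail_max_deviation(n0, N_TOTAL, seed=seed) for n0 in (100, 1_000, 10_000)]
    print(f"path {seed}: " + ", ".join(f"{d:.4f}" for d in row))
```

Each row follows one path. Because the maximum is taken over a shrinking range of sample sizes, the values in a row can only decrease from left to right; the point of the illustration is how small they become on every path, not just on average.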