This distribution models the simplest random experiment: exactly one trial with two possible outcomes (success/failure, yes/no, pass/fail).
12.1 Definition
The Bernoulli Distribution is named after the mathematician Jacob Bernoulli (Bernoulli 1713) and describes a binary random variable \(X\) that takes only the values 1 (“success”) and 0 (“failure”):
\[
\text{P}(X=1)=p, \quad \text{P}(X=0)=q=1-p
\]
where \(p\) = probability of success and \(q\) = probability of failure.
Equivalent PMF form:
\[
\text{P}(X=x) = p^x(1-p)^{1-x}, \quad x \in \{0,1\}
\]
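As a quick numerical sketch (the helper `bern_pmf` and the value \(p = 0.3\) are illustrative choices, not part of the text), the closed-form PMF can be evaluated directly in R:

```r
# Closed-form Bernoulli PMF: P(X = x) = p^x * (1 - p)^(1 - x)
bern_pmf <- function(x, p) p^x * (1 - p)^(1 - x)

p <- 0.3                     # illustrative success probability
pmf <- bern_pmf(c(0, 1), p)  # P(X = 0) = 0.7, P(X = 1) = 0.3
print(pmf)
```

The two probabilities necessarily sum to 1.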
12.2 Distribution Function
\[
F(x)=
\begin{cases}
0, & x < 0\\
q, & 0 \le x < 1\\
1, & x \ge 1
\end{cases}
\]
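A minimal sketch of this step function in R (the helper name `bern_cdf` is ours); it agrees with `pbinom(x, size = 1, prob = p)`, which evaluates the binomial CDF at the integer part of \(x\):

```r
# Bernoulli CDF: 0 below 0, then q on [0, 1), then 1 from 1 onward
bern_cdf <- function(x, p) ifelse(x < 0, 0, ifelse(x < 1, 1 - p, 1))

p <- 0.35
x <- c(-0.5, 0, 0.5, 1, 2)
print(bern_cdf(x, p))                 # 0.00 0.65 0.65 1.00 1.00
print(pbinom(x, size = 1, prob = p))  # identical values
```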
12.3 Mean
\[
\text{E}(X) = p
\]
The mean (or “expected”) value of the outcome \(X\) of the Bernoulli experiment equals the probability of obtaining a success. This result can be interpreted intuitively as the average outcome of \(X\) when the Bernoulli experiment is repeated \(N\) times: about \(pN\) of the \(N\) outcomes are 1, so the average of \(X\) is \(\frac{pN}{N} = p\).
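This law-of-large-numbers intuition is easy to check by simulation (the seed and \(p = 0.7\) are arbitrary choices):

```r
set.seed(1)
p <- 0.7
x <- rbinom(1e5, size = 1, prob = p)  # 100,000 Bernoulli draws
mean(x)                               # close to p = 0.7
```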
12.4 Mode
\[
\text{Mo}(X)=
\begin{cases}
0, & \text{if } q > p\\
\{0,1\}, & \text{if } q = p\\
1, & \text{if } q < p
\end{cases}
\]
When \(p=q=0.5\), both 0 and 1 are modes (bimodal tie).
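The case distinction translates directly into a small helper (a sketch; the name `bern_mode` is our own):

```r
# Mode of a Bernoulli(p) variable, returning both values on a tie
bern_mode <- function(p) {
  if (p < 0.5) 0 else if (p > 0.5) 1 else c(0, 1)
}

bern_mode(0.3)  # 0
bern_mode(0.8)  # 1
bern_mode(0.5)  # 0 1  (bimodal tie)
```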
12.5 Median
\[
\text{Med}(X)=
\begin{cases}
0, & \text{if } q > p\\
\text{any } m \in [0,1], & \text{if } q = p\\
1, & \text{if } q < p
\end{cases}
\]
At \(p=q=0.5\), any value in \([0,1]\) is a valid median because both median inequalities are satisfied: \(\text{P}(X \le m)\ge 0.5\) and \(\text{P}(X \ge m)\ge 0.5\).
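R's quantile function `qbinom` (with `size = 1`) agrees with this case distinction; in the tie case it reports the smallest valid median:

```r
qbinom(0.5, size = 1, prob = 0.3)  # 0  (q > p)
qbinom(0.5, size = 1, prob = 0.7)  # 1  (q < p)
qbinom(0.5, size = 1, prob = 0.5)  # 0  (tie: smallest valid median)
```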
12.6 Variance
\[
\text{V}(X) = p q
\]
Intuition: variability is highest when outcomes are most uncertain (\(p=q=0.5\)), and it goes to zero as \(p \to 0\) or \(p \to 1\).
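A short sketch makes the intuition visible: over a grid of \(p\) values, \(pq\) peaks at \(p = 0.5\) and vanishes at the endpoints:

```r
p <- seq(0, 1, by = 0.05)
v <- p * (1 - p)  # Bernoulli variance for each p
p[which.max(v)]   # 0.5
max(v)            # 0.25
```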
12.7 Moment Generating Function
\[
M_X(t)=\text{E}(e^{tX})=q+pe^t
\]
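As a numeric sanity check (a sketch, not from the text): differentiating \(M_X(t)\) at \(t = 0\) should recover the mean \(p\), and a central finite difference confirms this:

```r
M <- function(t, p) (1 - p) + p * exp(t)  # MGF: q + p e^t

p <- 0.4
h <- 1e-6
deriv0 <- (M(h, p) - M(-h, p)) / (2 * h)  # numerical M'(0)
deriv0                                    # approximately p = 0.4
```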
12.8 Coefficient of Skewness
\[
g_1 = \frac{q-p}{\sqrt{pq}}
\]
Interpretation: \(g_1>0\) when \(p<0.5\) (mass concentrated at 0, long right tail toward 1), \(g_1<0\) when \(p>0.5\) (mass concentrated at 1, tail toward 0), and \(g_1=0\) when \(p=0.5\).
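A simulation sketch (the seed and \(p = 0.2\) are arbitrary) compares the formula with the sample skewness of simulated draws:

```r
set.seed(42)
p <- 0.2
x <- rbinom(2e5, size = 1, prob = p)

g1_theory <- (1 - 2 * p) / sqrt(p * (1 - p))        # (q - p) / sqrt(pq) = 1.5
m <- mean(x)
g1_sample <- mean((x - m)^3) / mean((x - m)^2)^1.5  # sample skewness
c(g1_theory, g1_sample)                             # close to each other
```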
12.9 Coefficient of Kurtosis
\[
g_2 = \frac{1-3pq}{pq}
\]
The corresponding excess kurtosis is \(\frac{1-6pq}{pq}\).
Interpretation: kurtosis is smallest at \(p=0.5\) (\(g_2=1\)), increases as outcomes become more imbalanced, and diverges as \(p \to 0\) or \(p \to 1\).
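The same kind of simulation sketch works for kurtosis (seed and \(p = 0.1\) arbitrary); the sample kurtosis of an imbalanced Bernoulli sample is close to the formula value:

```r
set.seed(7)
p <- 0.1
x <- rbinom(5e5, size = 1, prob = p)

g2_theory <- (1 - 3 * p * (1 - p)) / (p * (1 - p))  # (1 - 3pq) / (pq)
m <- mean(x)
g2_sample <- mean((x - m)^4) / mean((x - m)^2)^2    # sample kurtosis
c(g2_theory, g2_sample)                             # close to each other
```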
12.10 Parameter Estimation
For a sample \(x_1,\dots,x_n\) with \(x_i \in \{0,1\}\), the maximum-likelihood estimator is
\[
\hat p = \bar x = \frac{1}{n}\sum_{i=1}^n x_i.
\]
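In R the estimator is simply the sample mean of the 0/1 observations (the seed and true \(p = 0.6\) below are arbitrary illustrative choices):

```r
set.seed(99)
p_true <- 0.6
x <- rbinom(1000, size = 1, prob = p_true)  # simulated 0/1 sample
p_hat <- mean(x)                            # ML estimate of p
p_hat                                       # close to 0.6
```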
12.11 Purpose
The Bernoulli model is the building block for many discrete models:
It is the one-trial version of a success/failure experiment.
Repeating Bernoulli trials leads to the Binomial model (see Chapter 13).
It is used in quality-control pass/fail checks, click/no-click events, and yes/no diagnostic outcomes.
12.12 R Module
You can compute Bernoulli probabilities in R with dbinom using size = 1:
p_demo <- 0.35
cat("PMF values at x = 0 and x = 1:\n")
print(dbinom(c(0, 1), size = 1, prob = p_demo))
cat("\nCDF value P(X <= 0):", pbinom(0, size = 1, prob = p_demo), "\n")
cat("Random Bernoulli draws (n = 10):\n")
set.seed(123)
print(rbinom(10, size = 1, prob = p_demo))
PMF values at x = 0 and x = 1:
[1] 0.65 0.35
CDF value P(X <= 0): 0.65
Random Bernoulli draws (n = 10):
[1] 0 1 0 1 1 0 0 1 0 0
12.13 Example
Suppose a quality-control test marks one product as either “pass” (=1) or “fail” (=0). If the pass probability is \(p = 0.7\), then:
p <- 0.7
probs <- c(`0` = 1 - p, `1` = p)
print(probs)
barplot(probs, col = "steelblue", ylab = "Probability",
        main = "Bernoulli probabilities (p = 0.7)")
0 1
0.3 0.7
Bernoulli, Jacob. 1713. Ars Conjectandi. Basel: Thurnisiorum Fratrum.