Table of contents

  • 15.1 Definition
  • 15.2 Mean
  • 15.3 Variance
  • 15.4 Mode
  • 15.5 Median
  • 15.6 Coefficient of Skewness
  • 15.7 Coefficient of Kurtosis
  • 15.8 Moment Generating Function
  • 15.9 Gamma-Poisson Mixture Link
  • 15.10 Purpose
  • 15.11 R Module
  • 15.12 Business Example: Conversion Pipeline Completion Risk
  • 15.13 Additional Academic Example: Seed Germination Screening

15  Negative Binomial Distribution

15.1 Definition

Let \(X\) be the number of failures before the \(r\)-th success in independent Bernoulli trials with success probability \(p\). Then \(X\) follows a negative binomial distribution:

\[ X \sim \text{NegBin}(r,p), \quad r \in \mathbb{N},\; p \in (0,1),\; X \in \{0,1,2,\dots\} \]

with probability mass function

\[ \text{P}(X = k) = \binom{k+r-1}{k}(1-p)^k p^r, \quad k = 0,1,2,\dots \]

and cumulative distribution function

\[ \text{P}(X \le k) = \sum_{i=0}^{k} \binom{i+r-1}{i}(1-p)^i p^r \]

This chapter uses the same parameterization as R’s dnbinom and pnbinom (failures before \(r\) successes).

Setting \(r=1\) recovers the geometric PMF:

\[ \text{P}(X=k)=\binom{k+1-1}{k}(1-p)^k p^1=(1-p)^k p. \]
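As a quick sanity check, the PMF formula above can be compared directly against R's dnbinom, and the \(r=1\) case against dgeom. The parameter values \(r=3\), \(p=0.4\) are illustrative choices, not taken from the text:

```r
# Compare the chapter's PMF formula with dnbinom
# (failures-before-r-successes parameterization).
r <- 3; p <- 0.4; k <- 0:10
manual <- choose(k + r - 1, k) * (1 - p)^k * p^r
stopifnot(isTRUE(all.equal(manual, dnbinom(k, size = r, prob = p))))

# With r = 1 the negative binomial PMF reduces to the geometric PMF.
stopifnot(isTRUE(all.equal(dnbinom(k, size = 1, prob = p),
                           dgeom(k, prob = p))))
```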

15.2 Mean

\[ \text{E}(X) = \frac{r(1-p)}{p} \]

15.3 Variance

\[ \text{V}(X) = \frac{r(1-p)}{p^2} \]

Since \(0<p<1\), we have \(\frac{1}{p}>1\), so

\[ \text{V}(X)=\frac{\text{E}(X)}{p}>\text{E}(X), \]

which explains why the negative binomial naturally supports overdispersed count data relative to Poisson.
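The relationship \(\text{V}(X)=\text{E}(X)/p\) is easy to verify numerically; the values \(r=5\), \(p=0.3\) below are illustrative:

```r
# V(X) = E(X)/p, hence V(X) > E(X) whenever 0 < p < 1.
r <- 5; p <- 0.3
m <- r * (1 - p) / p     # E(X)
v <- r * (1 - p) / p^2   # V(X)
stopifnot(isTRUE(all.equal(v, m / p)), v > m)
```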

15.4 Mode

\[ \text{Mo}(X)= \begin{cases} 0, & r=1,\\ \left\lfloor\frac{(r-1)(1-p)}{p}\right\rfloor, & r>1. \end{cases} \]

15.5 Median

There is no simple closed-form expression for the median. In applications, it is usually computed numerically via the CDF.
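In R this numerical inversion is done by qnbinom, which returns the smallest \(k\) with \(\text{P}(X \le k) \ge 0.5\). The values \(r=6\), \(p=0.25\) are illustrative:

```r
# Median as the smallest k whose CDF value reaches 0.5.
r <- 6; p <- 0.25
med <- qnbinom(0.5, size = r, prob = p)
stopifnot(pnbinom(med,     size = r, prob = p) >= 0.5,
          pnbinom(med - 1, size = r, prob = p) <  0.5)
```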

15.6 Coefficient of Skewness

\[ g_1 = \frac{2-p}{\sqrt{r(1-p)}} \]

15.7 Coefficient of Kurtosis

\[ g_2 = 3 + \frac{6}{r} + \frac{p^2}{r(1-p)} \]

The corresponding excess kurtosis is \(\frac{6}{r}+\frac{p^2}{r(1-p)}\).

15.8 Moment Generating Function

\[ M_X(t)=\left(\frac{p}{1-(1-p)e^t}\right)^r, \quad t<-\ln(1-p) \]
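The closed form can be checked against the truncated series \(\sum_k e^{tk}\,\text{P}(X=k)\) for a \(t\) inside the convergence region. The values \(r=4\), \(p=0.5\), \(t=0.2\) (with \(-\ln(1-p)\approx 0.693\)) are illustrative:

```r
# Truncated-series evaluation of E[e^{tX}] versus the closed-form MGF.
r <- 4; p <- 0.5; t <- 0.2   # t < -log(1 - p), so the series converges
k <- 0:500                   # tail terms are negligible at this range
series <- sum(exp(t * k) * dnbinom(k, size = r, prob = p))
closed <- (p / (1 - (1 - p) * exp(t)))^r
stopifnot(isTRUE(all.equal(series, closed, tolerance = 1e-8)))
```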

15.9 Gamma-Poisson Mixture Link

The negative binomial can be derived as a Poisson-gamma mixture:

\[ X \mid \Lambda=\lambda \sim \text{Pois}(\lambda), \qquad \Lambda \sim \text{Gamma}\!\left(r,\ \text{rate}=\frac{p}{1-p}\right). \]

Integrating out \(\Lambda\) yields

\[ X \sim \text{NegBin}(r,p). \]

This mechanism explains overdispersion: the latent rate variation (gamma mixing) inflates marginal variance beyond the marginal mean.
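A short simulation illustrates the mixture. With the illustrative values \(r=3\), \(p=0.4\), the marginal counts should have moments close to the NegBin mean \(4.5\) and variance \(11.25\):

```r
# Draw latent gamma rates, then Poisson counts given those rates;
# the marginal distribution of the counts is NegBin(r, p).
set.seed(1)
r <- 3; p <- 0.4; n <- 1e5
lambda <- rgamma(n, shape = r, rate = p / (1 - p))
x <- rpois(n, lambda)
mean(x)   # theoretical E(X) = r(1-p)/p   = 4.5
var(x)    # theoretical V(X) = r(1-p)/p^2 = 11.25
```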

15.10 Purpose

The negative binomial model is useful when events are counted until a target number of successes is reached:

  • Campaign execution: number of failed contacts before achieving a fixed number of conversions.
  • Quality assurance: number of nonconforming units observed before a target number of conforming outcomes.
  • Overdispersed count modeling: compared with Poisson (Chapter 18), the negative binomial can accommodate larger variance relative to the mean.
  • Generalization of geometric: when \(r=1\), the negative binomial reduces to the geometric distribution (Chapter 14).

15.11 R Module

The Negative Binomial Probabilities app is available in the handbook menu:

  • Distributions / Negative Binomial Probabilities

It is also accessible directly at:

  • https://shiny.wessa.net/negativebinomial/

15.12 Business Example: Conversion Pipeline Completion Risk

A sales team needs \(r = 6\) signed contracts to complete a quarterly target tranche. For each qualified lead, the estimated close probability is \(p = 0.25\). Let \(X\) denote the number of failed leads before reaching 6 signed contracts.

Two useful planning quantities are:

\[ \text{P}(X \le 12) \quad \text{and} \quad \text{P}(X \ge 20), \]

the probability of reaching the target with at most 12 failed leads, and the risk of needing 20 or more failed leads.

r <- 6     # signed contracts required
p <- 0.25  # per-lead close probability

# Probability of reaching the target with at most 12 failed leads
cat("P(X <= 12) =", pnbinom(12, size = r, prob = p), "\n")
# Tail risk: 20 or more failed leads (complement of at most 19)
cat("P(X >= 20) =", 1 - pnbinom(19, size = r, prob = p), "\n")
P(X <= 12) = 0.2825492 
P(X >= 20) = 0.3782785 

You can reproduce this setup with the preconfigured Shiny app listed in the R Module section above.

15.13 Additional Academic Example: Seed Germination Screening

In a plant-science pilot, researchers monitor germination attempts until they observe \(r=4\) successful germinations.
If each attempt succeeds with probability \(p=0.35\), let \(X\) be the number of failed attempts before the fourth success.

Two useful planning probabilities are:

\[ \text{P}(X \le 6) \quad \text{and} \quad \text{P}(X \ge 12). \]

r_germ <- 4     # successful germinations required
p_germ <- 0.35  # per-attempt germination probability

# Probability of at most 6 failed attempts before the 4th success
cat("P(X <= 6) =", pnbinom(6, size = r_germ, prob = p_germ), "\n")
# Probability of 12 or more failed attempts (complement of at most 11)
cat("P(X >= 12) =", 1 - pnbinom(11, size = r_germ, prob = p_germ), "\n")
P(X <= 6) = 0.486173 
P(X >= 12) = 0.1726965 

© 2026 Patrick Wessa. Provided as-is, without warranty.

