18 Poisson Distribution

Poisson modeling answers a common applied question: “how many events occur in a fixed time or space window when events arrive randomly at an average rate?” Examples include call arrivals, defects, accidents, and mutation counts.

Formally, a random variable \(X\) taking values in the non-negative integers \(\{0, 1, 2, 3, ...\}\) is said to have a Poisson Distribution with rate parameter \(\lambda > 0\), written \(X \sim \text{Pois}(\lambda)\), if its probability mass function is the one given below.

18.1 Probability Mass Function

\[ \text{P}(X = k) = \frac{\lambda^k e^{-\lambda}}{k!} \]

for \(k = 0, 1, 2, 3, ...\) and \(\lambda > 0\).
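As a quick sanity check, the formula can be evaluated directly and compared with R's built-in `dpois()`. The sketch below uses an arbitrarily chosen rate \(\lambda = 4\); the variable names are illustrative only.

```r
# Evaluate the PMF formula by hand and compare it with dpois().
lambda <- 4          # illustrative rate, chosen arbitrarily
k <- 0:10

pmf_formula <- lambda^k * exp(-lambda) / factorial(k)
pmf_builtin <- dpois(k, lambda = lambda)

# The two columns should agree up to floating-point rounding
print(round(cbind(k, pmf_formula, pmf_builtin), 6))
print(all.equal(pmf_formula, pmf_builtin))

# The probabilities sum to 1 over the support; 0:100 captures
# essentially all of the mass when lambda = 4
print(sum(dpois(0:100, lambda = lambda)))
```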

18.2 Purpose

The Poisson distribution is used to model event counts in fixed intervals when:

Events occur independently of each other
The average rate of occurrence is constant
Two events cannot occur at exactly the same instant

Common applications include:

Number of customers arriving at a service point
Number of defects in a manufactured product
Number of accidents at an intersection
Number of mutations in a DNA sequence
Number of goals scored in a sports match
Number of phone calls received by a call center

The Poisson distribution is also used in Poisson regression (log-linear models) for count data, as an alternative to linear regression when the response variable represents counts. This chapter treats Poisson in more depth because it links directly to multiple approximation results and related-distribution bridges used elsewhere in the handbook.
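To make the regression use case concrete, the following minimal sketch fits a log-linear Poisson model with `glm()` to simulated counts. The data, the predictor `x`, and the coefficients `beta0` and `beta1` are hypothetical and serve only to illustrate the call.

```r
# Simulate counts whose log-mean is linear in x, then fit a Poisson GLM.
set.seed(1)
n <- 500
x <- runif(n, min = 0, max = 2)

beta0 <- 0.5   # hypothetical intercept on the log scale
beta1 <- 0.8   # hypothetical slope on the log scale
y <- rpois(n, lambda = exp(beta0 + beta1 * x))

fit <- glm(y ~ x, family = poisson(link = "log"))

# Estimates should be close to beta0 and beta1, up to sampling error
print(coef(summary(fit)))
```

The log link keeps the fitted mean positive, which a linear regression on raw counts does not guarantee.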

The figure below shows examples of the Poisson Probability Mass Function for different values of \(\lambda\).

par(mfrow = c(2, 2))
x <- 0:20

# Lambda = 1
plot(x, dpois(x, lambda = 1), type = "h", lwd = 2, col = "blue",
     xlab = "k", ylab = "P(X = k)", main = expression(lambda == 1))
points(x, dpois(x, lambda = 1), pch = 19, col = "blue")

# Lambda = 4
plot(x, dpois(x, lambda = 4), type = "h", lwd = 2, col = "blue",
     xlab = "k", ylab = "P(X = k)", main = expression(lambda == 4))
points(x, dpois(x, lambda = 4), pch = 19, col = "blue")

# Lambda = 10
plot(x, dpois(x, lambda = 10), type = "h", lwd = 2, col = "blue",
     xlab = "k", ylab = "P(X = k)", main = expression(lambda == 10))
points(x, dpois(x, lambda = 10), pch = 19, col = "blue")

# Lambda = 15
x <- 0:30
plot(x, dpois(x, lambda = 15), type = "h", lwd = 2, col = "blue",
     xlab = "k", ylab = "P(X = k)", main = expression(lambda == 15))
points(x, dpois(x, lambda = 15), pch = 19, col = "blue")

par(mfrow = c(1, 1))

Figure 18.1: Poisson Probability Mass Function for various values of lambda

18.3 Distribution Function

\[ F(k)=\text{P}(X \leq k)=e^{-\lambda} \sum_{i=0}^{k} \frac{\lambda^i}{i!} \]

The figure below shows the Poisson Distribution Function for \(\lambda = 4\).

x <- 0:15
plot(x, ppois(x, lambda = 4), type = "s", lwd = 2, col = "blue",
     xlab = "k", ylab = "F(k)", main = "Poisson Distribution Function",
     sub = expression(lambda == 4))
points(x, ppois(x, lambda = 4), pch = 19, col = "blue")

Figure 18.2: Poisson Distribution Function (lambda = 4)
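The distribution function is the running sum of the probability mass function, which can be checked numerically. The sketch below again assumes \(\lambda = 4\) and also shows the `lower.tail` argument of `ppois()` for upper-tail probabilities.

```r
# The CDF is the cumulative sum of the PMF.
lambda <- 4
k <- 0:15

cdf_from_pmf <- cumsum(dpois(k, lambda = lambda))
cdf_builtin  <- ppois(k, lambda = lambda)
print(all.equal(cdf_from_pmf, cdf_builtin))

# Upper tail P(X > 6), computed two equivalent ways
upper_direct     <- ppois(6, lambda = lambda, lower.tail = FALSE)
upper_complement <- 1 - ppois(6, lambda = lambda)
print(c(upper_direct, upper_complement))
```

For very small tail probabilities, `lower.tail = FALSE` is usually preferable to subtracting from 1 because it avoids cancellation error.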

18.4 Moment Generating Function

\[ M_X(t) = e^{\lambda(e^t - 1)} \]
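The closed form can be compared with the defining expectation \(\text{E}(e^{tX}) = \sum_k e^{tk}\,\text{P}(X = k)\). The sketch below does this for arbitrarily chosen values \(\lambda = 4\) and \(t = 0.3\), and approximates the first derivative at \(t = 0\) by a central difference, which should recover the mean.

```r
# Numerical check of the moment generating function.
lambda <- 4
t_val  <- 0.3        # arbitrary evaluation point
k <- 0:200           # truncation of the infinite sum; the tail is negligible

mgf_sum    <- sum(exp(t_val * k) * dpois(k, lambda = lambda))
mgf_closed <- exp(lambda * (exp(t_val) - 1))
print(c(mgf_sum, mgf_closed))

# M'(0) equals the first uncentered moment; approximate it numerically
mgf <- function(t) exp(lambda * (exp(t) - 1))
h <- 1e-5
mean_from_mgf <- (mgf(h) - mgf(-h)) / (2 * h)
print(mean_from_mgf)
```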

18.5 1st Uncentered Moment

\[ \mu_1' = \lambda \]

18.6 2nd Uncentered Moment

\[ \mu_2' = \lambda + \lambda^2 \]

18.7 3rd Uncentered Moment

\[ \mu_3' = \lambda + 3\lambda^2 + \lambda^3 \]

18.8 4th Uncentered Moment

\[ \mu_4' = \lambda + 7\lambda^2 + 6\lambda^3 + \lambda^4 \]

18.9 2nd Centered Moment

\[ \mu_2 = \lambda \]

18.10 3rd Centered Moment

\[ \mu_3 = \lambda \]

18.11 4th Centered Moment

\[ \mu_4 = \lambda + 3\lambda^2 \]
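The uncentered and centered moments listed above can be verified by summing over the probability mass function. This is a sketch for one arbitrarily chosen rate, \(\lambda = 2.5\), with the infinite sum truncated at 200.

```r
# Compare the moment formulas with direct summation over the PMF.
lambda <- 2.5
k <- 0:200
p <- dpois(k, lambda = lambda)

# Uncentered moments of order 1 to 4
raw_numeric <- sapply(1:4, function(r) sum(k^r * p))
raw_formula <- c(lambda,
                 lambda + lambda^2,
                 lambda + 3 * lambda^2 + lambda^3,
                 lambda + 7 * lambda^2 + 6 * lambda^3 + lambda^4)
print(rbind(raw_numeric, raw_formula))

# Centered moments of order 2 to 4
central_numeric <- sapply(2:4, function(r) sum((k - lambda)^r * p))
central_formula <- c(lambda, lambda, lambda + 3 * lambda^2)
print(rbind(central_numeric, central_formula))
```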

18.12 Expected Value

\[ \text{E}(X) = \lambda \]

18.13 Variance

\[ \text{V}(X) = \lambda \]

The equality of mean and variance is a defining characteristic of the Poisson distribution. If the sample variance substantially exceeds the sample mean (overdispersion) or is substantially smaller (underdispersion), the Poisson model may not be appropriate.
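One common way to examine this in practice is the index-of-dispersion check, which is not described in this chapter and is shown here only as a sketch. Under a Poisson model the statistic \((n-1)s^2/\bar{x}\) is approximately chi-square distributed with \(n-1\) degrees of freedom. The `counts` vector is hypothetical.

```r
# Index-of-dispersion check on hypothetical count data.
counts <- c(0, 3, 1, 7, 2, 0, 9, 1, 4, 0, 6, 2, 0, 8, 1)
n <- length(counts)

dispersion_ratio <- var(counts) / mean(counts)
dispersion_stat  <- (n - 1) * dispersion_ratio

# One-sided p-values from the chi-square approximation
p_over  <- pchisq(dispersion_stat, df = n - 1, lower.tail = FALSE)
p_under <- pchisq(dispersion_stat, df = n - 1)

cat("Variance/Mean ratio:", dispersion_ratio, "\n")
cat("p-value (overdispersion): ", p_over, "\n")
cat("p-value (underdispersion):", p_under, "\n")
```

A small overdispersion p-value, as for these made-up data, suggests the variance is larger than a Poisson model would imply.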

18.14 Median

There is no closed-form expression for the median of a Poisson distribution. It can be approximated by

\[ \text{Med}(X) \approx \lfloor \lambda + 1/3 - 0.02/\lambda \rfloor \]

where \(\lfloor \cdot \rfloor\) denotes the floor function.
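The approximation can be compared with the exact median returned by `qpois(0.5, lambda)`. The sketch below does this for a handful of arbitrarily chosen rates; agreement for these values does not imply agreement for every \(\lambda\).

```r
# Compare the median approximation with the exact median from qpois().
lambda <- c(0.5, 1, 2.5, 4, 10)

median_approx <- floor(lambda + 1/3 - 0.02 / lambda)
median_exact  <- qpois(0.5, lambda = lambda)

print(data.frame(lambda, median_approx, median_exact))
```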

18.15 Mode

\[ \text{Mo}(X) = \lfloor \lambda \rfloor \]

When \(\lambda\) is a positive integer, both \(\lambda\) and \(\lambda - 1\) are modes.
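A brief numerical illustration: the sketch below locates the most probable value by searching the probability mass function and compares it with \(\lfloor \lambda \rfloor\). The helper `numeric_mode()` and the chosen rates are illustrative and not part of any package.

```r
# Locate the mode by searching the PMF (hypothetical helper function).
numeric_mode <- function(lambda, k_max = 200) {
  k <- 0:k_max
  k[which.max(dpois(k, lambda = lambda))]
}

lambdas <- c(0.3, 2.9, 4.7, 12.2)
print(data.frame(lambda = lambdas,
                 floor_lambda = floor(lambdas),
                 numeric_mode = sapply(lambdas, numeric_mode)))

# For an integer rate, lambda - 1 and lambda are equally probable
print(c(dpois(4, lambda = 5), dpois(5, lambda = 5)))
```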

18.16 Coefficient of Skewness

\[ g_1 = \frac{1}{\sqrt{\lambda}} \]

The distribution is always right-skewed, but the skewness decreases as \(\lambda\) increases.

18.17 Coefficient of Kurtosis

\[ g_2 = 3 + \frac{1}{\lambda} \]

The excess kurtosis is \(1/\lambda\), which approaches zero as \(\lambda\) increases.
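To see how quickly both coefficients approach the Normal values (0 and 3), they can be computed from the probability mass function for several rates. The helper `pmf_shape()` below is a hypothetical name used only for this sketch.

```r
# Skewness and kurtosis computed from the PMF for several rates.
pmf_shape <- function(lambda, k_max = 1000) {
  k  <- 0:k_max
  p  <- dpois(k, lambda = lambda)
  mu <- sum(k * p)
  m2 <- sum((k - mu)^2 * p)
  m3 <- sum((k - mu)^3 * p)
  m4 <- sum((k - mu)^4 * p)
  c(g1 = m3 / m2^1.5, g2 = m4 / m2^2)
}

lambdas <- c(1, 4, 25, 100)
numeric_shape <- t(sapply(lambdas, pmf_shape))

result <- data.frame(lambda     = lambdas,
                     g1_numeric = numeric_shape[, "g1"],
                     g1_formula = 1 / sqrt(lambdas),
                     g2_numeric = numeric_shape[, "g2"],
                     g2_formula = 3 + 1 / lambdas)
print(result)
```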

18.18 Parameter Estimation

The maximum likelihood estimator of \(\lambda\) is the sample mean:

\[ \hat{\lambda} = \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \]

This estimator is unbiased and achieves the Cramér-Rao lower bound (Cramér 1946; Rao 1945).
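The result can be illustrated by maximizing the log-likelihood numerically and comparing the maximizer with the sample mean. The data vector below is hypothetical, and the standard error uses the plug-in form \(\sqrt{\hat{\lambda}/n}\) implied by the Cramér-Rao bound.

```r
# Numerical maximization of the Poisson log-likelihood.
x <- c(3, 1, 4, 2, 0, 5, 2, 3, 1, 2)   # hypothetical counts

loglik <- function(lambda) sum(dpois(x, lambda = lambda, log = TRUE))

opt <- optimize(loglik, interval = c(0.01, 20), maximum = TRUE)
lambda_hat_numeric <- opt$maximum
lambda_hat_closed  <- mean(x)

cat("Numerical MLE:", lambda_hat_numeric, "\n")
cat("Sample mean:  ", lambda_hat_closed, "\n")

# Plug-in standard error of the estimator
se_hat <- sqrt(lambda_hat_closed / length(x))
cat("Standard error:", se_hat, "\n")
```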

18.19 R Module

18.19.1 Public website

The Poisson Distribution is available on the public website:

https://compute.wessa.net/rwasp_poisson.wasp

18.19.2 RFC

The Poisson Distribution module is available in RFC under the menu “Distributions / Poisson Probabilities”.

18.19.3 Direct app link

https://shiny.wessa.net/poisson/

18.19.4 R Code

The following code demonstrates Poisson probability calculations:

# Probability mass function: P(X = k)
dpois(x = 3, lambda = 4)

# Distribution function: P(X <= k)
ppois(q = 3, lambda = 4)

# Quantile function: find the smallest k such that P(X <= k) >= p
qpois(p = 0.5, lambda = 4)

# Generate random Poisson numbers
set.seed(42)
rpois(n = 10, lambda = 4)

[1] 0.1953668
[1] 0.4334701
[1] 4
 [1] 7 7 3 6 5 4 5 2 5 5

To fit a Poisson distribution to observed data:

library(MASS)

# Example: count data
counts <- c(2, 1, 3, 4, 1, 0, 2, 5, 3, 2, 1, 4, 2, 3, 1, 2, 0, 3, 4, 2)

# Maximum likelihood estimation
fit <- fitdistr(counts, "Poisson")
print(fit)

# Compare the ML estimate with the sample mean and check the variance/mean ratio
cat("\nSample mean:", mean(counts), "\n")
cat("Sample variance:", var(counts), "\n")
cat("Variance/Mean ratio:", var(counts)/mean(counts), "\n")

    lambda  
  2.2500000 
 (0.3354102)

Sample mean: 2.25 
Sample variance: 1.881579 
Variance/Mean ratio: 0.8362573

18.20 Example

A call center receives an average of 4.5 calls per minute. Assuming calls arrive according to a Poisson process, we can calculate various probabilities:

lambda <- 4.5

# P(X = 0): probability of no calls in a minute
cat("P(no calls):", dpois(0, lambda), "\n")

# P(X >= 7): probability of 7 or more calls
cat("P(7 or more calls):", 1 - ppois(6, lambda), "\n")

# P(3 <= X <= 6): probability of 3 to 6 calls
cat("P(3 to 6 calls):", ppois(6, lambda) - ppois(2, lambda), "\n")

# Expected number of calls in 5 minutes
cat("Expected calls in 5 minutes:", 5 * lambda, "\n")

P(no calls): 0.011109 
P(7 or more calls): 0.1689494 
P(3 to 6 calls): 0.6574725 
Expected calls in 5 minutes: 22.5

x <- 0:15
probs <- dpois(x, lambda = 4.5)
barplot(probs, names.arg = x, col = "steelblue",
        xlab = "Number of calls", ylab = "Probability",
        main = "Poisson Distribution (lambda = 4.5)")

Figure 18.3: Poisson distribution for call center example (lambda = 4.5)
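Because counts over a longer window of a Poisson process are again Poisson with a proportionally scaled rate, the same functions answer questions about a 5-minute window. The sketch below assumes the rate of 4.5 calls per minute from this example; the 30-call threshold and the 99% service level are hypothetical choices.

```r
# Probabilities for a 5-minute window in the call center example.
lambda_per_min <- 4.5
window_min     <- 5
lambda_window  <- lambda_per_min * window_min   # 22.5 expected calls

# P(30 or more calls in 5 minutes); the threshold is hypothetical
p_30_plus <- ppois(29, lambda = lambda_window, lower.tail = FALSE)
cat("P(30 or more calls in 5 minutes):", p_30_plus, "\n")

# Smallest capacity c with P(X <= c) >= 0.99 (hypothetical service level)
capacity_99 <- qpois(0.99, lambda = lambda_window)
cat("Capacity covering 99% of windows:", capacity_99, "\n")
```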

18.21 Additional Business Example: Cybersecurity Alert Escalation

A security operations center (SOC) receives an average of \(\lambda = 6.2\) high-priority alerts per hour.
Operations policy requires immediate escalation to an incident commander when 10 or more high-priority alerts arrive in one hour.

The escalation probability is \(\text{P}(X \geq 10) = 1 - \text{P}(X \leq 9)\):

lambda_soc <- 6.2
threshold <- 10

p_escalate <- 1 - ppois(threshold - 1, lambda = lambda_soc)
cat("P(X >= 10) =", p_escalate, "\n")

P(X >= 10) = 0.09837934

You can reproduce this scenario with the Poisson app (https://shiny.wessa.net/poisson/).

Interpretation:

This probability quantifies how often the SOC should expect to trigger emergency escalation under current conditions.
If the observed escalation frequency is much higher than this benchmark, it may indicate a changed threat regime (mean rate shift) or alert-quality issues.
For staffing and on-call capacity planning, multiplying the hourly escalation probability by the number of monitored hours per week gives the expected number of escalation hours per week.
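The capacity-planning calculation can be sketched as follows. Round-the-clock monitoring (168 hours per week) and the 8-hour shift length are assumptions made for illustration, and the shift calculation relies on counts in non-overlapping hours being independent.

```r
# Expected escalation hours per week for the SOC example.
lambda_soc <- 6.2
threshold  <- 10

p_escalate <- ppois(threshold - 1, lambda = lambda_soc, lower.tail = FALSE)

hours_per_week <- 24 * 7   # assumption: continuous monitoring
expected_escalation_hours <- p_escalate * hours_per_week
cat("Expected escalation hours per week:", expected_escalation_hours, "\n")

# Probability of at least one escalation hour in a shift
shift_hours <- 8           # assumption: 8-hour shifts
p_shift <- 1 - (1 - p_escalate)^shift_hours
cat("P(at least one escalation in a shift):", p_shift, "\n")
```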

18.22 Random Number Generator

Random numbers from a Poisson distribution can be generated using the rpois function:

set.seed(123)
n <- 1000
lambda <- 5

# Generate random numbers
x <- rpois(n, lambda)

# Compare theoretical and empirical statistics
cat("Theoretical mean:", lambda, "\n")
cat("Sample mean:", mean(x), "\n")
cat("Theoretical variance:", lambda, "\n")
cat("Sample variance:", var(x), "\n")

Theoretical mean: 5 
Sample mean: 4.981 
Theoretical variance: 5 
Sample variance: 4.849488

hist(x, breaks = seq(-0.5, max(x) + 0.5, by = 1), col = "steelblue",
     xlab = "Value", main = "Poisson Random Numbers (n = 1000, lambda = 5)",
     freq = FALSE)

# Overlay theoretical probabilities
points(0:max(x), dpois(0:max(x), lambda), pch = 19, col = "red", cex = 1.2)
legend("topright", legend = "Theoretical", pch = 19, col = "red")

Figure 18.4: Histogram of simulated Poisson random numbers

18.23 Property 1: Additivity of Independent Poisson Variables

The sum of independent Poisson random variables is also Poisson distributed:

\[ \text{If } X_1 \sim \text{Pois}(\lambda_1) \text{ and } X_2 \sim \text{Pois}(\lambda_2) \text{ are independent, then } X_1 + X_2 \sim \text{Pois}(\lambda_1 + \lambda_2) \]
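The additivity property can be checked exactly by convolving two probability mass functions. The rates in this sketch are arbitrary.

```r
# Convolve two Poisson PMFs and compare with the PMF of the summed rate.
lambda1 <- 2
lambda2 <- 3.5
s <- 0:25

# P(X1 + X2 = s) = sum over j of P(X1 = j) * P(X2 = s - j)
conv_pmf <- sapply(s, function(total) {
  j <- 0:total
  sum(dpois(j, lambda = lambda1) * dpois(total - j, lambda = lambda2))
})

sum_pmf <- dpois(s, lambda = lambda1 + lambda2)
print(all.equal(conv_pmf, sum_pmf))
```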

18.24 Property 2: Normal Approximation for Large Rates

For large \(\lambda\), the Poisson distribution can be approximated by a Normal distribution with mean \(\lambda\) and variance \(\lambda\):

\[ \text{Pois}(\lambda) \approx \text{N}(\lambda, \lambda) \quad \text{for large } \lambda \]

A common rule of thumb is that this approximation is adequate when \(\lambda \geq 20\) (Ross 2014; DeGroot and Schervish 2012).
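The quality of the approximation at \(\lambda = 20\) can be inspected by comparing `ppois()` with `pnorm()`. The sketch below also applies a continuity correction (evaluating the Normal distribution function at \(k + 0.5\)), which is a standard refinement but is not part of the statement above.

```r
# Normal approximation to the Poisson distribution function at lambda = 20.
lambda <- 20
k <- 0:60

exact        <- ppois(k, lambda = lambda)
approx_plain <- pnorm(k,       mean = lambda, sd = sqrt(lambda))
approx_cc    <- pnorm(k + 0.5, mean = lambda, sd = sqrt(lambda))  # continuity correction

max_err_plain <- max(abs(exact - approx_plain))
max_err_cc    <- max(abs(exact - approx_cc))

cat("Max abs error without correction:", max_err_plain, "\n")
cat("Max abs error with correction:   ", max_err_cc, "\n")
```

Repeating the comparison for smaller values of `lambda` shows the error growing as the distribution becomes more skewed.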

18.25 Property 3: Poisson Process Counting Property

If events occur independently of each other at a constant rate \(\lambda\) per unit time, the number of events in a fixed time interval of length \(t\) follows a Poisson distribution with mean \(\lambda t\). The underlying arrival mechanism is known as a Poisson process.

18.26 Property 4: Exponential Interarrival Times

In a Poisson process with rate \(\lambda\), the time between consecutive events follows an Exponential distribution with rate parameter \(\lambda\), i.e. with mean \(1/\lambda\).
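The link between exponential interarrival times and Poisson counts can be illustrated by simulation. The sketch below builds arrival times from exponential gaps and counts the arrivals in unit-length intervals. The rate, horizon, and seed are arbitrary, and the results are subject to sampling error.

```r
# Simulate a Poisson process from exponential interarrival times.
set.seed(2024)
rate    <- 3        # events per unit time (hypothetical)
horizon <- 10000    # number of unit-length intervals to observe

# Draw more gaps than needed so the arrivals cover the whole horizon
n_gaps <- ceiling(1.2 * rate * horizon)
gaps <- rexp(n_gaps, rate = rate)
arrival_times <- cumsum(gaps)
stopifnot(max(arrival_times) > horizon)
arrival_times <- arrival_times[arrival_times < horizon]

# Number of arrivals in [0, 1), [1, 2), ..., [horizon - 1, horizon)
counts <- tabulate(floor(arrival_times) + 1, nbins = horizon)

cat("Mean gap:", mean(gaps), " (theory:", 1 / rate, ")\n")
cat("Mean count:", mean(counts), " Variance:", var(counts),
    " (theory:", rate, ")\n")

# Empirical frequencies of counts 0..10 against the Poisson PMF
empirical <- as.numeric(table(factor(counts, levels = 0:10))) / horizon
print(round(rbind(empirical, poisson = dpois(0:10, lambda = rate)), 3))
```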
