28 Lognormal Distribution

The Lognormal distribution is a natural model for positive-valued quantities that arise through multiplicative growth or accumulation. Where the Normal distribution describes phenomena built up by addition, the Lognormal describes phenomena built up by multiplication: income, stock prices, environmental concentrations, and biological sizes are often well approximated by it.

Formally, a random variate \(X\) defined on the range \(X > 0\) is said to have a Lognormal Distribution, written \(X \sim \text{LnN}(\mu, \sigma^2)\), if \(\ln(X)\) follows a Normal Distribution with mean \(\mu \in \mathbb{R}\) and variance \(\sigma^2 > 0\).

Parameterization note. The parameters \(\mu\) and \(\sigma\) are the mean and standard deviation of \(\ln(X)\), not of \(X\) itself. The mean and variance of \(X\) are more complex functions of \(\mu\) and \(\sigma\) (see the Expected Value and Variance sections below). In R, these parameters are named meanlog (\(= \mu\)) and sdlog (\(= \sigma\)). For large values of \(\sigma\), the density peak concentrates very close to zero and the distribution becomes heavily right-skewed.

28.1 Probability Density Function

\[ f(x) = \frac{1}{x \sigma \sqrt{2\pi}} \exp\!\left(-\frac{(\ln x - \mu)^2}{2\sigma^2}\right), \quad x > 0 \]
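As a quick sanity check, the density formula can be evaluated by hand and compared with R's built-in dlnorm(). The sketch below uses arbitrary illustrative values (\(\mu = 0.3\), \(\sigma = 0.8\)); the helper name f_manual is introduced here only for illustration and is not part of R.

```r
# Illustrative parameters (assumed values, not taken from the text)
mu    <- 0.3
sigma <- 0.8
x     <- c(0.1, 0.5, 1, 2, 5)

# Density evaluated directly from the formula above (hypothetical helper)
f_manual <- function(x, mu, sigma) {
  exp(-(log(x) - mu)^2 / (2 * sigma^2)) / (x * sigma * sqrt(2 * pi))
}

# Change-of-variables view: Normal density of ln(x), divided by x
pdf_via_normal <- dnorm(log(x), mean = mu, sd = sigma) / x

pdf_manual  <- f_manual(x, mu, sigma)
pdf_builtin <- dlnorm(x, meanlog = mu, sdlog = sigma)

print(cbind(x, pdf_manual, pdf_via_normal, pdf_builtin))
```

The three columns should agree up to floating-point error. The factor \(1/x\) is the Jacobian of the transformation \(y = \ln x\).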

The figure below shows examples of the Lognormal Probability Density Function for different parameter combinations.

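# Arrange the four panels in a 2 x 2 grid
par(mfrow = c(2, 2))
# Start the grid just above 0, where the density is defined
x <- seq(0.001, 8, length.out = 1000)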

plot(x, dlnorm(x, meanlog = 0, sdlog = 0.25), type = "l", lwd = 2, col = "blue",
     xlab = "x", ylab = "f(x)", main = expression(paste(mu == 0, ",  ", sigma == 0.25)))

plot(x, dlnorm(x, meanlog = 0, sdlog = 0.5), type = "l", lwd = 2, col = "blue",
     xlab = "x", ylab = "f(x)", main = expression(paste(mu == 0, ",  ", sigma == 0.5)))

plot(x, dlnorm(x, meanlog = 0, sdlog = 1), type = "l", lwd = 2, col = "blue",
     xlab = "x", ylab = "f(x)", main = expression(paste(mu == 0, ",  ", sigma == 1)))

plot(x, dlnorm(x, meanlog = 1, sdlog = 0.5), type = "l", lwd = 2, col = "blue",
     xlab = "x", ylab = "f(x)", main = expression(paste(mu == 1, ",  ", sigma == 0.5)))

par(mfrow = c(1, 1))

Figure 28.1: Lognormal Probability Density Function for various parameter combinations

28.2 Purpose

The Lognormal distribution models positive quantities that arise as the product of many small independent factors. By the multiplicative analogue of the Central Limit Theorem, the logarithm of such a product is approximately Normal, so the product itself is approximately Lognormal. Common applications include:

Household income and individual wealth distributions
Stock prices and financial asset returns (multiplicative period-by-period changes)
Environmental concentrations: pollutant levels, water quality measurements
Biological sizes, growth rates, and incubation times
Latency and response times in computing and telecommunications systems
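The multiplicative mechanism described above can be illustrated by simulation. The sketch below multiplies 200 independent factors drawn uniformly from [0.95, 1.07] (an arbitrary choice made only for illustration) and compares the skewness of the product with the skewness of its logarithm; skew is a small helper defined here, not a base R function.

```r
set.seed(1)
n_sim     <- 5000   # number of simulated products
n_factors <- 200    # independent factors per product

# Each row holds the positive factors of one product
factors  <- matrix(runif(n_sim * n_factors, min = 0.95, max = 1.07),
                   nrow = n_sim)
log_prod <- rowSums(log(factors))   # log of the product = sum of the logs
product  <- exp(log_prod)

# Sample skewness (third standardized moment); hypothetical helper
skew <- function(v) mean((v - mean(v))^3) / sd(v)^3

cat("Skewness of log(product):", round(skew(log_prod), 3), "\n")
cat("Skewness of product:     ", round(skew(product), 3), "\n")
```

The logarithm of the product should be close to symmetric, as expected for an approximately Normal sum, while the product itself should be clearly right-skewed.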

Relation to the discrete setting. There is no standard discrete counterpart of the Lognormal distribution. Its defining feature — the logarithm of the variable is Normal — has no natural discrete analog. The closest conceptual relatives are the Geometric and Negative Binomial distributions, which arise from multiplicative processes in discrete state spaces, and the Logarithmic series distribution, which has a log-linear tail. The Lognormal stands apart from the other continuous distributions in this chapter because it arises from multiplicative rather than additive accumulation of independent effects.

28.3 Distribution Function

\[ F(x) = \Phi\!\left(\frac{\ln x - \mu}{\sigma}\right), \quad x > 0 \]

where \(\Phi(\cdot)\) denotes the standard Normal distribution function (see Chapter 20).

The figure below shows the Lognormal Distribution Function for \(\mu = 0\) and \(\sigma = 0.5\).

# Evaluation grid (plnorm returns 0 at x = 0)
x <- seq(0, 6, length.out = 500)
plot(x, plnorm(x, meanlog = 0, sdlog = 0.5), type = "l", lwd = 2, col = "blue",
     xlab = "x", ylab = "F(x)", main = "Lognormal Distribution Function",
     sub = expression(paste(mu == 0, ",  ", sigma == 0.5)))

Figure 28.2: Lognormal Distribution Function (mu = 0, sigma = 0.5)
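The distribution function formula reduces every Lognormal probability to a standard Normal one. The following sketch checks this numerically for a few illustrative points, and also inverts the formula to obtain quantiles, \(x_p = e^{\mu + \sigma z_p}\), where \(z_p\) is the standard Normal quantile.

```r
mu    <- 0      # illustrative parameter values
sigma <- 0.5
x     <- c(0.25, 0.5, 1, 2, 4)

# F(x) via the built-in function and via Phi((ln x - mu) / sigma)
cdf_builtin <- plnorm(x, meanlog = mu, sdlog = sigma)
cdf_via_phi <- pnorm((log(x) - mu) / sigma)
print(cbind(x, cdf_builtin, cdf_via_phi))

# Quantiles via the built-in function and via exp(mu + sigma * z_p)
p         <- c(0.1, 0.5, 0.9)
q_builtin <- qlnorm(p, meanlog = mu, sdlog = sigma)
q_via_phi <- exp(mu + sigma * qnorm(p))
print(cbind(p, q_builtin, q_via_phi))
```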

28.4 Moment Generating Function

The moment generating function \(M_X(t) = \text{E}(e^{tX})\) does not exist for any \(t > 0\). Because \(\text{E}(X^n) = \exp(n\mu + n^2\sigma^2/2)\) grows super-exponentially in \(n\), the Taylor series \(\sum_{n=0}^{\infty} t^n \text{E}(X^n)/n!\) diverges for all \(t > 0\). All moments of the Lognormal distribution are nevertheless finite.
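The divergence can be made concrete by looking at the individual series terms \(t^n \text{E}(X^n)/n!\) on the log scale. The sketch below uses illustrative values \(\mu = 0\), \(\sigma = 1\) and a small \(t = 0.01\), all chosen only for demonstration.

```r
mu    <- 0
sigma <- 1
t_val <- 0.01                      # small positive t (illustrative)
n     <- c(1, 5, 10, 20, 40, 80)

# log of the n-th term: n*log(t) + (n*mu + n^2*sigma^2/2) - log(n!)
log_term <- n * log(t_val) + n * mu + n^2 * sigma^2 / 2 - lgamma(n + 1)

print(data.frame(n = n, log_term = round(log_term, 1)))
```

The terms shrink at first but then grow without bound: the quadratic exponent \(n^2\sigma^2/2\) eventually dominates both \(n \ln t\) and \(\ln n!\), however small \(t > 0\) is chosen.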

28.5 1st Uncentered Moment

\[ \mu_1' = e^{\mu + \sigma^2/2} \]

28.6 2nd Uncentered Moment

\[ \mu_2' = e^{2\mu + 2\sigma^2} \]

28.7 3rd Uncentered Moment

\[ \mu_3' = e^{3\mu + 9\sigma^2/2} \]

28.8 4th Uncentered Moment

\[ \mu_4' = e^{4\mu + 8\sigma^2} \]

The general formula is \(\mu_n' = \text{E}(X^n) = \exp(n\mu + n^2\sigma^2/2)\).
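The general formula can be checked by numerical integration. The sketch below works on the log scale, using \(\text{E}(X^n) = \text{E}(e^{nY})\) with \(Y \sim \text{N}(\mu, \sigma^2)\); the parameter values and the helper names raw_moment and raw_moment_num are illustrative assumptions.

```r
mu    <- 0      # illustrative parameter values
sigma <- 0.5

# Closed form: E(X^n) = exp(n*mu + n^2*sigma^2/2)
raw_moment <- function(n, mu, sigma) exp(n * mu + n^2 * sigma^2 / 2)

# Numerical check: integrate exp(n*y) against the Normal density of Y = ln(X)
raw_moment_num <- function(n, mu, sigma) {
  integrate(function(y) exp(n * y) * dnorm(y, mean = mu, sd = sigma),
            lower = -Inf, upper = Inf)$value
}

orders      <- 1:4
closed_form <- sapply(orders, raw_moment, mu = mu, sigma = sigma)
numerical   <- sapply(orders, raw_moment_num, mu = mu, sigma = sigma)
print(cbind(order = orders, closed_form, numerical))
```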

28.9 2nd Centered Moment

\[ \mu_2 = \left(e^{\sigma^2} - 1\right) e^{2\mu + \sigma^2} \]

28.10 3rd Centered Moment

\[ \mu_3 = \left(e^{\sigma^2} - 1\right)^2 \left(e^{\sigma^2} + 2\right) e^{3\mu + 3\sigma^2/2} \]

28.11 4th Centered Moment

\[ \mu_4 = e^{4\mu + 2\sigma^2}\!\left(e^{6\sigma^2} - 4e^{3\sigma^2} + 6e^{\sigma^2} - 3\right) \]
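The centered moments follow from the uncentered ones by expanding \(\text{E}[(X - \mu_1')^k]\). The sketch below performs this expansion numerically and compares the results with the closed forms in the three sections above, using illustrative parameter values.

```r
mu    <- 0.2    # illustrative parameter values
sigma <- 0.6
w     <- exp(sigma^2)

# Uncentered moments m[k] = E(X^k), k = 1..4
m <- exp((1:4) * mu + (1:4)^2 * sigma^2 / 2)

# Centered moments from the binomial expansion of E[(X - m1)^k]
c2 <- m[2] - m[1]^2
c3 <- m[3] - 3 * m[1] * m[2] + 2 * m[1]^3
c4 <- m[4] - 4 * m[1] * m[3] + 6 * m[1]^2 * m[2] - 3 * m[1]^4

# Closed forms for the 2nd, 3rd and 4th centered moments
c2_closed <- (w - 1) * exp(2 * mu + sigma^2)
c3_closed <- (w - 1)^2 * (w + 2) * exp(3 * mu + 3 * sigma^2 / 2)
c4_closed <- exp(4 * mu + 2 * sigma^2) * (w^6 - 4 * w^3 + 6 * w - 3)

print(rbind(from_raw = c(c2, c3, c4),
            closed   = c(c2_closed, c3_closed, c4_closed)))
```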

28.12 Expected Value

\[ \text{E}(X) = e^{\mu + \sigma^2/2} \]

28.13 Variance

\[ \text{V}(X) = \left(e^{\sigma^2} - 1\right) e^{2\mu + \sigma^2} \]
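In practice one often knows the mean and variance of \(X\) and needs the log-scale parameters. Solving the Expected Value and Variance formulas for \(\mu\) and \(\sigma\) gives \(\sigma^2 = \ln(1 + \text{V}(X)/\text{E}(X)^2)\) and \(\mu = \ln \text{E}(X) - \sigma^2/2\). A minimal sketch of this conversion follows; the helper lnorm_from_moments is hypothetical and not part of base R.

```r
# Hypothetical helper: convert the mean and variance of X into the
# log-scale parameters meanlog (= mu) and sdlog (= sigma)
lnorm_from_moments <- function(mean_x, var_x) {
  sigma2 <- log(1 + var_x / mean_x^2)
  list(meanlog = log(mean_x) - sigma2 / 2, sdlog = sqrt(sigma2))
}

# Round trip with illustrative parameters
mu     <- 10
sigma  <- 0.5
mean_x <- exp(mu + sigma^2 / 2)
var_x  <- (exp(sigma^2) - 1) * exp(2 * mu + sigma^2)

pars <- lnorm_from_moments(mean_x, var_x)
print(pars)
```

The round trip should recover the original meanlog and sdlog up to floating-point error.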

28.14 Median

\[ \text{Med}(X) = e^{\mu} \]

The median depends only on \(\mu\), not on \(\sigma\).

28.15 Mode

\[ \text{Mo}(X) = e^{\mu - \sigma^2} \]

The mode is always strictly less than the median, which is always strictly less than the mean, reflecting the right-skewed shape of the distribution.
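The ordering of mode, median, and mean can be verified numerically. The sketch below uses illustrative parameters and cross-checks the closed forms by maximising the density with optimize() and by taking the 50% quantile with qlnorm(); the search interval is an assumption chosen wide enough to contain the mode.

```r
mu    <- 1      # illustrative parameter values
sigma <- 0.75

mode_closed   <- exp(mu - sigma^2)
median_closed <- exp(mu)
mean_closed   <- exp(mu + sigma^2 / 2)

# Numerical cross-checks: maximise the density, take the 50% quantile
mode_num   <- optimize(dlnorm, interval = c(1e-6, 20),
                       meanlog = mu, sdlog = sigma, maximum = TRUE)$maximum
median_num <- qlnorm(0.5, meanlog = mu, sdlog = sigma)

cat("Mode:  ", round(mode_closed, 4), "(numerical:", round(mode_num, 4), ")\n")
cat("Median:", round(median_closed, 4), "(numerical:", round(median_num, 4), ")\n")
cat("Mean:  ", round(mean_closed, 4), "\n")
```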

28.16 Coefficient of Skewness

\[ g_1 = \left(e^{\sigma^2} + 2\right)\sqrt{e^{\sigma^2} - 1} \]

The Lognormal distribution is always positively skewed. As \(\sigma \to 0\), the skewness approaches 0 (the distribution becomes approximately symmetric around \(e^\mu\)).

28.17 Coefficient of Kurtosis

\[ g_2 = e^{4\sigma^2} + 2e^{3\sigma^2} + 3e^{2\sigma^2} - 3 \]

The Lognormal distribution always has Pearson kurtosis greater than 3 (\(g_2 > 3\) for all \(\sigma > 0\)), indicating heavier tails than the Normal distribution.
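Both shape coefficients depend on \(\sigma\) only. The sketch below tabulates them for a few illustrative values of \(\sigma\); the helper names lnorm_skewness and lnorm_kurtosis are introduced here for illustration and are not R built-ins.

```r
# Skewness and Pearson kurtosis as functions of sigma (mu plays no role)
lnorm_skewness <- function(sigma) {
  w <- exp(sigma^2)
  (w + 2) * sqrt(w - 1)
}
lnorm_kurtosis <- function(sigma) {
  w <- exp(sigma^2)
  w^4 + 2 * w^3 + 3 * w^2 - 3
}

sigma <- c(0.01, 0.1, 0.25, 0.5, 1)
shape <- data.frame(sigma    = sigma,
                    skewness = lnorm_skewness(sigma),
                    kurtosis = lnorm_kurtosis(sigma))
print(round(shape, 4))
```

For small \(\sigma\) the coefficients are close to the Normal values 0 and 3, and both increase rapidly as \(\sigma\) grows.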

28.18 Parameter Estimation

Taking logarithms transforms the problem to Normal parameter estimation. The maximum likelihood estimators are:

\[ \hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} \ln(x_i) \]

\[ \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n} \left(\ln(x_i) - \hat{\mu}\right)^2 \]

Note that \(\hat{\sigma}^2\) is the biased MLE; the unbiased estimator uses denominator \(n-1\).

28.19 R Module

28.19.1 RFC

The Lognormal Distribution module is available in RFC under the menu “Distributions / Lognormal Distribution”.

28.19.2 Direct app link

https://shiny.wessa.net/lognormal/

28.19.3 R Code

The following code demonstrates Lognormal probability calculations:

# Probability density function: f(x)
dlnorm(x = 50000, meanlog = 10, sdlog = 0.5)

# Distribution function: P(X <= x)
plnorm(q = 50000, meanlog = 10, sdlog = 0.5)

# Quantile function: find x such that P(X <= x) = p
qlnorm(p = 0.5, meanlog = 10, sdlog = 0.5)

# Generate random Lognormal numbers
set.seed(42)
rlnorm(n = 10, meanlog = 10, sdlog = 0.5)

[1] 4.161469e-06
[1] 0.9494513
[1] 22026.47
 [1] 43716.43 16608.18 26411.75 30225.20 26960.66 20888.16 46899.44 21008.25
 [9] 60428.24 21346.50

To fit a Lognormal distribution to observed data:

library(MASS)

# Example: annual income data (euros)
set.seed(7)
incomes <- rlnorm(50, meanlog = 10, sdlog = 0.5)

# Maximum likelihood estimation
fit <- fitdistr(incomes, "lognormal")
print(fit)

# Compare with direct log-transform approach
cat("\nDirect MLE: meanlog =", mean(log(incomes)),
    "  sdlog =", sqrt(mean((log(incomes) - mean(log(incomes)))^2)), "\n")

     meanlog        sdlog   
  10.11935540    0.49955185 
 ( 0.07064730) ( 0.04995519)

Direct MLE: meanlog = 10.11936   sdlog = 0.4995519

28.20 Example

Annual household incomes in a region are modelled as Lognormal with \(\mu = 10\) (meanlog) and \(\sigma = 0.5\) (sdlog). The median income is \(e^{10} \approx 22{,}026\) EUR and the mean income is \(e^{10 + 0.5^2/2} = e^{10.125} \approx 24{,}959\) EUR. We compute several policy-relevant quantities:

mu_log <- 10
sd_log <- 0.5

# Median income
cat("Median income (EUR):", round(exp(mu_log)), "\n")

# Mean income
cat("Mean income (EUR):", round(exp(mu_log + sd_log^2 / 2)), "\n")

# P(income < 15000): share of households below a threshold
cat("P(income < 15000):", round(plnorm(15000, meanlog = mu_log, sdlog = sd_log), 4), "\n")

# 90th percentile income
cat("90th percentile (EUR):", round(qlnorm(0.9, meanlog = mu_log, sdlog = sd_log)), "\n")

Median income (EUR): 22026 
Mean income (EUR): 24959 
P(income < 15000): 0.2211 
90th percentile (EUR): 41805

28.21 Random Number Generator

By definition, if \(Z \sim \text{N}(\mu, \sigma^2)\) then \(X = e^Z \sim \text{LnN}(\mu, \sigma^2)\). Thus Lognormal random variates are generated by exponentiating Normal random variates:

\[ X = e^{\mu + \sigma Z_0}, \quad Z_0 \sim \text{N}(0,1) \]

set.seed(123)
n      <- 1000
mu_log <- 0
sd_log <- 0.5

# Via Normal transformation
z <- rnorm(n, mean = 0, sd = 1)
x_transformed <- exp(mu_log + sd_log * z)

# Built-in function
x_rlnorm <- rlnorm(n, meanlog = mu_log, sdlog = sd_log)

cat("Transformed: mean =", round(mean(x_transformed), 4),
    "  median =", round(median(x_transformed), 4), "\n")
cat("rlnorm():    mean =", round(mean(x_rlnorm), 4),
    "  median =", round(median(x_rlnorm), 4), "\n")
cat("Theoretical: mean =", round(exp(mu_log + sd_log^2/2), 4),
    "  median =", round(exp(mu_log), 4), "\n")

Transformed: mean = 1.1411   median = 1.0046 
rlnorm():    mean = 1.1598   median = 1.0278 
Theoretical: mean = 1.1331   median = 1

set.seed(123)
x <- rlnorm(1000, meanlog = 0, sdlog = 0.5)
hist(x, breaks = 40, col = "steelblue", freq = FALSE,
     xlab = "x", main = "Lognormal Random Numbers (n = 1000, mu = 0, sigma = 0.5)")
curve(dlnorm(x, meanlog = 0, sdlog = 0.5), add = TRUE, col = "red", lwd = 2)
legend("topright", legend = "Theoretical density", col = "red", lwd = 2)

Figure 28.3: Histogram of simulated Lognormal random numbers (n = 1000, mu = 0, sigma = 0.5)

28.22 Property 1: Defining Relationship with the Normal Distribution

The Lognormal distribution is defined by its relationship to the Normal distribution: if \(Y \sim \text{N}(\mu, \sigma^2)\) then \(X = e^Y \sim \text{LnN}(\mu, \sigma^2)\). Equivalently, \(\ln(X) \sim \text{N}(\mu, \sigma^2)\) (see Chapter 20).

28.23 Property 2: Closure Under Multiplication

If \(X_1 \sim \text{LnN}(\mu_1, \sigma_1^2)\) and \(X_2 \sim \text{LnN}(\mu_2, \sigma_2^2)\) are independent, then their product is also Lognormal:

\[ X_1 X_2 \sim \text{LnN}(\mu_1 + \mu_2,\; \sigma_1^2 + \sigma_2^2) \]

This follows from the fact that \(\ln(X_1 X_2) = \ln X_1 + \ln X_2\) and the sum of independent Normals is Normal.
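A short simulation illustrates the closure property. The parameter values below are arbitrary illustrative choices; the check compares the sample mean and variance of \(\ln(X_1 X_2)\) with \(\mu_1 + \mu_2\) and \(\sigma_1^2 + \sigma_2^2\).

```r
set.seed(2024)
n <- 100000

# Illustrative parameters for the two independent factors
mu1 <- 0.5;  sigma1 <- 0.3
mu2 <- -0.2; sigma2 <- 0.4

x1 <- rlnorm(n, meanlog = mu1, sdlog = sigma1)
x2 <- rlnorm(n, meanlog = mu2, sdlog = sigma2)
log_prod <- log(x1 * x2)

cat("Mean of log(X1*X2):", round(mean(log_prod), 3),
    "  theory:", mu1 + mu2, "\n")
cat("Var of log(X1*X2): ", round(var(log_prod), 3),
    "  theory:", sigma1^2 + sigma2^2, "\n")
```

Note that the variances add, not the standard deviations: the product has sdlog \(= \sqrt{\sigma_1^2 + \sigma_2^2}\).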

28.24 Property 3: Concentration as Shape Parameter Shrinks

As \(\sigma \to 0\), the Lognormal distribution degenerates: it concentrates all mass at \(e^\mu\). The mean, median, and mode all converge to \(e^\mu\), and the distribution becomes approximately Normal centered at \(e^\mu\).
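The convergence can be tabulated directly from the closed-form expressions for the mode, median, mean, variance, and skewness. The sketch below uses an illustrative \(\mu = 1\) and a decreasing sequence of \(\sigma\) values.

```r
mu    <- 1                        # illustrative value
sigma <- c(1, 0.5, 0.1, 0.01)

limit_table <- data.frame(
  sigma    = sigma,
  mode     = exp(mu - sigma^2),
  median   = exp(mu),
  mean     = exp(mu + sigma^2 / 2),
  sd       = sqrt((exp(sigma^2) - 1) * exp(2 * mu + sigma^2)),
  skewness = (exp(sigma^2) + 2) * sqrt(exp(sigma^2) - 1)
)
print(round(limit_table, 4))
```

As \(\sigma\) decreases, the three location measures approach \(e^{\mu}\) while the standard deviation and the skewness both approach 0.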
