13  Binomial Distribution

13.1 Definition

The Binomial distribution answers the question: “what is the probability of exactly \(r\) successes after \(n\) independent Bernoulli trials with constant success probability \(p\)?”

\[ \text{P}(X = r) = \begin{cases} \binom{n}{r} p^r q^{n-r}, & r \in \{0,1,\dots,n\} \\ 0, & \text{otherwise} \end{cases} \]

where

\[ \binom{n}{r} = \frac{n!}{(n-r)!r!}, \quad 0 \le r \le n \]

and \(p\) = probability of success, \(q\) = probability of failure, \(p + q = 1\), \(n\) = number of independent trials, and \(X\) = number of successes.

In other words, the Binomial distribution describes the probability of \(X=r\) successes when \(n\) binary experiments are carried out independently with fixed success probability \(p\).
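As a minimal sketch of the definition, the probability mass function can be written out term by term and compared with R's built-in `dbinom`. The values of `n` and `p` below are hypothetical illustration values, not taken from the text.

```r
# Hypothetical illustration values (assumptions, not from the handbook)
n <- 10
p <- 0.3
q <- 1 - p
r <- 0:n

# PMF written out exactly as in the definition: choose(n, r) * p^r * q^(n - r)
pmf_manual <- choose(n, r) * p^r * q^(n - r)

# Built-in probability mass function
pmf_builtin <- dbinom(r, size = n, prob = p)

# Both versions should agree, and the probabilities should sum to 1
print(all.equal(pmf_manual, pmf_builtin))
print(sum(pmf_manual))
```

The comparison uses `all.equal` rather than `==` because the two computations may differ by floating-point rounding.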

Important contrast: if sampling is done without replacement from a finite population, the exact model is Hypergeometric (see Chapter 16), not Binomial.
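The contrast can be illustrated numerically. The sketch below assumes a hypothetical population in which 30% of the items are successes and compares the Hypergeometric probabilities (`dhyper`) with the Binomial ones for a small and a large population. The population sizes and the helper name `max_abs_diff` are assumptions made for this illustration.

```r
# Hypothetical setting: 30% successes in the population, sample of 10 draws
n_draws <- 10
r <- 0:n_draws
p <- 0.3

# Binomial: independent draws with constant success probability
pmf_binom <- dbinom(r, size = n_draws, prob = p)

# Hypergeometric: draws without replacement from N items (helper is hypothetical)
max_abs_diff <- function(N) {
  m <- round(N * p)  # number of successes in the population
  pmf_hyper <- dhyper(r, m = m, n = N - m, k = n_draws)
  max(abs(pmf_hyper - pmf_binom))
}

diff_small <- max_abs_diff(50)     # small finite population
diff_large <- max_abs_diff(50000)  # large finite population
print(c(N_50 = diff_small, N_50000 = diff_large))
```

The largest discrepancy is expected to shrink as the population grows, which is why the Binomial model is often used as an approximation when the sample is a small fraction of the population.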

13.2 Distribution Function

For integer \(k \in \{0,1,\dots,n\}\):

\[ F(k)=\text{P}(X \le k)=\sum_{r=0}^{k}\binom{n}{r}p^r q^{n-r}. \]
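A short sketch, again with hypothetical values for `n` and `p`, shows that the distribution function is the running sum of the probability mass function and that it matches `pbinom`.

```r
# Hypothetical illustration values (assumptions)
n <- 10
p <- 0.3
k <- 0:n

# F(k) as the cumulative sum of the PMF
cdf_manual <- cumsum(dbinom(k, size = n, prob = p))

# Built-in distribution function
cdf_builtin <- pbinom(k, size = n, prob = p)

print(all.equal(cdf_manual, cdf_builtin))
print(round(cdf_manual, 4))
```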

13.3 Mean

\[ \text{E}(X) = n p \]

13.4 Mode

\[ \text{Mo}(X) = \begin{cases} \lfloor (n+1)p \rfloor & \text{if } (n+1)p \notin \mathbb{N} \\ (n+1)p - 1 \text{ and } (n+1)p & \text{if } (n+1)p \in \mathbb{N} \end{cases} \]

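When \((n+1)p \in \mathbb{N}\) and \(0 < p < 1\), the distribution is bimodal: the two adjacent values \((n+1)p - 1\) and \((n+1)p\) are tied for the highest probability. In the degenerate cases \(p = 0\) and \(p = 1\), the only mode is \(0\) and \(n\), respectively.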
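The mode formula can be checked by locating the largest probabilities directly. The helper `binom_modes` below is a hypothetical name introduced for this sketch, and the two parameter settings are illustration values.

```r
# Hypothetical helper: return every r whose probability ties with the maximum
binom_modes <- function(n, p) {
  r <- 0:n
  pmf <- dbinom(r, size = n, prob = p)
  # a small tolerance guards against floating-point rounding in tied values
  r[abs(pmf - max(pmf)) < 1e-12]
}

# (n + 1) * p = 3.3 is not an integer: a single mode at floor(3.3)
modes_unimodal <- binom_modes(10, 0.3)

# (n + 1) * p = 5 is an integer: two adjacent modes
modes_bimodal <- binom_modes(9, 0.5)

print(modes_unimodal)
print(modes_bimodal)
```

According to the formula, the first call should return 3 and the second should return 4 and 5.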

13.5 Median

\[ \text{Med}(X) \approx np \]

More precisely, a binomial median is in \(\{\lfloor np \rfloor,\lceil np \rceil\}\). In some parameter settings the median is unique; in others both nearby integers satisfy the median condition.
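A minimal sketch with hypothetical parameter values: `qbinom(0.5, ...)` returns the smallest integer \(k\) with \(F(k) \ge 0.5\), which is one valid median, and this value can be compared with the two candidates \(\lfloor np \rfloor\) and \(\lceil np \rceil\).

```r
# Hypothetical illustration values (assumptions)
n <- 11
p <- 0.4

# Smallest k with P(X <= k) >= 0.5
med <- qbinom(0.5, size = n, prob = p)

# Candidate values from the text: floor(np) and ceiling(np)
candidates <- c(floor(n * p), ceiling(n * p))

print(med)
print(candidates)

# Median condition: P(X <= med) >= 0.5 and P(X >= med) >= 0.5
lower_ok <- pbinom(med, size = n, prob = p) >= 0.5
upper_ok <- 1 - pbinom(med - 1, size = n, prob = p) >= 0.5
print(c(lower_ok, upper_ok))
```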

13.6 Variance

\[ \text{V}(X) = n p q \]

Variance is maximal at \(p=0.5\) and approaches 0 as \(p \to 0\) or \(p \to 1\).
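The mean and variance formulas can be verified from the probability mass function, and the claim about the maximum can be checked on a grid of \(p\) values. The parameter values below are hypothetical and serve only as an illustration.

```r
# Hypothetical illustration values (assumptions)
n <- 12
p <- 0.25
q <- 1 - p
r <- 0:n
pmf <- dbinom(r, size = n, prob = p)

# Moments computed directly from the PMF
mean_pmf <- sum(r * pmf)
var_pmf <- sum((r - mean_pmf)^2 * pmf)

print(c(from_pmf = mean_pmf, formula = n * p))
print(c(from_pmf = var_pmf, formula = n * p * q))

# Variance n * p * (1 - p) as a function of p for fixed n
p_grid <- seq(0, 1, by = 0.1)
var_grid <- n * p_grid * (1 - p_grid)
p_at_max <- p_grid[which.max(var_grid)]
print(p_at_max)
```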

13.7 Moment Generating Function

\[ M_X(t) = (q + pe^t)^n \]
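As a worked sketch, the moment generating function follows from the binomial theorem, and its first two derivatives at \(t = 0\) reproduce the mean and variance given in Sections 13.3 and 13.6:

\[ M_X(t) = \text{E}\left(e^{tX}\right) = \sum_{r=0}^{n} \binom{n}{r} (pe^t)^r q^{n-r} = (q + pe^t)^n \]

\[ M_X'(t) = n p e^t (q + pe^t)^{n-1} \quad \Rightarrow \quad \text{E}(X) = M_X'(0) = n p \]

\[ M_X''(t) = n p e^t (q + pe^t)^{n-1} + n(n-1) p^2 e^{2t} (q + pe^t)^{n-2} \quad \Rightarrow \quad \text{E}(X^2) = M_X''(0) = n p + n(n-1) p^2 \]

\[ \text{V}(X) = \text{E}(X^2) - \left[\text{E}(X)\right]^2 = n p + n(n-1) p^2 - n^2 p^2 = n p (1 - p) = n p q \]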

13.8 Coefficient of Skewness

\[ g_1 = \frac{q-p}{\sqrt{npq}} \]

13.9 Coefficient of Kurtosis

\[ g_2 = 3 + \frac{1-6pq}{npq} \]

The corresponding excess kurtosis is \(\frac{1-6pq}{npq}\).
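Both coefficients can be checked against standardized moments computed from the probability mass function. The sketch below uses hypothetical values for `n` and `p`.

```r
# Hypothetical illustration values (assumptions)
n <- 20
p <- 0.2
q <- 1 - p
r <- 0:n
pmf <- dbinom(r, size = n, prob = p)

# Standardize the support using the exact mean and standard deviation
mu <- sum(r * pmf)
sigma <- sqrt(sum((r - mu)^2 * pmf))
z <- (r - mu) / sigma

# Third and fourth standardized moments from the PMF
g1_pmf <- sum(z^3 * pmf)
g2_pmf <- sum(z^4 * pmf)

# Closed-form expressions from the text
g1_formula <- (q - p) / sqrt(n * p * q)
g2_formula <- 3 + (1 - 6 * p * q) / (n * p * q)

print(c(skewness_pmf = g1_pmf, skewness_formula = g1_formula))
print(c(kurtosis_pmf = g2_pmf, kurtosis_formula = g2_formula))
```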

13.10 Parameter Estimation

For one observed binomial count \(X\) with known trial size \(n\), the maximum-likelihood estimator is

\[ \hat p = \frac{X}{n}. \]

For a sample \(X_1,\dots,X_m\) of binomial counts with common \((n,p)\):

\[ \hat p = \frac{\bar X}{n}. \]
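The estimator can be tried on simulated data. The following sketch assumes hypothetical values for the trial size, the true success probability, and the number of observed counts; the seed is fixed only to make the simulation reproducible.

```r
set.seed(123)  # arbitrary seed, for reproducibility only

# Hypothetical simulation settings (assumptions)
n_trials <- 20   # known trial size n
p_true <- 0.35   # true success probability
m <- 500         # number of observed binomial counts

# Simulated sample X_1, ..., X_m
x <- rbinom(m, size = n_trials, prob = p_true)

# Maximum-likelihood estimate: sample mean divided by n
p_hat <- mean(x) / n_trials

# Equivalent form: total successes divided by total trials
p_hat_pooled <- sum(x) / (m * n_trials)

print(c(p_hat = p_hat, p_hat_pooled = p_hat_pooled, p_true = p_true))
```

The two expressions are algebraically identical; the pooled form makes explicit that the estimate is the overall proportion of successes across all \(m \times n\) trials.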

13.11 R Module

The Binomial Probabilities software can be found on the public website:

  • https://compute.wessa.net/rwasp_binomial.wasp

The Binomial Probabilities are also available in the menu “Distributions / Binomial Probabilities” of this handbook.

If you prefer to compute Binomial Probabilities on your local computer, the following code can be pasted into the R console:

success_threshold <- 1  # evaluate probabilities at this success count
n_trials <- 10          # number of Bernoulli trials
p_success <- 0.5        # success probability
r <- pbinom(success_threshold, n_trials, p_success)
print('Probabilities of the Binomial Distribution')
print(paste('P(X <= ', success_threshold, ') = ', r, sep=''))
print(paste('P(X > ', success_threshold, ') = ', 1-r, sep=''))
print(paste('P(X = ', success_threshold, ') = ', dbinom(success_threshold, n_trials, p_success), sep=''))
[1] "Probabilities of the Binomial Distribution"
[1] "P(X <= 1) = 0.0107421875"
[1] "P(X > 1) = 0.9892578125"
[1] "P(X = 1) = 0.00976562499999999"

Observe how we define three parameters (success_threshold, n_trials, and p_success) which represent the user input fields of the R module. In fact, all the R scripts displayed in this handbook correspond (to a certain degree) to the R modules that are available on the web.

The main functions that are used are pbinom (distribution function / CDF) and dbinom (probability mass function / PMF). In R, the d prefix is used for both continuous densities and discrete mass functions.
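As a brief sketch of these functions, the upper-tail probability can also be requested directly through the `lower.tail` argument of `pbinom` instead of subtracting from 1, and `dbinom` accepts a vector of success counts. The variable names mirror the script above; the rest is illustrative.

```r
success_threshold <- 1
n_trials <- 10
p_success <- 0.5

# Upper tail P(X > threshold), computed in two equivalent ways
upper_by_complement <- 1 - pbinom(success_threshold, n_trials, p_success)
upper_direct <- pbinom(success_threshold, n_trials, p_success, lower.tail = FALSE)
print(c(upper_by_complement, upper_direct))

# dbinom is vectorized: the whole PMF in one call
pmf_all <- dbinom(0:n_trials, size = n_trials, prob = p_success)
print(round(pmf_all, 4))
```

Using `lower.tail = FALSE` can be preferable when the tail probability is very small, because it avoids the loss of precision that may occur when subtracting a value close to 1.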

13.12 Example

Let us reconsider the hospital-birth simulation from Section 11.2.1, where sampling variability is compared across different hospital sizes. The number of trials \(n\) is the average number of births in a hospital. The number of “successes” \(r\) to be evaluated is 60% of \(n\). The probability of a “success” is \(p = 0.5\).

For the large hospital, \(n = 45\) and \(0.60 n = 0.60 \times 45 = 27\). Hence, \(\text{P}(X \leq 27) \simeq 0.9324\) and \(\text{P}(X > 27) \simeq 1 - 0.9324 = 0.0676\). You can verify this result with the R console:

success_threshold <- round(0.6 * 45)  # number of successes to be evaluated (rounded to guard against floating-point error)
n_trials <- 45                        # average number of births in the large hospital
p_success <- 0.5                      # probability of a success
r <- pbinom(success_threshold, n_trials, p_success)  # P(X <= 27)
print('Probabilities of the Binomial Distribution')
print(paste('P(X <= ', success_threshold, ') = ', r, sep=''))
print(paste('P(X > ', success_threshold, ') = ', 1-r, sep=''))
print(paste('P(X = ', success_threshold, ') = ', dbinom(success_threshold, n_trials, p_success), sep=''))
[1] "Probabilities of the Binomial Distribution"
[1] "P(X <= 27) = 0.932421774577165"
[1] "P(X > 27) = 0.0675782254228353"
[1] "P(X = 27) = 0.0487683705313201"

or with the online Binomial Probabilities module referenced in Section 13.11.

13.13 Additional Academic Example: Vaccine Response Count

In a pilot vaccine study, \(n=30\) participants are independently assessed for seroconversion, with historical response probability \(p=0.6\). What is the probability that at least 22 participants respond?

n_participants <- 30
p_response <- 0.6

cat("P(X >= 22) =",
    1 - pbinom(21, size = n_participants, prob = p_response), "\n")
P(X >= 22) = 0.09401122 

© 2026 Patrick Wessa. Provided as-is, without warranty.
