Table of contents

  • 113.1 Practical Relevance
  • 113.2 Beta-Binomial Update (Core Model)
    • 113.2.1 Conjugate Priors (what this means)
    • 113.2.2 Choosing a Beta prior in practice
  • 113.3 Decision Rule in Posterior Terms
  • 113.4 Bayes Factor and Bayes Error Rate
    • 113.4.1 Bayes Factor
    • 113.4.2 How large should BF10 be?
    • 113.4.3 Bayes Error Rate (Theoretical vs Operational)
  • 113.5 Business Example: Fraud-Rate Decision
    • 113.5.1 R snippet 1: posterior update and decision summary
    • 113.5.2 R snippet 2: prior vs posterior view
    • 113.5.3 R snippet 3: sensitivity to alternative reasonable priors
    • 113.5.4 R snippet 4: Bayes factor for a point-null comparison
    • 113.5.5 R snippet 5: practical Bayes decision error on scored cases
    • 113.5.6 Interactive app (same logic)
  • 113.6 Relation to Other Chapters
  • 113.7 Practical Takeaway
  • 113.8 Practical Exercises

113  Bayesian Inference for Decision-Making

Bayesian inference is often a more natural decision framework than mechanical testing against a single fixed alpha, because it reports its results directly in decision terms:

  • posterior probability of the claim,
  • posterior probability of being wrong if we act now,
  • and posterior uncertainty intervals.

This chapter connects the ideas in Chapter 7 and Chapter 30 to the threshold logic in Chapter 112.

113.1 Practical Relevance

In many applied settings, decision makers do not ask:

  • “Is the p-value below 5%?”

They ask:

  • “Given the data, how likely is the claim?”
  • “How much risk of being wrong remains if we act now?”
  • “What decision threshold is appropriate for this business context?”

Bayesian inference answers these directly through posterior probabilities.

113.2 Beta-Binomial Update (Core Model)

Suppose an unknown event probability is denoted by \(p\) (for example: fraud rate, conversion rate, default rate).
We observe \(n\) trials and count how many of them are events.

The notation is:

  • \(p\): the unknown underlying event probability in the population or process
  • \(n\): the number of observed trials
  • \(Y\): the random variable for the number of events in those \(n\) trials, before the data are observed
  • \(y\): the realized, observed number of events after the data are collected

So \(y / n\) is the observed sample proportion, not the true probability itself. It is a data-based estimate of \(p\), but it is not equal to \(p\).

For Binomial data with \(y\) observed events in \(n\) trials:

\[ Y \mid p \sim \text{Binomial}(n, p) \]

and with a Beta prior:

\[ p \sim \text{Beta}(\alpha_0, \beta_0), \]

the posterior is:

\[ p \mid y \sim \text{Beta}(\alpha_0 + y,\; \beta_0 + n - y). \]

This is the practical bridge between:

  • Bayes’ theorem (Chapter 7),
  • the Beta distribution (Chapter 30),
  • and threshold-based decisions (Chapter 112).
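
As a preview of the fraud-rate example in Section 113.5, the update is just a matter of adding counts: a \(\text{Beta}(3, 97)\) prior combined with \(y = 7\) events in \(n = 400\) trials gives

\[ p \mid y \sim \text{Beta}(3 + 7,\; 97 + 400 - 7) = \text{Beta}(10,\; 490). \]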

113.2.1 Conjugate Priors (what this means)

A prior is called conjugate for a likelihood when the posterior stays in the same distribution family after updating with data.

In this chapter:

  • prior: \(p \sim \text{Beta}(\alpha_0,\beta_0)\)
  • likelihood: \(Y\mid p \sim \text{Binomial}(n,p)\)
  • posterior: \(p\mid y \sim \text{Beta}(\alpha_0+y,\beta_0+n-y)\)

This matters because updating becomes transparent, fast, and easy to explain in teaching and practice.
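
A quick numerical check makes the conjugacy claim concrete. The sketch below (a teaching aid, not part of the chapter's main snippets; it reuses the prior and data from the fraud-rate example later in this chapter) compares the closed-form Beta posterior with a brute-force grid update of prior times likelihood; the two agree up to grid error:

alpha0 <- 3; beta0 <- 97   # prior
n <- 400; y <- 7           # data (same as the fraud-rate example below)

p_grid <- seq(0.001, 0.999, length.out = 10000)

# Closed-form conjugate posterior density
post_conjugate <- dbeta(p_grid, alpha0 + y, beta0 + n - y)

# Brute force: prior times likelihood, renormalized on the grid
unnorm <- dbeta(p_grid, alpha0, beta0) * dbinom(y, n, p_grid)
post_grid <- unnorm / (sum(unnorm) * (p_grid[2] - p_grid[1]))

max(abs(post_conjugate - post_grid))  # near zero: the two posteriors coincide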

113.2.2 Choosing a Beta prior in practice

The Beta prior is not just a technical convenience. It is also easy to interpret:

\[ E(p) = \frac{\alpha_0}{\alpha_0 + \beta_0}. \]

So:

  • the ratio \(\alpha_0 / (\alpha_0 + \beta_0)\) sets your best prior guess for the event rate,
  • the total \(\alpha_0 + \beta_0\) controls how strongly that prior pulls on the posterior.

In practice, you can think of \(\alpha_0 + \beta_0\) as the prior strength in pseudo-observations. A prior such as \(\text{Beta}(3,97)\) says: “before seeing the new data, I expect about 3% events, and I hold that view with roughly the weight of 100 prior observations.”

This is exactly why \(\text{Beta}(3,97)\), \(\text{Beta}(30,970)\), and \(\text{Beta}(300,9700)\) are not equivalent even though they all have prior mean \(0.03\):

  • \(\text{Beta}(3,97)\) has prior strength \(100\),
  • \(\text{Beta}(30,970)\) has prior strength \(1000\),
  • \(\text{Beta}(300,9700)\) has prior strength \(10000\).

All three priors say “I expect about 3% events,” but they differ in how stubborn that belief is. A stronger prior is harder for the new data to move. It is therefore useful to separate two ideas:

  • the mean tells you where the prior is centered,
  • the strength tells you how strongly it resists being updated.
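
A short R sketch, reusing the data from the fraud-rate example below (\(y = 7\) events in \(n = 400\) trials), shows how prior strength governs how far the same data can move the posterior mean:

y <- 7; n <- 400   # data from the fraud-rate example below

# Three priors with the same mean 0.03 but increasing strength
priors <- data.frame(alpha0 = c(3, 30, 300), beta0 = c(97, 970, 9700))
priors$strength  <- priors$alpha0 + priors$beta0
priors$post_mean <- (priors$alpha0 + y) / (priors$strength + n)
priors

The posterior means come out at \(0.0200\), \(0.0264\), and \(0.0295\): the weakest prior is pulled most of the way toward the observed rate of \(0.0175\), while the strongest prior barely moves from its center of \(0.03\).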

113.3 Decision Rule in Posterior Terms

Define a business-relevant claim, for example \(p < p_0\) or \(p > p_0\).
Here \(p_0\) is the decision threshold that separates acceptable from unacceptable values of the event probability. For example, if management says “the fraud rate must be below 2%,” then \(p_0 = 0.02\).

Then compute the posterior probability of that claim:

\[ q = P(\text{claim} \mid \text{data}). \]

Choose a decision threshold \(\tau\) by context (confirmatory, balanced, diagnostic):

\[ \text{Support claim if } q \ge \tau. \]

This makes the threshold explicit and interpretable:

  • strict settings: high \(\tau\) (e.g. 0.98),
  • diagnostic/screening settings: lower \(\tau\) (e.g. 0.80).
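
A minimal R sketch of this rule for a claim of the form \(p < p_0\), assuming a Beta posterior (the helper name decide_claim is illustrative, not from the chapter's main snippets; for a claim \(p > p_0\), replace pbeta(...) by 1 - pbeta(...)):

# Decision rule: support the claim p < p0 when P(p < p0 | data) >= tau
decide_claim <- function(alpha_post, beta_post, p0, tau) {
  q <- pbeta(p0, alpha_post, beta_post)   # posterior probability of the claim
  list(q = q, support = (q >= tau))
}

# Example with the posterior Beta(10, 490) derived later in this chapter
decide_claim(alpha_post = 10, beta_post = 490, p0 = 0.02, tau = 0.98)

With the strict threshold \(\tau = 0.98\), this example posterior gives \(q \approx 0.54\), so the claim is not supported.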

113.4 Bayes Factor and Bayes Error Rate

113.4.1 Bayes Factor

For point-null comparison, we can compare two competing statistical statements:

  • \(H_0\): the event probability is fixed exactly at the threshold value, \(p = p_0\),
  • \(H_1\): the event probability is not fixed at one value; instead, many plausible values of \(p\) are allowed, weighted by a Beta prior.

So when this chapter says “\(H_1\) with a Beta prior,” it means that under \(H_1\) we do not commit to one single value of \(p\). We spread prior belief across a whole range of possible values of \(p\) using a Beta distribution.

The Bayes factor is then:

\[ \text{BF}_{10} = \frac{P(\text{data}\mid H_1)}{P(\text{data}\mid H_0)}. \]

It quantifies relative evidence between models, while \(P(p < p_0 \mid \text{data})\) answers a directional posterior claim about the parameter.
These are related but distinct: \(\text{BF}_{10}\) compares \(H_0: p=p_0\) to the full \(H_1\) prior (all plausible \(p\) values), so \(\text{BF}_{10}\) and \(P(p < p_0 \mid \text{data})\) can point in different directions.
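
In the Beta-Binomial setting, both marginal likelihoods have closed forms, obtained by integrating the Binomial likelihood against the Beta prior; these are exactly the expressions evaluated on the log scale in R snippet 4 below:

\[ P(\text{data}\mid H_1) = \binom{n}{y}\,\frac{B(\alpha_0 + y,\; \beta_0 + n - y)}{B(\alpha_0, \beta_0)}, \qquad P(\text{data}\mid H_0) = \binom{n}{y}\, p_0^{\,y} (1 - p_0)^{\,n - y}, \]

where \(B(\cdot,\cdot)\) denotes the Beta function.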

113.4.2 How large should BF10 be?

Students often ask for a single universal cutoff (for example: BF10 > 3 or > 10).
This is usually the wrong first question for the same reason that one fixed alpha is problematic.

A better workflow is:

  1. specify the decision context and the costs of errors,
  2. report the Bayes factor and posterior model probability,
  3. show sensitivity to alternative reasonable priors,
  4. then decide whether evidence is sufficient for the concrete decision.

So Bayes-factor thresholds are context guides, not universal constants.

For orientation, however, students often benefit from a first reading scale:

  • \(\text{BF}_{10}\) around 1: little separation between \(H_1\) and \(H_0\)
  • \(\text{BF}_{10}\) between 3 and 10: moderate support for \(H_1\)
  • \(\text{BF}_{10}\) greater than 10: strong support for \(H_1\)
  • \(\text{BF}_{10}\) below 1: evidence tilts toward \(H_0\) instead

This table is only a starting point. It should never replace context, prior sensitivity, and a clear statement of the actual decision threshold.

113.4.3 Bayes Error Rate (Theoretical vs Operational)

In binary decision settings, two different quantities are often called “Bayes error”:

  • Theoretical Bayes error rate: minimum achievable misclassification probability under the true class-conditional distributions (not directly computable in practical finite-data settings).
  • Operational posterior decision error: threshold-dependent local decision error computed from posterior class probabilities.

The reason both appear in the literature is that they answer different questions:

  • the theoretical quantity is a benchmark or lower bound,
  • the operational quantity is what you can actually compute from model-based posterior scores in practice.

For the operational quantity, once posterior class probabilities are available, each case has a local decision error probability:

  • if classify as positive: \(1 - P(\text{positive}\mid x)\),
  • if classify as negative: \(P(\text{positive}\mid x)\).

The average of these local error probabilities over evaluated cases provides a practical, threshold-dependent decision error summary.
The R snippet below reports this operational quantity.

113.5 Business Example: Fraud-Rate Decision

Assume a payment team deploys a new rule and wants to support the operational claim:

\[ p < 0.02 \]

where \(p\) is the underlying fraud rate after deployment.

Prior belief from historical process knowledge:

\[ p \sim \text{Beta}(3, 97) \]

(prior mean \(= \alpha_0 / (\alpha_0 + \beta_0) = 3 / 100 = 3\%\)).
Observed in a recent batch: \(y=7\) fraud events out of \(n=400\).

The observed fraud rate in the batch is

\[ \hat{p} = \frac{7}{400} = 0.0175, \]

which is below the target threshold of 2%. Even so, the posterior probability of the claim will not automatically be high, because the prior mean was 3% and 400 observations do not completely overwhelm that prior.

113.5.1 R snippet 1: posterior update and decision summary

# Prior and data
alpha0 <- 3
beta0  <- 97
n <- 400
y <- 7
p0 <- 0.02
tau <- 0.90   # decision threshold for P(p < p0 | data)

# Posterior
alpha_post <- alpha0 + y
beta_post  <- beta0 + n - y

prior_mean <- alpha0 / (alpha0 + beta0)
observed_rate <- y / n
post_mean <- alpha_post / (alpha_post + beta_post)
post_prob_claim <- pbeta(p0, shape1 = alpha_post, shape2 = beta_post)  # P(p < p0 | data)
ci90 <- qbeta(c(0.05, 0.95), shape1 = alpha_post, shape2 = beta_post)

decision <- if (post_prob_claim >= tau) "Support claim p < 0.02" else "Do not support claim yet"

if (post_prob_claim >= tau) {
  risk_label <- "Probability of decision error if we support the claim now"
  risk_value <- 1 - post_prob_claim
} else {
  risk_label <- "Probability the claim is true even though we do not support it yet"
  risk_value <- post_prob_claim
}

summary_tab <- data.frame(
  quantity = c(
    "Prior mean",
    "Observed rate",
    "Posterior mean",
    "P(p < 0.02 | data)",
    "90% credible interval",
    "Decision",
    risk_label
  ),
  value = c(
    sprintf("%.4f", prior_mean),
    sprintf("%.4f", observed_rate),
    sprintf("%.4f", post_mean),
    sprintf("%.4f", post_prob_claim),
    sprintf("[%.4f, %.4f]", ci90[1], ci90[2]),
    decision,
    sprintf("%.4f", risk_value)
  )
)

knitr::kable(summary_tab, col.names = c("Summary", "Value"))
Summary                                                              Value
Prior mean                                                           0.0300
Observed rate                                                        0.0175
Posterior mean                                                       0.0200
P(p < 0.02 | data)                                                   0.5408
90% credible interval                                                [0.0109, 0.0313]
Decision                                                             Do not support claim yet
Probability the claim is true even though we do not support it yet   0.5408

113.5.2 R snippet 2: prior vs posterior view

x <- seq(0, 0.06, length.out = 600)
prior_dens <- dbeta(x, alpha0, beta0)
post_dens <- dbeta(x, alpha_post, beta_post)

plot(x, prior_dens, type = "l", lwd = 2, col = "steelblue",
     xlab = "Fraud rate p", ylab = "Density", main = "")
lines(x, post_dens, lwd = 2, col = "firebrick")
abline(v = p0, lty = 2, lwd = 2, col = "grey40")

shade_x <- x[x <= p0]
shade_y <- post_dens[x <= p0]
polygon(c(shade_x, rev(shade_x)), c(shade_y, rep(0, length(shade_y))),
        col = adjustcolor("firebrick", alpha.f = 0.20), border = NA)

legend("topright",
       legend = c("prior", "posterior", "threshold p = 0.02"),
       col = c("steelblue", "firebrick", "grey40"),
       lty = c(1, 1, 2), lwd = c(2, 2, 2), bty = "n", cex = 0.85)

Prior and posterior Beta densities for the fraud-rate example. The vertical line marks the operational claim threshold p = 0.02.

The shaded posterior area to the left of \(p_0 = 0.02\) is exactly the probability used in the decision rule:

\[ P(p < 0.02 \mid \text{data}) \approx 0.54. \]

That makes the decision logic easier to interpret: even though the observed rate is below 2%, the posterior mass below 2% is still only about 54%, far below the decision threshold \(\tau = 0.90\).

113.5.3 R snippet 3: sensitivity to alternative reasonable priors

prior_grid <- data.frame(
  prior = c("Uniform prior", "Historical prior", "Stronger historical prior"),
  alpha0 = c(1, 3, 6),
  beta0  = c(1, 97, 194)
)

sensitivity_tab <- do.call(
  rbind,
  lapply(seq_len(nrow(prior_grid)), function(i) {
    a0 <- prior_grid$alpha0[i]
    b0 <- prior_grid$beta0[i]
    a1 <- a0 + y
    b1 <- b0 + n - y
    q_claim <- pbeta(p0, a1, b1)
    data.frame(
      prior = prior_grid$prior[i],
      prior_mean = round(a0 / (a0 + b0), 4),
      posterior_mean = round(a1 / (a1 + b1), 4),
      `P(p < 0.02 | data)` = round(q_claim, 4),
      decision_at_tau_0.90 = ifelse(q_claim >= tau, "support", "do not support"),
      check.names = FALSE   # keep the readable column name in the kable output
    )
  })
)

knitr::kable(sensitivity_tab)
prior                      prior_mean  posterior_mean  P(p < 0.02 | data)  decision_at_tau_0.90
Uniform prior                    0.50          0.0199              0.5513  do not support
Historical prior                 0.03          0.0200              0.5408  do not support
Stronger historical prior        0.03          0.0217              0.4217  do not support

This table is not an argument for changing priors until a desired decision appears. It is a reminder that Bayesian decisions should be accompanied by reasonable prior sensitivity checks.

113.5.4 R snippet 4: Bayes factor for a point-null comparison

# Bayes factor H0: p = p0 versus H1: p ~ Beta(alpha0, beta0)
log_m1 <- lchoose(n, y) + lbeta(alpha_post, beta_post) - lbeta(alpha0, beta0)
log_m0 <- lchoose(n, y) + y * log(p0) + (n - y) * log(1 - p0)
bf10 <- exp(log_m1 - log_m0)

cat("BF10 (H1 vs H0: p = 0.02) =", round(bf10, 6), "\n")
BF10 (H1 vs H0: p = 0.02) = 0.428271 

Here the Bayes factor is slightly below 1, so it tilts mildly toward the point null \(H_0: p = 0.02\). That does not contradict the posterior claim probability above. The two quantities answer different questions:

  • the posterior claim probability asks whether \(p\) is below 0.02,
  • the Bayes factor compares one exact point null against the whole alternative prior.

113.5.5 R snippet 5: practical Bayes decision error on scored cases

# The fraud-rate example above concerned one population parameter p.
# In a scoring system, we instead work with posterior probabilities for individual cases.
# The values below are illustrative case-level scores; they are not computed from the Beta-Binomial example above.
post_fraud <- c(0.03, 0.08, 0.12, 0.19, 0.27, 0.44, 0.61, 0.74, 0.86, 0.93)
threshold <- 0.25

pred_label <- ifelse(post_fraud >= threshold, "fraud", "legit")
local_error <- ifelse(pred_label == "fraud", 1 - post_fraud, post_fraud)

cat("Average local decision error probability =", round(mean(local_error), 5), "\n")
Average local decision error probability = 0.257 

113.5.6 Interactive app (same logic)

An interactive Shiny app accompanies this chapter and implements the same posterior-update and decision logic.

Try the app with the following changes:

  • increase \(n\) while keeping the observed fraud rate roughly constant, and watch the posterior tighten,
  • replace \(\text{Beta}(3,97)\) by \(\text{Beta}(1,1)\) and compare the decision,
  • switch the direction from \(p < p_0\) to \(p > p_0\) and see how the posterior claim probability changes.
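
The first experiment is also easy to reproduce in plain R. A minimal sketch, holding the observed fraud rate near \(1.75\%\) while \(n\) grows (the larger sample sizes are illustrative, not from the chapter's data):

alpha0 <- 3; beta0 <- 97; p0 <- 0.02

for (n in c(400, 1600, 6400)) {
  y <- round(0.0175 * n)                      # keep the observed rate roughly constant
  q <- pbeta(p0, alpha0 + y, beta0 + n - y)   # P(p < p0 | data)
  cat("n =", n, " y =", y, " P(p < 0.02 | data) =", round(q, 4), "\n")
}

As \(n\) grows, the posterior concentrates around the observed rate and the claim probability climbs toward 1.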

113.6 Relation to Other Chapters

  • For theorem-level foundations: Chapter 7
  • For Beta prior/posterior mechanics: Chapter 30
  • For threshold selection across contexts: Chapter 112
  • For posterior analysis in two-sample settings: Chapter 122

113.7 Practical Takeaway

Bayesian inference does not remove the need for threshold choice. It improves the workflow by making thresholds explicit in posterior probability terms and by reporting decision risk directly.

113.8 Practical Exercises

  1. Recompute the fraud example with \(\text{Beta}(1,1)\) instead of \(\text{Beta}(3,97)\). How much does \(P(p < 0.02 \mid \text{data})\) change?
  2. Keep the prior \(\text{Beta}(3,97)\) but change the batch to \(y = 20\) frauds out of \(n = 1000\). Does the claim \(p < 0.02\) now reach a 90% decision threshold?
  3. In the scored-case example, raise the threshold from 0.25 to 0.60. What happens to the average local decision error?
  4. Explain in your own words why \(\text{BF}_{10}\) and \(P(p < p_0 \mid \text{data})\) are not the same quantity, even when they are based on the same observed data.

© 2026 Patrick Wessa. Provided as-is, without warranty.

Cookie Preferences