DRAFT
This draft is under development — DO NOT CITE OR SHARE.
Appendix B — Presentations and Teaching Materials
Use the slide decks below for class sessions and review.
Introduction to Probability
Introduction to Distributions
Descriptive Statistics and EDA (Lecture 1)
Descriptive Statistics and EDA (Lecture 2)