58 Binomial Classification Metrics
58.1 Definition
Binomial Classification Metrics are summary statistics that are used to measure the predictive quality of binary classification models based on one or multiple rules. The purpose of these models is to predict whether or not an element of the dataset belongs to one group (A) or another (B). Each prediction that is made, can be true (correct) or false (erroneous).
In earlier chapters, we introduced Bayes’ Theorem (Chapter 7), Sensitivity and Specificity (Chapter 8), and the Multinomial Naive Bayes Classifier (Chapter 9). Once a classifier has been constructed, we need a way to evaluate its predictive performance. This is where the Contingency Table (Chapter 57) becomes particularly useful: by cross-tabulating actual outcomes against predicted outcomes, we obtain a summary known as the Confusion Matrix (Chapter 59).
58.2 Example
Suppose that we have computed a series of predictions based on a Binomial Classification Model that is used for fraud detection of payment transactions. Table 58.1 shows that four predictions (out of seven) are correct (i.e. the predictions for transactions 2, 3, 4, and 6). The remaining ones are erroneous because the predicted values do not correspond with the actual situation.
| Transaction | Is Fraudulent? | Prediction |
|---|---|---|
| 1 | No | Yes |
| 2 | Yes | Yes |
| 3 | No | No |
| 4 | No | No |
| 5 | Yes | No |
| 6 | No | No |
| 7 | Yes | No |
The prediction outcomes in Table 58.1 have been displayed for a limited number of transactions. In practice, however, such tables are likely to contain a large number of rows which makes it rather difficult to interpret the predictive performance of the model. Hence, we rely on the Contingency Table structure to create a summary—the Confusion Matrix—that is easy to interpret and communicate.