
Table of contents

  • 84.1 Definition
  • 84.2 R Module
    • 84.2.1 Public website
    • 84.2.2 RFC
  • 84.3 Purpose
  • 84.4 Pros & Cons
    • 84.4.1 Pros
    • 84.4.2 Cons
  • 84.5 Example
  • 84.6 Task

84  Survey Scores Rank Order Comparison

84.1 Definition

The Survey Scores Rank Order Comparison (SSROC) assesses whether a series of survey scores for similar questions¹ (based on a Likert scale²) can be treated as quasi-interval variables, i.e. quantitative variables for which it is meaningful to compute quantitative statistics such as the Arithmetic Mean.

In a first step, the SSROC computes the following statistics:

  • The Arithmetic Average \(\bar{X}\) of the Likert Scores for each item (i.e. question).
  • The sum of positive scores \(P_s\) for the Likert scores, after subtracting the midpoint of the Likert scale.
  • The absolute sum of negative scores \(N_s\) for the Likert scores, after subtracting the midpoint of the Likert scale.
  • The count of positive scores \(P_c\) for the Likert scores, after subtracting the midpoint of the Likert scale.
  • The count of negative scores \(N_c\) for the Likert scores, after subtracting the midpoint of the Likert scale.
  • The sum-based statistic \(A_s = \frac{P_s - N_s}{P_s + N_s}\).
  • The count-based statistic \(A_c = \frac{P_c - N_c}{P_c + N_c}\).

For instance, for the 5-point Likert scores \(X = [4, 3, 1]\) (with scale midpoint 3), the following values are obtained:

  • \(\bar{X} = \frac{4 + 3 + 1}{3} \simeq 2.667\).
  • \(P_s = 1\) because the sum of positive values in \(X - \text{midpoint of Likert scale} = [4 - 3, 3 - 3, 1 - 3] = [1, 0, -2]\) is 1.
  • \(N_s = 2\) because the absolute sum of negative values in \(X - \text{midpoint of Likert scale} = [4 - 3, 3 - 3, 1 - 3] = [1, 0, -2]\) is 2.
  • \(P_c = 1\) because the count of positive values in \([1, 0, -2]\) is 1.
  • \(N_c = 1\) because the count of negative values in \([1, 0, -2]\) is 1.
  • The sum-based average \(A_s = \frac{1 - 2}{1 + 2} = \frac{-1}{3}\).
  • The count-based average \(A_c = \frac{1 - 1}{1 + 1} = 0\).
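The worked example above can be reproduced with a few lines of R; `ssroc_stats` is a hypothetical helper name used here for illustration, not part of the book's module:

```r
# Compute the SSROC statistics for one item's Likert scores.
# 'scores' are the raw Likert values; 'midpoint' is the scale midpoint.
ssroc_stats <- function(scores, midpoint) {
  d  <- scores - midpoint          # deviations from the scale midpoint
  Ps <- sum(d[d > 0])              # sum of positive deviations
  Ns <- sum(abs(d[d < 0]))         # absolute sum of negative deviations
  Pc <- sum(d > 0)                 # count of positive deviations
  Nc <- sum(d < 0)                 # count of negative deviations
  c(mean = mean(scores),
    As   = (Ps - Ns) / (Ps + Ns),  # sum-based statistic
    Ac   = (Pc - Nc) / (Pc + Nc))  # count-based statistic
}

ssroc_stats(c(4, 3, 1), midpoint = 3)
# mean ≈ 2.667,  As ≈ -0.333,  Ac = 0
```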

The Arithmetic Mean should only be used if the underlying Likert scores \(X\) can be interpreted as a (truly) quantitative variable rather than an ordinal variable. The count-based average \(A_c\), however, can always be used without problems (even if the Likert scores cannot be interpreted as a quantitative variable). The sum-based average \(A_s\) lies somewhere in between.

Hence, if all these statistics are computed for similar questions, it is possible to assess whether the three statistics (i.e. \(\bar{X}\), \(A_s\), and \(A_c\)) preserve their rank orders. If the rank order is preserved (i.e. the Rank Order Correlation is close to +1), then the Likert scores can be interpreted as quantitative variables, because the ranking of similar questions does not depend on which statistic is used. However, if the rank orders are not strongly correlated, it is certainly not wise to treat the Likert scores as a truly quantitative variable. As a practical guideline, Rank Order Correlations above 0.9 suggest that the three statistics agree and the Arithmetic Mean can be used with reasonable confidence; lower correlations indicate that the ordinal nature of the scale should be respected.
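This rank-order check can be sketched in R as follows; the item-level values below are invented for illustration only:

```r
# Hypothetical SSROC statistics for five similar items
xbar <- c(3.2, 2.8, 3.9, 3.1, 2.5)          # arithmetic means
As   <- c(0.10, -0.05, 0.40, 0.05, -0.20)   # sum-based statistics
Ac   <- c(0.08, -0.04, 0.35, 0.06, -0.15)   # count-based statistics

# Kendall rank correlations between the three statistics
m <- cbind(xbar = xbar, As = As, Ac = Ac)
k <- cor(m, method = "kendall")
print(round(k, 3))

# Practical guideline: rank correlations above 0.9 suggest the three
# statistics rank the items in the same way
all(k[upper.tri(k)] > 0.9)
# → TRUE (here the three statistics order the items identically)
```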

84.2 R Module

84.2.1 Public website

The SSROC is available on the public website:

  • https://compute.wessa.net/rwasp_surveyscores.wasp

84.2.2 RFC

The SSROC is also available in RFC under the “Descriptive / Multivariate Descriptive Statistics” menu item.

To compute the SSROC on your local machine, the following script can be used in the R console:

# Simulate a survey with 241 respondents and 20 items (scores 1..5)
x <- array(round(runif(241 * 20, 1, 5)), dim = c(241, 20),
           dimnames = list(1:241, paste('A', 1:20, sep = '')))
par1 <- '1 2 3 4 5'  # the Likert scale; its median is used as the midpoint

# Helper that formats a correlation estimate with its p-value
# (defined for convenience; not used below)
docor <- function(x, y, method) {
  r <- cor.test(x, y, method = method)
  paste(round(r$estimate, 3), ' (', round(r$p.value, 3), ')', sep = '')
}

nx <- nrow(x)  # number of respondents
cx <- ncol(x)  # number of items
mymedian <- median(as.numeric(strsplit(par1, ' ')[[1]]))  # scale midpoint

myresult <- array(NA, dim = c(cx, 7))
rownames(myresult) <- paste('Q', 1:cx, sep = '')
colnames(myresult) <- c('mean',
                        'Sum of pos (Ps)',
                        'Sum of neg (Ns)',
                        '(Ps-Ns)/(Ps+Ns)',
                        'Count of pos (Pc)',
                        'Count of neg (Nc)',
                        '(Pc-Nc)/(Pc+Nc)')

for (i in 1:cx) {
  spos <- 0
  sneg <- 0
  cpos <- 0
  cneg <- 0
  for (j in 1:nx) {
    if (!is.na(x[j, i])) {
      myx <- as.numeric(x[j, i]) - mymedian  # center the score on the midpoint
      if (myx > 0) {
        spos <- spos + myx
        cpos <- cpos + 1
      }
      if (myx < 0) {
        sneg <- sneg + abs(myx)
        cneg <- cneg + 1
      }
    }
  }
  myresult[i, 1] <- round(mean(as.numeric(x[, i]), na.rm = TRUE) - mymedian, 2)
  myresult[i, 2] <- spos
  myresult[i, 3] <- sneg
  myresult[i, 4] <- round((spos - sneg) / (spos + sneg), 2)
  myresult[i, 5] <- cpos
  myresult[i, 6] <- cneg
  myresult[i, 7] <- round((cpos - cneg) / (cpos + cneg), 2)
}
print(myresult)

cat("\nPearson correlations of survey scores\n")
cor(myresult[, c(1, 4, 7)], method = "pearson")
cat("\nKendall rank correlations of survey scores\n")
cor(myresult[, c(1, 4, 7)], method = "kendall")
     mean Sum of pos (Ps) Sum of neg (Ns) (Ps-Ns)/(Ps+Ns) Count of pos (Pc)
Q1  -0.02             112             116           -0.02                83
Q2  -0.15             103             140           -0.15                80
Q3  -0.08             123             142           -0.07                91
Q4  -0.04             110             120           -0.04                87
Q5  -0.13             105             136           -0.13                75
Q6   0.13             146             114            0.12               103
Q7  -0.06             118             132           -0.06                91
Q8   0.05             123             111            0.05                93
Q9   0.04             118             109            0.04                95
Q10 -0.07             112             129           -0.07                83
Q11  0.12             131             101            0.13                95
Q12  0.07             134             117            0.07               103
Q13  0.16             141             102            0.16                99
Q14  0.10             132             107            0.10               100
Q15  0.00             115             116            0.00                82
Q16 -0.06             110             125           -0.06                87
Q17 -0.02             111             117           -0.03                86
Q18 -0.11              96             122           -0.12                72
Q19 -0.11             114             140           -0.10                82
Q20  0.00             124             124            0.00                94
    Count of neg (Nc) (Pc-Nc)/(Pc+Nc)
Q1                 90           -0.04
Q2                107           -0.14
Q3                101           -0.05
Q4                 93           -0.03
Q5                103           -0.16
Q6                 86            0.09
Q7                 98           -0.04
Q8                 86            0.04
Q9                 82            0.07
Q10               103           -0.11
Q11                84            0.06
Q12                87            0.08
Q13                81            0.10
Q14                83            0.09
Q15                92           -0.06
Q16                95           -0.04
Q17                93           -0.04
Q18                95           -0.14
Q19               102           -0.11
Q20                95           -0.01

Pearson correlations of survey scores
                     mean (Ps-Ns)/(Ps+Ns) (Pc-Nc)/(Pc+Nc)
mean            1.0000000       0.9981090       0.9471695
(Ps-Ns)/(Ps+Ns) 0.9981090       1.0000000       0.9488812
(Pc-Nc)/(Pc+Nc) 0.9471695       0.9488812       1.0000000

Kendall rank correlations of survey scores
                     mean (Ps-Ns)/(Ps+Ns) (Pc-Nc)/(Pc+Nc)
mean            1.0000000       0.9812368       0.8229647
(Ps-Ns)/(Ps+Ns) 0.9812368       1.0000000       0.8207613
(Pc-Nc)/(Pc+Nc) 0.8229647       0.8207613       1.0000000

To compute the SSROC, the R code iterates over all columns of the multivariate dataset and computes the statistics listed in the column names of myresult. It then computes the Pearson and Kendall correlation matrices of the three comparable statistics (columns 1, 4, and 7).
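As a design note, the double loop can also be expressed column-wise with `apply`; the following is a sketch under the same assumptions (numeric scores, a known scale midpoint), not the module's own code:

```r
# Same per-item statistics, computed column-wise with apply()
set.seed(1)
x <- matrix(sample(1:5, 241 * 20, replace = TRUE), nrow = 241)
midpoint <- 3  # midpoint of the 5-point scale

per_item <- function(col) {
  d <- as.numeric(col) - midpoint  # deviations from the midpoint
  d <- d[!is.na(d)]                # drop missing responses
  Ps <- sum(d[d > 0]); Ns <- sum(abs(d[d < 0]))
  Pc <- sum(d > 0);    Nc <- sum(d < 0)
  c(mean = mean(d),
    As   = (Ps - Ns) / (Ps + Ns),
    Ac   = (Pc - Nc) / (Pc + Nc))
}

res <- t(apply(x, 2, per_item))  # one row of statistics per item
head(round(res, 2))
```

The vectorized form makes the centering-and-summing logic explicit in one small function and avoids maintaining running totals by hand.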

84.3 Purpose

The purpose of the SSROC is to determine whether Likert scores of similar items in a survey can be treated as quantitative variables (which allows one to compute the Arithmetic Mean and other quantitative statistics). If this is not the case, the data should be properly categorized and/or treated as purely qualitative data.

84.4 Pros & Cons

84.4.1 Pros

The SSROC has the following advantages:

  • It allows one to assess the appropriateness of quantitative statistics (such as the Arithmetic Mean) for Likert scores of similar items in a survey.
  • It is easy to compute the alternative averages \(A_s\) and \(A_c\).
  • The interpretation of \(A_c\) is straightforward and can be very informative (it can always be applied and does not depend on the neutral scores).

84.4.2 Cons

The SSROC has the following disadvantages:

  • Most readers do not know about the SSROC.
  • Most researchers simply compute quantitative statistics for Likert scores and do not want to be bothered about the validity of doing so.
  • It can only be used if there are sufficiently many similar items to include in the rank order comparison.

84.5 Example

The following analysis shows the three statistics (\(\bar{X}\), \(A_s\), and \(A_c\)) for 10 similar items based on 7-point Likert scores. The statistics \(\bar{X}\) and \(\frac{P_s-N_s}{P_s+N_s}\) are scaled between -3 and +3 (because a 7-point Likert scale is used), whereas \(\frac{P_c-N_c}{P_c+N_c}\) always lies between -1 and +1.

[Interactive Shiny app showing the SSROC statistics for the 10 items]

Observe how, for instance, the first item has a count-based average score of 0.59 (59% of the maximum), while the Arithmetic Mean is only 0.86 (for a maximum of +3). The count-based score not only applies to ordinal as well as quantitative variables; it also ignores the neutral scores of the survey.

84.6 Task

In the previous example, change the scale and examine what happens to the scores. Do you see why it is important to set the correct Likert scale?


  1. In this context, similar questions are defined as questions which attempt to measure the same underlying opinion but with different phrasing. Surveys often contain similar questions in order to improve the validity of the survey.

  2. A Likert scale represents an ordinal measurement. For instance, a 5-point Likert scale assigns the values 1, 2, 3, 4, and 5 to represent the degree to which the respondent agrees with a specified statement (e.g. 1 = totally disagree and 5 = totally agree).


© 2026 Patrick Wessa. Provided as-is, without warranty.
