53 Types of Data

53.1 Qualitative Data

This type of data categorizes or describes observable phenomena and is represented by numbers or text labels. When Qualitative Data are represented by numbers, those numbers serve only as labels and cannot be used in arithmetic calculations. For instance, if we assign students to group 1, 2, or 3, it does not make sense to compute the mean of the group numbers. The only calculation that can be performed with Qualitative Data is counting (i.e., we can count the number of students in each group).
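The point about group numbers can be illustrated with a minimal Python sketch. The group assignments and the alternative codes in `relabel` are made up for illustration; only the reasoning (counts are meaningful, means of labels are not) comes from the text.

```python
from collections import Counter

# Hypothetical group assignments for nine students; 1, 2, 3 are arbitrary labels.
groups = [1, 3, 2, 1, 1, 3, 2, 1, 3]

# Counting is meaningful: how many students are in each group?
counts = Counter(groups)
print(sorted(counts.items()))  # [(1, 4), (2, 2), (3, 3)]

# Relabel the same groups with different (equally arbitrary) codes.
relabel = {1: 10, 2: 20, 3: 5}
recoded = [relabel[g] for g in groups]

# The group sizes survive the relabeling, but the "mean" changes completely,
# which shows that it never described the students in the first place.
mean_original = sum(groups) / len(groups)    # 17 / 9
mean_recoded = sum(recoded) / len(recoded)   # 95 / 9
print(mean_original, mean_recoded)
```

Because the group sizes are the same under both codings while the two means differ, only the counts carry information about the data.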

There are many situations where Qualitative Data might be of interest. These are a few examples:

gender
hair color (blond, brown, black, grey, etc.)
favorite browser (Firefox, Chrome, Safari, Internet Explorer, etc.)

53.2 Quantitative Data

This type of data is used when a property of some observable item is measured on a numerical scale. Quantitative Data are always expressed as numbers and can be used in arithmetic calculations. Here are a few examples:

height of a person
exam score
time spent to perform a task

Quantitative Data can be subdivided into two types: Discrete Quantitative Data and Continuous Quantitative Data. Discrete data take values from a countable set (for example, the number of errors made during a task), whereas continuous data can in principle take any value within an interval (for example, height). The difference between these types is not always clear because in practice it is (almost) never the case that a property can be measured on a scale that is truly continuous. When we talk about Continuous Quantitative Data we often mean measurements that are “treated as” being measured on a continuous scale.

53.3 Scales of Measurement

In statistics education, it is common to introduce the so-called Stevens’ scales of measurement (Stevens 1946), which distinguish four levels:

Nominal: categories with no inherent order (e.g., hair color, gender, browser preference)
Ordinal: ordered categories where the intervals between values are not necessarily equal (e.g., rankings, education level, survey responses)
Interval: numerical scale with equal intervals but no true zero point (e.g., temperature in Celsius, calendar years)
Ratio: numerical scale with equal intervals and a true zero point (e.g., height, weight, duration)
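The difference between the interval and ratio levels can be made concrete with a small Python sketch. The temperatures and heights are made-up values, and the helper function names are hypothetical; the conversions themselves are the standard ones.

```python
def celsius_to_fahrenheit(c):
    """Convert Celsius to Fahrenheit (another interval scale)."""
    return c * 9.0 / 5.0 + 32.0

def celsius_to_kelvin(c):
    """Convert Celsius to Kelvin (a scale with a true zero point)."""
    return c + 273.15

# Interval scale: 20 °C versus 10 °C.
c_low, c_high = 10.0, 20.0
ratio_celsius = c_high / c_low  # 2.0, an artifact of where 0 °C happens to sit
ratio_fahrenheit = celsius_to_fahrenheit(c_high) / celsius_to_fahrenheit(c_low)  # 68 / 50
ratio_kelvin = celsius_to_kelvin(c_high) / celsius_to_kelvin(c_low)  # 293.15 / 283.15

# Ratio scale: 180 cm versus 90 cm, expressed in centimeters and in inches.
h_low_cm, h_high_cm = 90.0, 180.0
ratio_height_cm = h_high_cm / h_low_cm
ratio_height_in = (h_high_cm / 2.54) / (h_low_cm / 2.54)

print(ratio_celsius, ratio_fahrenheit, ratio_kelvin)
print(ratio_height_cm, ratio_height_in)
```

The temperature ratio changes with the unit, so "twice as warm" is not a meaningful statement in Celsius. The height ratio is the same in any unit, which is what the true zero point of a ratio scale provides.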

While this taxonomy is a useful starting point, it should not be applied rigidly. The theoretical classification of a scale does not determine which statistical methods are appropriate; the actual distribution of the observed data does.

53.3.1 Example: Likert Scales

In social research, surveys are often scored on a so-called Likert scale (Likert 1932). This is a numerical scale that typically expresses the degree to which a respondent agrees with a statement. For instance, a statement such as “The software is easy to use” could be scored on a 5-point Likert scale with the following answers: “completely agree”, “agree”, “neither agree nor disagree”, “disagree”, “completely disagree”.

If we assign a number to each answer (“completely agree” = 5, “agree” = 4, …, “completely disagree” = 1) then we obtain observations that look like Quantitative Data. According to Stevens’ taxonomy, these would be classified as ordinal data because we cannot assume that the differences between adjacent scores are equal in size.
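As a minimal sketch, the coding described above could be written in Python as follows. The dictionary name `LIKERT_SCORES` and the list of answers are hypothetical and serve only as an illustration.

```python
# Scores follow the coding in the text: 5 = "completely agree", ..., 1 = "completely disagree".
LIKERT_SCORES = {
    "completely agree": 5,
    "agree": 4,
    "neither agree nor disagree": 3,
    "disagree": 2,
    "completely disagree": 1,
}

# Hypothetical answers to the statement "The software is easy to use".
answers = [
    "agree",
    "completely agree",
    "disagree",
    "agree",
    "neither agree nor disagree",
]

scores = [LIKERT_SCORES[a] for a in answers]
print(scores)  # [4, 5, 2, 4, 3]

# The order of the codes is meaningful, so sorting and comparing scores is safe.
# Whether the distance from 2 to 3 equals the distance from 4 to 5 is not guaranteed.
ranked = sorted(scores)
print(ranked)  # [2, 3, 4, 4, 5]
```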

However, this classification alone does not tell us which methods to use. Consider a dataset where all respondents answered either 1 or 5, with no values in between. Although the scale is theoretically a 5-point ordinal scale, the observed data is de facto binary. In this case, methods for binary or dichotomous data would be more appropriate than methods designed for ordinal scales.

The practical lesson is this: always examine the empirical distribution of your data before selecting an analysis method. The theoretical scale provides a starting point, but the actual variability and distribution of the observations should guide the choice of statistical procedures.
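One simple way to examine the empirical distribution is to tabulate how often each scale point was actually used. The sketch below assumes responses are already coded 1 to 5; the function name `describe_responses` and the example responses are hypothetical.

```python
from collections import Counter

def describe_responses(scores, scale_points=(1, 2, 3, 4, 5)):
    """Tabulate observed scores, including scale points that were never used."""
    counts = Counter(scores)
    table = {point: counts.get(point, 0) for point in scale_points}
    observed = [point for point, n in table.items() if n > 0]
    return table, observed

# Hypothetical survey in which every respondent answered either 1 or 5.
responses = [1, 5, 5, 1, 1, 5, 5, 5, 1, 5]

table, observed = describe_responses(responses)
print(table)     # {1: 4, 2: 0, 3: 0, 4: 0, 5: 6}
print(observed)  # [1, 5]

# Only two of the five scale points occur, so the data are de facto binary.
is_de_facto_binary = len(observed) == 2
print(is_de_facto_binary)  # True
```

A table like this is only a first check. It shows which values occur and how often, which is the information needed before deciding whether ordinal, binary, or other methods fit the data.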