Measurement scales and their use in education


When we conduct a study in the field of education, we encounter variables that cannot be quantified directly. This is the case, for example, of a person’s sex, motivation or attitude towards studies. Depending on the nature of the data, a measurement scale is the instrument that allows us to classify them, order them, and even perform statistical tests for their analysis. How are these scales used?

The first step is to create them, which requires knowing what type of variable we are dealing with. Accordingly, scales are classified as nominal, ordinal, interval or ratio:

  • Nominal scales deal with variables that can only be classified (e.g., sex: male or female).
  • Ordinal scales deal with variables that can not only be classified, but also ordered (e.g., education level: primary school, high school or higher education).
  • Interval scales deal with relative numerical variables, that is, those whose zero point is arbitrary (e.g., a person’s IQ), although they are also used to handle ordinal variables through a rating system (e.g., an opinion rated from 1 to 5).
  • Ratio scales deal with absolute numerical variables, that is, those having an origin for the scale (e.g., age).
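
As a rough illustration of the four scale types listed above, the following Python sketch shows an operation that becomes meaningful at each level. All the sample values and variable names are invented for this example and do not come from any real study.

```python
from collections import Counter
from statistics import mean

# Hypothetical sample data, invented for illustration only.
sex = ["female", "male", "female", "female"]                      # nominal
EDUCATION_ORDER = ["primary school", "high school", "higher education"]
education = ["high school", "primary school",
             "higher education", "high school"]                   # ordinal
iq = [95, 110, 102, 121]                                          # interval
age = [20, 40, 31, 25]                                            # ratio

# Nominal: categories can only be told apart and counted.
sex_counts = Counter(sex)

# Ordinal: categories can also be sorted by their position in the order.
education_sorted = sorted(education, key=EDUCATION_ORDER.index)

# Interval: differences and averages are meaningful, ratios are not.
iq_gap = max(iq) - min(iq)
iq_mean = mean(iq)

# Ratio: the scale has a true zero, so ratios are meaningful as well.
age_ratio = max(age) / min(age)

print(sex_counts, education_sorted, iq_gap, iq_mean, age_ratio)
```

Each level keeps the operations of the previous one: ages can also be counted, sorted and averaged, whereas sex can only be counted.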

In education studies, the highest level of measurement that can usually be reached is the interval scale. This happens, for instance, when we want to know the behavior of a group of subjects or measure their attitude towards a certain situation or object. In these cases, questionnaires composed of different statements or items that the subject must rate are commonly used. The behavior or attitude is treated as a continuum that goes from unfavorable to favorable on a rating scale, normally of the Likert type. The typical response format has five levels:

  1. Strongly disagree
  2. Disagree
  3. Neither agree nor disagree
  4. Agree
  5. Strongly agree
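
A minimal sketch of how such answers might be coded numerically is shown below. The names `LIKERT_LEVELS` and `score_responses` and the sample answers are hypothetical, and adding the item scores into a total is only one common convention, assumed here for illustration.

```python
# The five response levels, in order from least to most favorable.
LIKERT_LEVELS = [
    "Strongly disagree",
    "Disagree",
    "Neither agree nor disagree",
    "Agree",
    "Strongly agree",
]

# Map each label to its position on the scale (1 to 5).
LIKERT_SCORE = {label: i for i, label in enumerate(LIKERT_LEVELS, start=1)}


def score_responses(responses):
    """Convert a list of Likert labels into their numeric codes."""
    return [LIKERT_SCORE[label] for label in responses]


# Hypothetical answers of one subject to a four-item questionnaire.
answers = ["Agree", "Strongly agree", "Neither agree nor disagree", "Agree"]
scores = score_responses(answers)
total = sum(scores)  # assumed convention: total score = sum of item scores

print(scores, total)
```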

The article “Medición de actitudes en estudios sobre educación” shows the process of developing and validating this type of measurement tool. Once the questionnaire has been created, the question is: what should be done with the data?

  • When dealing with categorical variables (on nominal or ordinal scales), usually the only useful descriptive measure is the percentage, since values such as the median or the mode contribute little. For example, to describe the education level of a population according to sex, it is appropriate to build a table that crosses both variables and shows the percentage for every possible combination of categories. As for statistical tests, the applicable ones are the so-called “non-parametric tests”, which are characterized by not assuming any particular statistical distribution for the variables. This is the case, for example, of the Kruskal-Wallis test, a rank-based version of the widely used analysis of variance (ANOVA), which can be applied to ordinal data to analyze possible relationships among variables. In the example considered, it could be used to study whether the education level differs between men and women.
  • When variables are numerical (interval or ratio), it is possible to calculate average values and standard deviations as descriptive measures. For example, one could calculate the average IQ in a population or break it down by age group. In terms of statistical tests, both “non-parametric” and “parametric” tests can be performed. The latter assume that the data follow a given statistical distribution, which makes them more powerful (they are more likely to reject a false null hypothesis), and this is why they are the most frequently used. One of them is the aforementioned ANOVA. The only precaution when applying them is that certain validity conditions must be met (for ANOVA, approximately normally distributed data with similar variances across groups). Using ANOVA, it could be studied, for instance, whether IQ is a characteristic that varies with age.
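
The two cases in the list above can be sketched in Python using only the standard library. This is a simplified illustration under stated assumptions: the data are invented, the helper names (`crosstab_percentages`, `kruskal_wallis_h`, `anova_f`) are hypothetical, and only the test statistics are computed, not their p-values.

```python
from collections import Counter
from statistics import mean, stdev


def crosstab_percentages(pairs):
    """Percentage of cases in each (row, column) combination."""
    counts = Counter(pairs)
    return {combo: 100 * n / len(pairs) for combo, n in counts.items()}


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic, using average ranks and a tie correction."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ties = Counter(pooled)
    # Tied values share the mean of the rank positions they occupy.
    rank_of, position = {}, 0
    for value in sorted(ties):
        rank_of[value] = position + (ties[value] + 1) / 2
        position += ties[value]
    rank_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)
    correction = 1 - sum(t**3 - t for t in ties.values()) / (n**3 - n)
    return h / correction  # undefined if every value is identical


def anova_f(groups):
    """One-way ANOVA F: between-group over within-group mean square."""
    pooled = [v for g in groups for v in g]
    grand, k, n = mean(pooled), len(groups), len(pooled)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical categorical data: (sex, education level) for eight people.
LEVEL_CODE = {"primary school": 1, "high school": 2, "higher education": 3}
sample = [
    ("male", "primary school"), ("male", "high school"),
    ("male", "high school"), ("male", "higher education"),
    ("female", "high school"), ("female", "higher education"),
    ("female", "higher education"), ("female", "higher education"),
]
percentages = crosstab_percentages(sample)
men = [LEVEL_CODE[level] for sex, level in sample if sex == "male"]
women = [LEVEL_CODE[level] for sex, level in sample if sex == "female"]
h_statistic = kruskal_wallis_h([men, women])

# Hypothetical numerical data: IQ scores broken down by age group.
iq_by_age = {"18-29": [98, 104, 110], "30-49": [101, 107, 113], "50+": [95, 100, 105]}
summary = {group: (mean(v), stdev(v)) for group, v in iq_by_age.items()}
f_statistic = anova_f(list(iq_by_age.values()))

print(percentages, h_statistic, summary, f_statistic)
```

To decide whether a difference is significant, each statistic still has to be compared with its reference distribution. In practice a statistics library would normally be used for this; SciPy, for instance, provides `scipy.stats.kruskal` and `scipy.stats.f_oneway`, which return the p-value as well.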

Although it seems clear which descriptive measures and statistical tests can be used with each type of variable, the Likert-type scales mentioned above are a special case. Indeed, on this scale an ordinal variable becomes numerical by means of an “artificial” rating system. Whether parametric statistics can be used on Likert scales is an issue that reappears every so often in research journals and is the subject of much controversy.

Many assert that the intervals obtained with this scale cannot be presumed equal and that, therefore, the variables are ordinal and parametric tests have no place. It is also argued that, even assuming the intervals were equal, the data obtained have probability distributions that do not meet the applicability conditions of parametric tests. However, other researchers have applied both types of tests to data from Likert-type scales and compared the results, concluding that parametric tests remain applicable even when their validity assumptions are not met.
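
One way to picture the first objection is the following sketch, built on invented responses and a hypothetical alternative coding (`stretched_top`). It recodes the same answers with two numberings that both respect the order of the five levels, and compares means and medians under each.

```python
from statistics import mean, median

# Hypothetical Likert responses (1 = strongly disagree ... 5 = strongly agree).
group_a = [2, 4, 4, 4]
group_b = [1, 3, 3, 5]

# Two codings that preserve the order of the five levels.
equal_spacing = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5}
# Assumption for illustration: "strongly agree" lies far above "agree".
stretched_top = {1: 1, 2: 2, 3: 3, 4: 4, 5: 9}


def recode(responses, coding):
    """Replace each response by the number its level receives in a coding."""
    return [coding[r] for r in responses]


for name, coding in [("equal spacing", equal_spacing),
                     ("stretched top", stretched_top)]:
    a, b = recode(group_a, coding), recode(group_b, coding)
    print(name, "| means:", mean(a), mean(b), "| medians:", median(a), median(b))
```

With these invented numbers, group A has the higher mean under equal spacing and the lower mean under the stretched coding, while the medians keep the same order in both cases. Comparisons based on means therefore depend on the assumed spacing between levels, whereas those based on order do not.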

Some defend parametric tests over non-parametric ones because the latter tend to lose information and are less powerful, requiring stronger evidence to draw conclusions. In fields such as health sciences or education, the controversy has even reached the editorial boards of some scientific journals. Some studies put at 75% the share of educational research works that could have been rejected for publication on the grounds of having used parametric methods with Likert scales.

Beyond the controversy, we hope that this brief article has shown the measurement tools that researchers in the field of education have at their disposal. Knowing them is a first step towards carrying out quality studies that are fit for publication.

Laura Moreno-Delgado

Engineer, educator, and personal counselor, as well as a passionate yoga instructor.

Manuel Solaguren-Beascoa

Doctor of Engineering, professor and researcher at the University of Burgos, Spain. Manuel currently works with Laura on a project to improve tutoring in engineering programs.