A fundamental concept in statistics is the level of measurement of variables.

It's important, and it's normally taught in the first week of every intro stats class.

But even something so fundamental can be tricky when you start working with real data. The same variable can be considered to have different levels of measurement in different situations. It sounded like an absolute in the intro stats class because your professor didn't want to confuse beginning students.

But now that you're a more sophisticated practitioner of data analysis, I will show you how the same variable can be considered to have different levels of measurement. But first, let me review some definitions.

### A Review of the Levels of Measurement of Variables

Nominal:

Unordered categorical variables. These can be either binary (only two categories, like gender: male or female) or multinomial (more than two categories, like marital status: married, divorced, never married, widowed, separated). The key thing here is that there is no reasonable order to the categories.

Ordinal:

Ordered categories. Still categorical, but in an order. Likert items with responses like “Never, Sometimes, Often, Always” are ordinal.

Interval:

Numerical values without a true zero point. The idea here is that intervals between the values are equal and meaningful, but the numbers themselves are arbitrary. 0 does not indicate a complete lack of the quantity being measured. IQ and degrees Celsius or Fahrenheit are both interval.

Ratio:

Numerical values with a true zero point.
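One way to see the practical difference between interval and ratio scales is to check which comparisons survive a change of units. The short Python sketch below is only an illustration: the function name and the example temperatures are chosen for demonstration and do not come from any study described here. It converts Celsius to Fahrenheit and compares ratios of values with ratios of differences.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


# Illustrative temperatures in Celsius (arbitrary values for demonstration).
cold_c, mild_c, warm_c = 10.0, 20.0, 30.0
cold_f, mild_f, warm_f = (
    celsius_to_fahrenheit(t) for t in (cold_c, mild_c, warm_c)
)

# Ratio of two values: depends on where the arbitrary zero sits,
# so it changes when the unit changes.
ratio_celsius = mild_c / cold_c
ratio_fahrenheit = mild_f / cold_f

# Ratio of two differences: the zero point cancels out,
# so both units give the same answer.
diff_ratio_celsius = (warm_c - mild_c) / (mild_c - cold_c)
diff_ratio_fahrenheit = (warm_f - mild_f) / (mild_f - cold_f)

print("value ratios:", ratio_celsius, ratio_fahrenheit)
print("difference ratios:", diff_ratio_celsius, diff_ratio_fahrenheit)
```

On an interval scale, "20 degrees is twice as warm as 10 degrees" is not a meaningful statement, while "the gap from 20 to 30 equals the gap from 10 to 20" is. With a true zero point, as in a ratio variable, both kinds of statement are meaningful.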

Interval and Ratio variables can be further split into two types: discrete and continuous. Discrete variables, like counts, can only take on whole numbers: number of children in a family, number of days missed from work. Continuous variables can take on any number, even beyond the decimal point.

Not always obvious is that these levels of measurement are not only about the variable itself. Also important are the meaning of the variable within the research context and how it was measured.

### An Example: Age

A good example of this is a variable like age. Age is, technically, continuous and ratio. A person's age does, after all, have a meaningful zero point (birth) and is continuous if you measure it precisely enough. It is meaningful to say that someone (or something) is 7.28 years old.

That said, you might not be able to treat it as continuous in your analysis. It depends on how you measure it and whether there are qualitative implications about age in your study context. Here are 5 examples in which age has another level of measurement:

Age as Ordinal

For example, it's not unusual to give people age categories as possible responses on a survey. Common reasons are that people don't want to reveal their actual age or because they don't remember the actual age at which some event occurred.

I worked with a client whose dependent variable was the age at which adult smokers started smoking. It would have been nice to get an accurate date on which each person smoked their first cigarette, but it's a big burden on respondents to ask them for a very specific number from a long time ago.

Rather than have respondents guess inaccurately or leave the answer blank, the researchers gave them a set of ordered age categories: 0 to 10, 11-12, 13-15, 16-17, etc. They gave up precision to gain accuracy.
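To make the idea of ordered age categories concrete, here is a minimal Python sketch of how an exact age could be mapped to one of those brackets. It is an illustration, not the researchers' actual coding scheme. The first four brackets are the ones listed above; the final open-ended bracket is a hypothetical placeholder, because the remaining categories are not given. The function name `age_bracket` is also made up for this example, and ages are assumed to be recorded in completed years.

```python
from bisect import bisect_left

# Inclusive upper bounds of the four brackets named in the text.
UPPER_BOUNDS = [10, 12, 15, 17]

# Labels in order. The last one is a HYPOTHETICAL catch-all, since the
# survey's remaining categories are not listed in the article.
LABELS = ["0 to 10", "11-12", "13-15", "16-17", "18 or older (hypothetical)"]


def age_bracket(age_years):
    """Return (rank, label) of the ordered bracket containing the age.

    Ages are truncated to completed years, so 10.9 counts as 10.
    """
    whole_years = int(age_years)
    if whole_years < 0:
        raise ValueError("age cannot be negative")
    rank = bisect_left(UPPER_BOUNDS, whole_years)
    return rank, LABELS[rank]


# The rank preserves the order of the categories but not their spacing:
# the brackets cover different numbers of years.
for age in (9, 11, 15, 16.5):
    print(age, age_bracket(age))
```

The integer rank is what makes the variable ordinal: it says which bracket comes before which, but the distance between rank 0 and rank 1 is not the same as the distance between rank 1 and rank 2.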

Ordinal response variables require a model like an Ordinal Logistic Regression.

Age as Discrete Counts

Likewise, a continuous variable may be considered discrete because of the way people think about and measure it.

For example, consider the age, measured in days, at which germinated seeds of a certain species begin to sprout leaves. Many will do so within a few days, and it may range from 2-9 days.

In this context, age is certainly a discrete count: the number of days. If it is used as an outcome variable, a Poisson (or related) regression would be appropriate, not a linear model.

Age as Multinomial

Sometimes numerical variables are rendered categorical because of the lack of values.

In one study I analyzed, the key independent variable was the age of a juror in a trial. While technically ages are continuous, in this study there were only 4 values: 49, 69, 79 and 89.

So even though one could use statistics that treated this variable as continuous, they don't make a lot of sense. In a linear model, if you treat this age variable as a numeric predictor, the model will fit a regression line across these four ages. If you treat it as categorical, it will estimate means and allow you to compare the average of Y at each age.

The effect of age in this context is far better measured through a difference in the mean of Y at any two different ages than with a slope, which is the difference in Y for each one-year increase.

Now if this multinomial age variable is the response, you'll need a multinomial logistic regression.
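The contrast between the two treatments can be sketched in a few lines of Python. Everything numeric here is invented for illustration: the outcome values are made up and are not from the juror study, and the helper names `group_means` and `least_squares_line` are hypothetical. Only the four age values come from the example above. The sketch computes the mean of Y at each age (the categorical view) and a single least-squares slope (the numeric view).

```python
from collections import defaultdict
from statistics import mean

# The four ages come from the example; the outcome values are INVENTED
# purely to illustrate the two summaries.
ages = [49, 49, 69, 69, 79, 79, 89, 89]
outcome = [3.0, 5.0, 8.0, 10.0, 4.0, 6.0, 9.0, 11.0]


def group_means(groups, values):
    """Categorical view: mean of the outcome within each distinct age."""
    buckets = defaultdict(list)
    for group, value in zip(groups, values):
        buckets[group].append(value)
    return {group: mean(vals) for group, vals in sorted(buckets.items())}


def least_squares_line(x, y):
    """Numeric view: slope and intercept of the ordinary least-squares line."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


means_by_age = group_means(ages, outcome)
slope, intercept = least_squares_line(ages, outcome)

print("mean of Y at each age:", means_by_age)
print("difference in means, 89 vs 79:", means_by_age[89] - means_by_age[79])
print("single slope per year of age:", slope)
```

With only four distinct ages, the group means describe everything the data can say about age. The slope compresses those four means into one number and imposes a straight-line pattern that the made-up data here does not follow.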

Age as Binary Categories

In a similar example, a researcher was examining math abilities in first grade children. The key independent variable was whether the child had reached a specific cognitive developmental milestone, and the dependent variable was math score. Age was a control variable, and it was mildly associated with, but not confounded with, attainment of the milestone.

Because each child was asked how old they were, it was measured in whole years. It would have been ideal to collect more specific data on ages—such as their birth dates from their parents or school records. For whatever reason, that wasn't possible.

So the only two values for age were 6 and 7. Just as in the last example, it only made sense to treat this predictor variable as categorical in the analysis.

If you had a binary outcome variable, you’d most likely need a binary logistic regression.

Age as Binary Categories (another one)

In a study comparing the work-life balance of men and women, the outcome variable was the number of hours worked per week. One important predictor for women, but not men, was the age of their youngest child.

There is a qualitative difference between a 5 year old, who may only be eligible for part-time kindergarten, and a 6 year old, who is old enough to go to full-time school.

A qualitative difference exists in this context between 5 and 6 that doesn't exist at other one-year age differences*. This qualitative distinction is in fact the most important feature of the youngest child's age. Treating age as continuous actually ignores this important qualitative difference.

Notice that both of these binary examples are a very different situation from doing a median split on a continuous variable.

That kind of categorizing isn't a good idea because you're throwing away good information based on an arbitrary cutoff.
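A small Python sketch can show why the two kinds of cutoff differ. The sample ages below are invented for illustration, and the helper names are hypothetical. The school-age cutoff of 6 comes from the work-life balance example; the median split takes its cutoff from whatever sample happens to be in hand.

```python
from statistics import median

# Cutoff with a substantive meaning in the work-life balance example:
# a 6 year old is old enough for full-time school.
SCHOOL_AGE = 6


def school_age_indicator(ages):
    """1 if the youngest child is old enough for full-time school, else 0."""
    return [int(age >= SCHOOL_AGE) for age in ages]


def median_split(values):
    """Return the sample median and a 0/1 flag for values above it."""
    cutoff = median(values)
    return cutoff, [int(value > cutoff) for value in values]


# Two INVENTED samples of youngest-child ages, for illustration only.
sample_a = [1, 2, 3, 5, 6, 8, 12]
sample_b = [4, 5, 6, 7, 9, 11, 14]

for sample in (sample_a, sample_b):
    cutoff, split = median_split(sample)
    print("ages:", sample)
    print("  school-age flag (cutoff 6):", school_age_indicator(sample))
    print("  median split (cutoff %s):" % cutoff, split)
```

The school-age flag puts every 6 year old in the same group in both samples. The median cutoff moves with the sample, so the same child can land in different groups depending on who else was surveyed, which is what makes that cutoff arbitrary.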

*It also doesn't exist in other contexts. The difference between ages 5 and 6 wouldn't be important if you're studying drug use or retirement planning.