Here’s something that crops up all the time in indicators, but is discussed less than you might think. It’s also more complicated than it looks.

Denomination, despite the way that it sounds, is not about practices found in the murkier corners of the internet. It refers to the operation of dividing one indicator by another. But why should we do that?


Size matters

As anyone who has ever looked at a map will know, countries come in all different sizes. The problem is that many indicators, on their own, are strongly linked to the size of the country. That means that if we compare countries using these indicators directly, we will often get a ranking that has roughly the biggest countries at the top, and the smallest countries at the bottom.

To take an example that is almost absurd, consider employment. If we want to compare the employment between different countries, one possibility would be to measure the number of people in employment. I downloaded the OECD employment data for OECD countries, for Q4 2019 (the latest quarter for which all countries had reported data). Below is a sorted bar chart simply showing the number of employed people for each country (using Plotly).

[Figure: Total employment by country. Sorted bar chart; axis: people employed (thousands).]

All the top countries, by this metric, are simply the ones with the most people: the USA, Russia and Japan. Now, let’s denominate this indicator by dividing it by the working-age population. Here’s what we get:

[Figure: Employment rate by country. Sorted bar chart; axis: % of working-age population in employment.]

Now the picture is completely different. The highest employment rate, i.e. employed people per unit population, is in the Netherlands, followed by Japan and New Zealand. What we are looking at is essentially a measure of employment per capita. This measure is now (mostly) independent of the size of the country.

The rankings here are completely different because the meanings of these two measures are completely different. Denomination is in fact a nonlinear transformation, because every value is divided by a different value (each country is divided by its own unique population value). That doesn’t mean that denominated indicators are suddenly more “right” than they were before denomination, however. While employment rate is probably more often compared across countries, the total number of employed people might also be interesting in terms of measuring total labour force capacity and in absolute comparisons between countries. It is also of course a useful statistic to the country for internal matters.
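To make the rank change concrete, here is a minimal sketch in Python. The three countries and their figures are made up purely for illustration (they are not the OECD data used in the charts), and the function name employment_rate is just a label chosen for this example.

```python
# Illustrative, made-up figures in thousands of people (NOT the OECD data).
countries = {
    "A": {"employed": 150_000, "working_age": 210_000},
    "B": {"employed": 9_000, "working_age": 11_000},
    "C": {"employed": 40_000, "working_age": 65_000},
}


def employment_rate(employed, working_age):
    """Employed people as a percentage of the working-age population."""
    return 100.0 * employed / working_age


# Rank by the raw count (size-dependent), largest first.
rank_by_total = sorted(
    countries, key=lambda c: countries[c]["employed"], reverse=True
)

# Rank by the denominated rate, largest first.
# Each country is divided by its OWN denominator, which is why the
# ordering is free to change.
rates = {name: employment_rate(**values) for name, values in countries.items()}
rank_by_rate = sorted(rates, key=rates.get, reverse=True)

print("By total employment:", rank_by_total)
print("By employment rate: ", rank_by_rate)
```

In this toy example the smallest country, B, comes last by total employment but first by employment rate.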

What is important is that the indicator is suited to its purpose and context. More often than not, in international indicators and scoreboards, the most suitable indicators are those that are denominated, so as to make meaningful comparisons between countries of different sizes.


Intensive vs. extensive

More precisely, indicators can be thought of as either intensive or extensive variables. Intensive variables are not (or only weakly) related to the size of the country, and allow “fair” comparisons between countries of different sizes. Extensive variables, on the contrary, are strongly related to the size of the country.

This distinction is well known in physics, for example. Mass is related to the size of the object and is an extensive variable. If we take a block of steel, and double its size (volume), we also double its mass. Density, which is mass per unit volume, is an intensive quantity: if we double the size of the block, the density remains the same.

  • An example of an extensive variable is population. Bigger countries tend to have bigger populations.
  • An example of an intensive variable is population density. This is no longer dependent on the physical size of the country.

The summary here is that an extensive variable becomes an intensive variable when we divide it by a suitable denominator, i.e. one that itself scales with the size of the country.
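The steel block example can be sketched in a few lines of Python. The density value is an approximate figure for carbon steel and the block volume is arbitrary; both are assumptions made for illustration only.

```python
import math

# Approximate density of carbon steel in kg per cubic metre (assumed value).
STEEL_DENSITY = 7850.0


def mass(volume_m3, density=STEEL_DENSITY):
    """Mass in kg of a solid block with the given volume."""
    return density * volume_m3


volume = 0.5           # cubic metres, arbitrary starting size
doubled = 2 * volume   # double the size of the block

mass_small = mass(volume)
mass_large = mass(doubled)

# Extensive: mass scales with the size of the block.
print("mass ratio:", mass_large / mass_small)

# Intensive: mass per unit volume does not change with size.
density_small = mass_small / volume
density_large = mass_large / doubled
print("densities equal:", math.isclose(density_small, density_large))
```

Swap "block" for "country", "mass" for "population" and "volume" for "land area", and the same logic gives population density.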


Simple?

So far, this is fairly intuitive. It is obvious when to use population density rather than population, and when to use employment rate rather than the total number of employed people.

Some cases are more tricky. Take innovation. One typical indicator of innovation is the number of academic articles published in (top) journals. But obviously, this is an extensive indicator: bigger countries will be able to publish more papers. So, what to do?

Well, to publish papers you need at least (a) people to write the papers, and (b) money to fund the research. This suggests that publications are linked to both population and the economic size of the country, i.e. GDP. So which one is the best to denominate by?

As with most things, it depends, and the meaning will change depending on which denominator you use, so that should be very carefully considered. A statistical way of looking at it, however, is through correlation.

Let’s take some data from our ASEM Sustainable Connectivity Portal. We have international publication data for 51 ASEM countries, as well as population and GDP values, and a simple measure of GDP per capita obtained by dividing GDP by population. Here’s what it looks like (using the reactable package):

Country          Int_Publications  GDP          Population  GDPpc
Austria          14432             390.7999911  8712.137    44.85696117
Belgium          20176             467.9552674  11358.379   41.19912422
Bulgaria         1854              53.23964349  7131.494    7.465426387
Croatia          2342              51.23100261  4213.265    12.15945416
Cyprus           1337              20.04623431  1170.125    17.13170329
Czech Republic   9190              195.3050849  10610.95    18.40599427
Denmark          15986             306.8996534  5711.87     53.73015377
Estonia          1594              23.33792056  1312.442    17.78205861
Finland          10303             238.5029003  5503.132    43.33948383
France           56677             2465.453975  64720.69    38.09375294

(Rows 1–10 of 51 shown.)
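As a quick check of how the GDPpc column relates to the other two, here is a small Python sketch using three of the rows shown. The scale factor of 1000 is not stated anywhere in the text; it is inferred from the values themselves, so treat it as an assumption about the units.

```python
# Three rows from the table: (country, GDP, Population, GDPpc as published).
rows = [
    ("Austria", 390.7999911, 8712.137, 44.85696117),
    ("Cyprus", 20.04623431, 1170.125, 17.13170329),
    ("France", 2465.453975, 64720.69, 38.09375294),
]

# Assumed unit conversion, inferred from the published values.
SCALE = 1000.0


def gdp_per_capita(gdp, population, scale=SCALE):
    """Denominate GDP by population, rescaled to match the published column."""
    return scale * gdp / population


recomputed = {name: gdp_per_capita(gdp, pop) for name, gdp, pop, _ in rows}

for name, _, _, published in rows:
    print(f"{name:10s} recomputed={recomputed[name]:.4f} published={published:.4f}")
```

GDP per capita is itself a denominated indicator: GDP (extensive) divided by population (extensive) gives a quantity that is largely independent of country size.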

We can check and visualise the correlations between these different variables using the GGally package:

Now we can see that international publications are most strongly linked to GDP, with a highly significant correlation of 0.77. This is no proof of causality (we can’t conclude that more GDP causes more publications), but the two clearly have a strong link. Common sense also tells us that money is handy in generating more research. This suggests that if we want to remove the “size effect” in this case, we should consider denominating by GDP rather than population.
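The same kind of check can be sketched in plain Python. This uses only the ten rows visible in the table, not all 51 countries, so the correlations it prints will not match the 0.77 quoted above. The pearson helper is written out by hand for this example rather than taken from any package.

```python
from math import sqrt

# The ten rows shown in the table: (country, Int_Publications, GDP, Population).
data = [
    ("Austria", 14432, 390.7999911, 8712.137),
    ("Belgium", 20176, 467.9552674, 11358.379),
    ("Bulgaria", 1854, 53.23964349, 7131.494),
    ("Croatia", 2342, 51.23100261, 4213.265),
    ("Cyprus", 1337, 20.04623431, 1170.125),
    ("Czech Republic", 9190, 195.3050849, 10610.95),
    ("Denmark", 15986, 306.8996534, 5711.87),
    ("Estonia", 1594, 23.33792056, 1312.442),
    ("Finland", 10303, 238.5029003, 5503.132),
    ("France", 56677, 2465.453975, 64720.69),
]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


pubs = [row[1] for row in data]
gdp = [row[2] for row in data]
pop = [row[3] for row in data]

# How strongly is the raw indicator tied to each candidate denominator?
r_gdp = pearson(pubs, gdp)
r_pop = pearson(pubs, pop)
print(f"publications vs GDP:        {r_gdp:.2f}")
print(f"publications vs population: {r_pop:.2f}")

# Denominating by GDP: publications per unit of GDP.
pubs_per_gdp = {name: p / g for name, p, g, _ in data}
for name, value in sorted(pubs_per_gdp.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:15s} {value:.1f}")
```

With so few rows, and one very large country among them, the coefficients are dominated by that single point, so this is only a demonstration of the mechanics and not a basis for choosing a denominator.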


Concluding

Denomination is a simple operation but it needs to be handled carefully. You need to understand what the final meaning of the indicator actually is, and ensure that it fits with what you are trying to measure. It’s surprisingly easy to slip up. Correlations can help to identify candidate denominators. The most common denominators include population, GDP and land area, but there are potentially many others.

So, the big question: does size really matter? Well, it depends: for extensive variables it does, and for intensive variables it doesn’t.