Modifiable Areal Unit Problem



Part of my spatial data science course

Let’s look at the Kansas City metropolitan area (KCMA), which spans the states of Kansas and Missouri. About 40% of the area’s population lives on the Kansas side and 60% on the Missouri side. The Kansas side accounts for about 30% of the total population of Kansas, while the Missouri side accounts for only 20% of the population of Missouri.

Now imagine a horrible virus hits the country (let’s call it COVID-21). In the KCMA, 70% of the Kansas population is infected, while 35% of the Missouri population is infected. Outside the KCMA, the remaining populations of the two states each have a 5% infection rate.

If we look at what percent of each state’s population is infected, we get the following figures:

|State|Inside KCMA|Outside KCMA|Total % of State Infected|
|Kansas|30% of pop. * 70% rate = 21%|70% * 5% = 3.5%|21% + 3.5% = 24.5%|
|Missouri|20% of pop. * 35% rate = 7%|80% * 5% = 4%|7% + 4% = 11%|
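
The arithmetic in the table above can be reproduced with a short Python sketch. The percentages come straight from the scenario. The variable names and the `statewide_rate` helper are only illustrative assumptions, not part of any library.

```python
# Share of each state's population living inside the KCMA (from the scenario).
share_in_kcma = {"Kansas": 0.30, "Missouri": 0.20}
# Infection rate inside the KCMA, by state (from the scenario).
rate_in_kcma = {"Kansas": 0.70, "Missouri": 0.35}
# Infection rate outside the KCMA, the same for both states (from the scenario).
rate_outside = 0.05


def statewide_rate(state):
    """Hypothetical helper: population-weighted infection rate for a whole state."""
    inside = share_in_kcma[state] * rate_in_kcma[state]
    outside = (1 - share_in_kcma[state]) * rate_outside
    return inside + outside


statewide = {state: statewide_rate(state) for state in share_in_kcma}
for state, rate in statewide.items():
    print(f"{state}: {rate:.1%} of the state infected")
```

Each statewide figure is a weighted average: the KCMA rate weighted by the share of the state living there, plus the outside rate weighted by everyone else.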

Kansas looks to be way worse off than Missouri! We’d better focus our efforts on Kansas. But wait, the KCMA is really the hot zone. Let’s look at the infection rates in the KCMA:

|State|Percent of KCMA Population infected|
|KS|40% of KCMA * 70% infection rate = 28%|
|MO|60% of KCMA * 35% infection rate = 21%|
|Total|28% + 21% = 49%|
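
The same kind of sketch works for the KCMA-level grouping. Again, the numbers are from the scenario and the variable names are assumptions made for illustration.

```python
# Share of the KCMA population on each side of the state line (from the scenario).
kcma_pop_share = {"KS": 0.40, "MO": 0.60}
# Infection rate inside the KCMA, by state (from the scenario).
rate_in_kcma = {"KS": 0.70, "MO": 0.35}

# Each side's contribution to the overall KCMA infection rate.
contribution = {
    state: kcma_pop_share[state] * rate_in_kcma[state]
    for state in kcma_pop_share
}
kcma_rate = sum(contribution.values())

for state, value in contribution.items():
    print(f"{state}: {value:.0%} of the KCMA population")
print(f"Total: {kcma_rate:.0%} of the KCMA population infected")
```

Merging the two sides into one unit replaces the separate 70% and 35% rates with a single blended figure, which is where the distortion comes from.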

Almost half the population of the KCMA is infected! We should definitely focus our efforts on the KCMA. But this single 49% figure overstates the problem in the Missouri part of the KCMA, which only has a 35% infection rate, while drastically understating the 70% infection rate in the Kansas part.

So now we’ve looked at the data through a few different geographical groupings and come away with different conclusions. We would choose a different course of action depending on how we aggregate the data. This is the modifiable areal unit problem in action.