ELI5: What is Simpson’s paradox in statistics?

Can someone explain its significance and maybe a simple example as well?

I’m going to steal [Skafi’s example](https://www.reddit.com/r/nba/comments/4p63ku/ysk_simpsons_paradox_and_basketball/):

Before getting to Simpson’s paradox, I’m going to define some basketball terms for anyone who is not familiar. In basketball, there are two types of field goal attempts: 2-pointers and 3-pointers. You can calculate their percentages individually or together as an overall field goal percentage. For example, let’s say that a player attempted 40 2-point field goals, making 30 of them, and attempted 10 3-point field goals, making 3 of them.

Her 2-point% is 30/40 = 75%.

Her 3-point% is 3/10 = 30%.

You can also look at overall field goal % by treating both types of shots the same and disregarding whether they were 2-point or 3-point attempts.

She attempted 50 total field goals (40 2-point + 10 3-point) and made a total of 33 (30 2-point + 3 3-point).

Her overall field goal % is then 33/50 = 66%.
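The same arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name `shooting_pct` and the variable names are made up here, and the numbers are the ones from the example above.

```python
def shooting_pct(made, attempted):
    """Return shots made out of shots attempted, as a percentage."""
    return 100 * made / attempted

# Shot counts from the example player above.
two_made, two_att = 30, 40
three_made, three_att = 3, 10

two_pct = shooting_pct(two_made, two_att)
three_pct = shooting_pct(three_made, three_att)

# Overall FG% pools all attempts together, ignoring the shot type.
overall_pct = shooting_pct(two_made + three_made, two_att + three_att)

print(two_pct, three_pct, overall_pct)  # 75.0 30.0 66.0
```

Note that the overall 66% is not the plain average of 75% and 30% (that would be 52.5%). It sits closer to 75% because most of her attempts were 2-pointers.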

An example of Simpson’s Paradox is the following. Say that you are told the 2-point% and 3-point% for two different players:

Player | 2-Point% | 3-Point% |
---|---|---|
Larry Bird | 50.9% | 37.6% |
Reggie Miller | 51.6% | 39.5% |

Reggie Miller’s percentage is higher than Larry Bird’s in both categories. The natural assumption is that Reggie Miller’s combined field goal % would be higher as well, because his percentage is higher in both components of field goal %.

However, here are the actual values:

Player | 2-Point% | 3-Point% | Overall FG% |
---|---|---|---|
Larry Bird | 50.9% | 37.6% | 49.6% |
Reggie Miller | 51.6% | 39.5% | 47.1% |

How can Larry Bird have a higher overall field goal % when he had a lower percentage for every component of the calculation? It’s because the table leaves out another factor: how often each player took each type of shot.

37% of Reggie Miller’s career field goal attempts were 3-pointers, while only 10% of Larry Bird’s were. Because 3-point attempts have a lower chance of success, Reggie’s larger share of 3-pointers pulled his overall field goal % further below his 2-point % than Larry’s smaller share pulled his.

The specific overall field goal% calculations:

Reggie Miller: 51.6% × 63% + 39.5% × 37% = 47.1%

Larry Bird: 50.9% × 90% + 37.6% × 10% = 49.6%
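For anyone who wants to check these numbers, here is a minimal Python sketch of the weighted average. The function name `overall_fg_pct` and the `players` dictionary are made up for this illustration; the percentages and shot shares are the ones quoted above.

```python
def overall_fg_pct(two_pct, three_pct, three_share):
    """Weighted average of 2-point% and 3-point%.

    three_share is the fraction of all attempts that were 3-pointers.
    """
    return two_pct * (1 - three_share) + three_pct * three_share

# (2-point%, 3-point%, share of attempts that were 3-pointers)
players = {
    "Larry Bird": (50.9, 37.6, 0.10),
    "Reggie Miller": (51.6, 39.5, 0.37),
}

overall = {name: overall_fg_pct(*stats) for name, stats in players.items()}

for name, pct in overall.items():
    print(f"{name}: {pct:.1f}%")

# Hypothetical: Miller's percentages combined with Bird's shot mix.
miller_with_bird_mix = overall_fg_pct(51.6, 39.5, 0.10)
```

The last line is a what-if, not a real stat: with Bird’s 10% share of 3-pointers, Miller’s percentages would combine to about 50.4%, which is above Bird’s 49.6%. The reversal comes entirely from the different shot mix.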

Again, you can see that Reggie’s overall field goal % was influenced much more by the lower-percentage 3-pointers than Larry’s was.
