Say I get the idea in my head that my local casino’s dice are biased. The 1s are coming up way more often than the 6s, I think. So, I ask them for a die and set out to test my hypothesis.
I roll the die 100 times, and damn, each number came up roughly the same number of times. That can’t be right, I KNOW the die is biased!
I roll the die again 100 times, and damn yet again, the numbers come up roughly the same number of times. That…can’t…be right, I know the die is biased!
I roll another 100 times, another 100 times, another 100 times, and then finally… I roll 100 times and 1 comes up 30 times and 6 only comes up twice! Aha! The dice are biased! I knew it!
The die isn’t actually biased; it’s just that I did so many trial runs that odds are one of them would make the die look biased. Now I can publish my results, claiming I only did one trial run and never letting anyone know about the thousand other runs it took to get an outcome where the die looks biased.
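If you want to watch this happen, here’s a quick Python sketch (the 1,000 batches and the “looks biased” threshold of a 15-roll gap are just numbers I picked for illustration): roll a perfectly fair die in batches of 100 and count how many batches happen to look lopsided anyway.

```python
import random

random.seed(0)  # reproducible

BATCHES = 1000        # how many separate 100-roll trial runs we do
ROLLS_PER_BATCH = 100

lopsided = 0
for _ in range(BATCHES):
    rolls = [random.randint(1, 6) for _ in range(ROLLS_PER_BATCH)]
    ones, sixes = rolls.count(1), rolls.count(6)
    # Call a batch "suspicious" if 1s beat 6s by 15 or more --
    # an arbitrary cutoff chosen just for this example.
    if ones - sixes >= 15:
        lopsided += 1

print(f"{lopsided} of {BATCHES} batches made a perfectly fair die look biased")
```

Run it and a handful of batches will usually come out looking “biased”, even though every single roll was fair. Keep only one of those batches, hide the rest, and you have your “result”.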
p-hacking is just this, except instead of doing multiple trial runs, you might look at one dataset in thousands of different ways, trying to correlate thousands of different pairs of variables. Odds are that, just by accident, some of those variables will be correlated strongly enough to give you a publishable result. But the result isn’t real; it’s just statistical noise that looks like a correlation because you tried so many different combinations. If you looked at fresh data, the correlation would disappear.
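Here’s a rough sketch of that fishing expedition (the 50 variables and 30 observations are arbitrary choices for the example): every variable is pure random noise, yet if you check every possible pair, the best-looking pair often shows a strong-seeming correlation.

```python
import numpy as np

rng = np.random.default_rng(0)

# 50 completely independent "variables", 30 observations each -- pure noise.
data = rng.normal(size=(50, 30))

corr = np.corrcoef(data)       # 50x50 matrix of every pairwise correlation
np.fill_diagonal(corr, 0)      # ignore each variable's correlation with itself

# Fish out the most impressive-looking pair.
i, j = np.unravel_index(np.abs(corr).argmax(), corr.shape)
print(f"Best-looking pair: variable {i} vs variable {j}, r = {corr[i, j]:.2f}")
# With over a thousand pairs to fish through, the "winner" often has a
# strong-looking r -- but it's noise and would vanish in fresh data.
```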
To prevent this, you commit, before seeing the data, to which variables you hypothesize are correlated. Then you look at the data and check the correlation between *only* those variables. The correlation could still be accidental, but that’s far less likely, because you’re checking one or a few possible correlations, not potentially thousands or millions.
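And here’s the honest, pre-committed version of the same exercise (the specific pair I “pre-register” is made up for the example): one look, one correlation, and with pure-noise data it almost always comes out near zero.

```python
import numpy as np

rng = np.random.default_rng(1)

# Commit up front: we will test variable 3 against variable 7, and nothing else.
PRE_REGISTERED_PAIR = (3, 7)

data = rng.normal(size=(50, 30))   # same kind of pure-noise dataset as before

i, j = PRE_REGISTERED_PAIR
r = np.corrcoef(data[i], data[j])[0, 1]
print(f"Pre-registered test: r = {r:.2f}")  # one honest look, so usually close to zero
```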