Let’s say Josh does a study and shows with 90% confidence that the smell of freshly cooked brownies while taking an exam lowers test scores due to people being distracted by the smell. 90% confidence sounds good, but it actually means there is a 1-in-10 chance that a result like Josh’s would show up purely by chance, even if the smell has no real effect at all.
Now let’s say Charlie reads this study and thinks “I wonder if I can reproduce this study with chocolate chip cookies!”. So Charlie tries cookies and finds it doesn’t work. Then Charlie says “that must mean chocolate chip cookies don’t smell good enough to distract people on tests”. So Charlie keeps repeating the study with cakes, candy, ice cream, Double Stuf Oreos, and more. On his 10th attempt, he tries Starbucks cake pops and shows with 90% confidence that the smell of Starbucks cake pops lowers test scores, and he publishes his results.
Now Emily reads two studies on this topic and thinks “wow, this is a really well-established theory” and writes a review on how we now have multiple studies showing that the smell of any fresh desserts can distract test takers.
This is p-hacking. Basically, you discard all the cases that don’t work. The problem is that each attempt has a 1-in-10 chance of a false positive, and Charlie tried 10 times, so he was very likely to find the result he was looking for by pure chance.
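You can actually work out how likely Charlie was to get a “hit” by chance. This is a small sketch of that arithmetic (the 10% false-positive rate and 10 attempts come from the story above):

```python
# Chance of at least one false positive across several independent
# attempts, each with a 10% false-positive rate (Charlie's setup).
p_false_positive = 0.10   # each study has a 1-in-10 chance of a fluke result
attempts = 10             # Charlie ran the study 10 times

# Probability that NO attempt produces a fluke, then flip it around.
p_no_flukes = (1 - p_false_positive) ** attempts
p_at_least_one = 1 - p_no_flukes

print(f"Chance of at least one fluke result: {p_at_least_one:.0%}")  # about 65%
```

So even if desserts have zero effect on test scores, someone running the study 10 times has roughly a 65% chance of “finding” an effect at least once.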
The 90% confidence level is related to what people call the p-value, and the “hacking” part is discarding the relevant data that doesn’t fit your theory.