Say I have a coin and I want to know, “Is this coin fair?”
I toss the coin 100 times and it comes up heads 60 times and tails 40 times. Intuitively this seems kind of close to fair, but also a bit skewed. Was this just random variance? Or is this a large enough sample size that a 60-40 split is alarming?
P values give a way to reason about this scenario by asking “if the coin *is* fair, how unlikely is this result?” It turns out that in this case it’s about a 2.8% chance of getting 60 or more heads (and similarly for 60 or more tails).
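If you want to check that 2.8% figure yourself, here is a minimal sketch that adds up the exact binomial probabilities for 60, 61, ..., 100 heads out of 100 tosses of a fair coin. The function name `binomial_tail` is just something I made up for this example.

```python
from math import comb

def binomial_tail(n: int, k: int, p: float = 0.5) -> float:
    """Probability of getting k or more heads in n independent tosses,
    where each toss lands heads with probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# One-sided p value: chance of 60 or more heads if the coin is fair.
p_value = binomial_tail(100, 60)
print(f"P(60 or more heads | fair coin) = {p_value:.4f}")
```

By symmetry the same number applies to 60 or more tails, so a test that counts a skew in either direction as surprising would double it.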
It’s at this point that people tend to misinterpret p values. The statement people *want* to be able to make is “there is a 2.8% chance that this coin is fair,” but p values do not allow you to make that statement, at least on their own. The p value only says “if the coin is fair then you’d see this result 2.8% of the time.”
Turning a p value into the probability that some hypothesis is correct generally requires knowing some unknowable information. In this toy example that information would be the probability that a coin is fair before you ever toss it, which may be knowable for the right setup. For more real-world applications, though, it could be something like “the probability that another subatomic particle exists with XYZ properties” (where that probability is either 0 or 1, but we don’t know which). This makes p values somewhat frustrating, since they’re so close to making the statement we want, and yet that final inch is out of reach.
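To see why that missing information matters, here is a rough sketch using Bayes' rule. It assumes a made-up world with only two kinds of coins: fair ones, and weighted ones that land heads 60% of the time. Both the 60% figure and the prior probabilities are assumptions I picked purely for illustration, and `posterior_fair` is a hypothetical helper, not a standard function.

```python
from math import comb

def likelihood(n: int, k: int, p: float) -> float:
    """Probability of exactly k heads in n tosses of a coin with P(heads) = p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def posterior_fair(prior_fair: float, weighted_p_heads: float,
                   n: int = 100, k: int = 60) -> float:
    """P(coin is fair | k heads in n tosses), assuming every coin is either
    fair or weighted with the given heads probability (an illustrative
    assumption, not something the data can tell you)."""
    fair = prior_fair * likelihood(n, k, 0.5)
    weighted = (1 - prior_fair) * likelihood(n, k, weighted_p_heads)
    return fair / (fair + weighted)

# Same 60-40 result, three different guesses for how common fair coins are.
for prior in (0.5, 0.9, 0.99):
    post = posterior_fair(prior, weighted_p_heads=0.6)
    print(f"prior P(fair) = {prior:.2f} -> P(fair | 60 heads) = {post:.3f}")
```

Under these assumptions the same 60 heads gives a probability of the coin being fair of about 12%, 55%, or 93%, depending only on the prior. None of those is 2.8%, which is the point: the p value alone can't give you this number.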
What p values are very well equipped for is stopping you from publishing results as significant if it turns out you just got lucky. If you took a threshold of p < 0.05 then you might declare that the coin is unfair, but with a more stringent threshold like p < 0.01 you’d declare the test to be inconclusive. With a threshold of p < 0.05 what you’re saying is that you’re OK with calling about 1 in 20 fair coins weighted, regardless of how any weighted coins get judged. Different disciplines tend to set p value thresholds at different levels, based on how much data they can realistically collect. For example, particle physicists like to aim for p < 1/1,000,000 or lower.
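The "1 in 20 fair coins" idea can be checked with a quick simulation. This sketch tosses a batch of genuinely fair coins 100 times each and counts how many would get flagged as weighted at p < 0.05, using the same "this many heads or more" p value as above. The number of coins and the random seed are arbitrary choices of mine.

```python
import random
from math import comb

N_TOSSES = 100
ALPHA = 0.05        # significance threshold
N_COINS = 10_000    # arbitrary number of simulated fair coins

def binomial_tail(n: int, k: int, p: float = 0.5) -> float:
    """Probability of k or more heads in n tosses with P(heads) = p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# One-sided p value for every possible head count, computed once up front.
p_values = [binomial_tail(N_TOSSES, k) for k in range(N_TOSSES + 1)]

rng = random.Random(0)  # fixed seed so the run is repeatable
false_alarms = 0
for _ in range(N_COINS):
    heads = sum(rng.random() < 0.5 for _ in range(N_TOSSES))
    if p_values[heads] < ALPHA:
        false_alarms += 1  # a fair coin wrongly flagged as weighted

false_alarm_rate = false_alarms / N_COINS
print(f"Fair coins flagged as weighted: {false_alarm_rate:.2%}")
```

The rate should land a little under 5% rather than exactly on it, because head counts are whole numbers: the cutoff ends up at 59 or more heads, which a fair coin hits around 4.4% of the time.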