Why do researchers choose to use the “P-Value” rule in data analysis?

They say .014 (the P-Value) is a “significant number”. Says who? Why? Isn’t any number “significant” if the distribution of data points is mostly around that area?

9 Answers

Anonymous

The p-value, which is what “statistical significance” is judged by, tells you how likely it is that you would have gotten data at least as extreme as yours from random chance alone, assuming there is no real effect. A lower p-value means higher significance: it is less likely that the data was a random fluke.

Take a simplified example. You flip a coin 4 times and get heads all 4 times. Can you conclude the coin is biased toward heads? If you calculate the probability of getting 4 heads in a row with a fair coin, it is 1/16 = 0.0625. This is essentially the p-value. It tells you there’s about a 6% chance you could have gotten this result by pure luck. That is rather high (the usual cutoff is 5%, or 0.05, which is a convention rather than a law of nature), and so the results are not significant enough to conclude the coin must be biased.

On the other hand, if you had gotten 10 heads in a row, then the p-value becomes 1/1024 ≈ 0.000977. This is extremely unlikely to happen by chance alone, and so justifies a strong suspicion that the coin is not fair.
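If you want to check the coin numbers yourself, here is a minimal Python sketch. The function name `p_value_heads` is just something made up for this example; it computes the one-sided probability of getting at least the observed number of heads from a fair coin.

```python
from math import comb

def p_value_heads(n_flips: int, n_heads: int) -> float:
    """One-sided p-value: chance of at least n_heads heads in n_flips of a fair coin."""
    # Count the head/tail sequences that contain n_heads or more heads.
    favorable = sum(comb(n_flips, k) for k in range(n_heads, n_flips + 1))
    # Every one of the 2**n_flips sequences is equally likely for a fair coin.
    return favorable / 2 ** n_flips

print(p_value_heads(4, 4))    # 0.0625
print(p_value_heads(10, 10))  # 0.0009765625
```

Because it sums over "this many heads or more", the same function also handles less clean results, such as 9 heads out of 10.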

In real data analysis, calculating the p-value is a bit more complicated than the coin toss scenario, but it represents the same idea. The question could be “given this number of smokers who developed lung cancer, can we conclude that smoking is linked to lung cancer?”, and the p-value tells you the chance of seeing at least that many lung cancer cases among those smokers if smoking made no difference at all. When the p-value is low enough, that makes for significant evidence linking smoking to lung cancer.
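To make the smoking example concrete, here is a rough sketch in Python. All the numbers (the baseline rate, the number of smokers, the number of cases) are invented purely for illustration and are not real data, and a real study would use a more careful test than this simple binomial calculation. The helper name `binomial_tail` is also just an assumption for this example.

```python
from math import comb

def binomial_tail(n: int, k: int, p: float) -> float:
    """P(X >= k) when X ~ Binomial(n, p): chance of k or more cases out of n."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Made-up numbers for illustration only, not real epidemiological data.
baseline_rate = 0.01   # assumed cancer rate if smoking made no difference
smokers = 1000         # assumed number of smokers in the study
cases = 25             # assumed number of those smokers who developed cancer

# Chance of seeing 25 or more cases among 1000 people from the baseline rate alone.
p_value = binomial_tail(smokers, cases, baseline_rate)
print(f"p-value = {p_value:.2e}")
print("significant at 0.05" if p_value < 0.05 else "not significant at 0.05")
```

With these invented numbers you would expect about 10 cases by chance, so 25 cases gives a p-value far below the conventional 0.05 cutoff.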