Why do researchers choose to use the “P-Value” rule in data analysis?


They say .014 (the p-value) is a “significant number”. Says who? Why? Isn’t any number “significant” if the distribution of data points is mostly around that area?


9 Answers

Anonymous

The reason P = 0.05 is so commonly used is that a Belgian astronomer somewhat arbitrarily decided he could tolerate 1 in 20 false positives.

Later, in 1925, a chain-smoking agricultural scientist (Fisher, sometimes erroneously called the father of the p-value) reasoned in a similar manner:

>The value for which P=0.05, or 1 in 20, is 1.96 or nearly 2; it is convenient to take this point as a limit in judging whether a deviation ought to be considered significant or not. Deviations exceeding twice the standard deviation are thus formally regarded as significant. Using this criterion we should be led to follow up a false indication only once in 22 trials, even if the statistics were the only guide available. Small effects will still escape notice if the data are insufficiently numerous to bring them out, but no lowering of the standard of significance would meet this difficulty.
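As a quick sanity check on the arithmetic in that passage, here is a minimal sketch using only the Python standard library (the helper name `two_sided_p` is mine, and it assumes a standard normal test statistic):

```python
import math

def two_sided_p(z):
    """P(|Z| > z) for a standard normal test statistic Z."""
    return math.erfc(abs(z) / math.sqrt(2))

print(two_sided_p(1.96))     # ~0.0500: "the value for which P=0.05 ... is 1.96"
print(two_sided_p(2.0))      # ~0.0455: deviations beyond twice the standard deviation
print(1 / two_sided_p(2.0))  # ~22: a false indication "only once in 22 trials"
```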

This answers why an alpha of 0.05 was chosen, and hence why researchers want a p-value below that cutoff. Indeed, there was some justification for using 0.05, but that particular value is still largely, if not completely, arbitrary. As Rosnow and Rosenthal later put it:

>We are not interested in the logic itself, nor will we argue for replacing the .05 alpha with another level of alpha, but at this point in our discussion we only wish to emphasize that dichotomous significance testing has no ontological basis. That is, we want to underscore that, **surely, God loves the .06 nearly as much as the .05**. Can there be any doubt that God views the strength of evidence for or against the null as a fairly continuous function of the magnitude of p?
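To see how thin that line is, compare two test statistics landing just on either side of the conventional cutoff (a sketch under the same standard normal assumption as above):

```python
import math

def two_sided_p(z):
    """P(|Z| > z) for a standard normal test statistic Z."""
    return math.erfc(abs(z) / math.sqrt(2))

for z in (1.95, 1.97):
    p = two_sided_p(z)
    print(f"z = {z}: p = {p:.4f} -> {'significant' if p < 0.05 else 'not significant'}")
# z = 1.95: p = 0.0512 -> not significant
# z = 1.97: p = 0.0488 -> significant
```

The strength of evidence is nearly identical, yet the dichotomous rule returns opposite verdicts.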

Consequently, it would be preferable to view the evidence against the null as a continuous function rather than as a binary decision at some fixed significance level, but it’s not surprising that we ended up where we did given the contexts in which Fisher, Neyman, and Pearson invented null hypothesis testing. Having a controlled long-run proportion of false positives is quite useful in manufacturing and quality control, but it might not be as suitable as the sole focus for groundbreaking scientific research.
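Here is a minimal Monte Carlo sketch of that long-run guarantee (an assumed setup: a one-sample z-test on standard normal data where the null is true, so every rejection is a false positive):

```python
import math
import random

def two_sided_p(z):
    """P(|Z| > z) for a standard normal test statistic Z."""
    return math.erfc(abs(z) / math.sqrt(2))

random.seed(0)
trials, n, alpha = 10_000, 30, 0.05
false_positives = 0
for _ in range(trials):
    sample = [random.gauss(0, 1) for _ in range(n)]  # true mean is 0
    z = math.sqrt(n) * (sum(sample) / n)             # z-statistic with known sigma = 1
    if two_sided_p(z) < alpha:
        false_positives += 1                         # rejected a true null

print(false_positives / trials)  # ~0.05: the controlled long-run false positive rate
```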

This answers why a certain level was chosen, but what a p-value actually is, and why it’s almost always described and interpreted incorrectly (there are several examples in this thread alone), is a different question.
