How do statistical tests prove significance?


I did a biology undergraduate degree and often wrote reports where we would statistically analyse our results. A p-value of less than 0.05 shows that the results are statistically significant. How do these tests actually know the data is significant? For example we might look at correlation and get a significant positive correlation between two variables. Given that the variables in question can be literally anything, how does doing a few statistical calculations determine that the relationship is significant? I always thought there must be more nuance, since the actual variables can be so many different things. It might show me a significant relationship for two sociological variables and also for two mathematical ones, even though those variables are so different?

In: Mathematics

17 Answers

Anonymous 0 Comments

As others have noted, “significance” in the sense of p-values just refers to how often you would observe a data pattern at least as strong as yours from random chance alone, given the system you are observing. It doesn’t prove anything in and of itself – all a p-value can do is say that there is a pattern that seems to *disprove* the null hypothesis (i.e. that your data arose from random chance alone).
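
To make that concrete, here is a rough sketch (my own illustration, with made-up data and variable names) of what a p-value is counting: shuffle one variable to simulate “random chance alone” and see how often the shuffled data look at least as correlated as the real data. Note that the test never asks what the variables *mean* – it only sees the numbers, which is exactly why the same machinery applies to sociological and mathematical variables alike.

```python
# Minimal sketch of a permutation test for a correlation.
# The p-value is the fraction of "chance alone" shuffles that produce a
# correlation at least as strong as the one actually observed.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example data: two measured variables with a modest relationship.
x = rng.normal(size=30)
y = 0.5 * x + rng.normal(size=30)

observed_r = np.corrcoef(x, y)[0, 1]

n_shuffles = 10_000
count_as_extreme = 0
for _ in range(n_shuffles):
    y_shuffled = rng.permutation(y)        # break any real x-y link
    r = np.corrcoef(x, y_shuffled)[0, 1]
    if abs(r) >= abs(observed_r):          # "at least as strong" (two-sided)
        count_as_extreme += 1

p_value = count_as_extreme / n_shuffles
print(f"observed r = {observed_r:.3f}, permutation p-value ≈ {p_value:.4f}")
```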

It is important to set this up and use it correctly: if you are testing lots of things at once, you need to account for that properly in the stats test you use, e.g. applying an F-test rather than multiple t-tests, or using a Bonferroni correction where you divide your significance threshold (say 0.05) by the number m of hypotheses you are testing (see the sketch below). Otherwise you are just *cherry-picking*, i.e. throwing stuff at the wall to see what sticks, without really explaining or learning anything about stickiness.
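
In code, a Bonferroni correction looks roughly like this (a sketch with hypothetical p-values, not tied to any particular study):

```python
# Minimal sketch of a Bonferroni correction: with m tests, each individual
# test must clear the stricter threshold alpha / m.
alpha = 0.05
p_values = [0.001, 0.012, 0.030, 0.048, 0.20]   # hypothetical results of m tests
m = len(p_values)

adjusted_alpha = alpha / m
for i, p in enumerate(p_values, start=1):
    verdict = "significant" if p < adjusted_alpha else "not significant"
    print(f"test {i}: p = {p:.3f} -> {verdict} at alpha/m = {adjusted_alpha:.3f}")
```

Equivalently you can multiply each raw p-value by m and compare it against the original 0.05; the verdicts come out the same.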

Separately…

It’s worth noting that the mechanics of specifying a null hypothesis, “significance”, and the meaning of p-values under a “frequentist” paradigm are not intuitive to most humans at all.

The math has historically been trickier to calculate, but with modern computing the “Bayesian stats” paradigm is easy to work with, and it is arguably easier to understand: it is simply about the level of confidence I have that something is true or not, and I can use that paradigm to synthesize evidence from lots of different previous study designs and setups, as long as I have accurate figures and confidence in random sampling from each.
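
As a sketch of what that looks like in practice (my own example with made-up counts, using a simple Beta-Binomial model), the “answer” comes out as a probability that the effect is real, rather than a p-value:

```python
# Minimal Bayesian sketch: update a Beta prior on a success probability with
# observed data, then read confidence directly off the posterior.
from scipy import stats

# Hypothetical experiment: 18 successes out of 25 trials.
successes, trials = 18, 25

# Beta(1, 1) prior = "I know nothing"; results from previous studies could
# supply a more informative prior instead.
prior_a, prior_b = 1, 1
posterior = stats.beta(prior_a + successes, prior_b + trials - successes)

# Probability that the true success rate exceeds 0.5, given the data and prior.
print(f"P(rate > 0.5 | data) ≈ {1 - posterior.cdf(0.5):.3f}")
print(f"95% credible interval: {posterior.ppf([0.025, 0.975]).round(3)}")
```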

In real life (and science) we use prior knowledge and theory all the time.

If I am walking along a dark street at night and I see a jewelry store that has a broken window and merchandise strewn about, I can be confident enough that a robbery has taken place to call the police. I can triangulate from other knowledge without needing to have randomly seen the exact same scene many times before.
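
You can put toy numbers on that reasoning with Bayes’ rule (all of these probabilities are invented purely for illustration):

```python
# Toy Bayes'-rule version of the jewelry-store reasoning, with made-up numbers.
# Prior: a robbery at any given store on any given night is rare.
p_robbery = 0.001

# Likelihoods: how often would I see a broken window and scattered merchandise
# if there was a robbery, versus if there wasn't (accident, display change, ...)?
p_scene_given_robbery = 0.90
p_scene_given_no_robbery = 0.0005

# Bayes' rule: P(robbery | scene)
p_scene = (p_scene_given_robbery * p_robbery
           + p_scene_given_no_robbery * (1 - p_robbery))
p_robbery_given_scene = p_scene_given_robbery * p_robbery / p_scene
print(f"P(robbery | scene) ≈ {p_robbery_given_scene:.2f}")   # ≈ 0.64 here
```

Even a very rare event becomes the most plausible explanation once the evidence in front of me is far more likely under that explanation than under any alternative.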
