How do statistical tests prove significance?


I did a biology undergraduate degree and often wrote reports where we would statistically analyse our results. A p-value of less than 0.05 shows that the results are statistically significant. How do these tests actually know the data is significant? For example, we might look at correlation and get a significant positive correlation between two variables. Given that variables can be literally anything in question, how does doing a few statistical calculations determine it is significant? I always thought there must be more nuance as the actual variables can be so many different things. How can it show me a significant relationship for two sociological variables and also for two mathematical ones, when those variables are so different?

In: Mathematics

17 Answers

Anonymous 0 Comments

>Given that variables can be literally anything in question, how does doing a few statistical calculations determine it is significant?

> I always thought there must be more nuance as the actual variables can be so many different things.

I think this is the point that is missing from many answers.

You always have to MODEL the variables somehow. That is, make some mathematical assumptions about how these variables work. This allows you to analyze the relationship between the variables. If your model is wrong, the work is useless.
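To make the "wrong model" point concrete, here is a rough Python sketch (my own illustration, with made-up simulated data). It runs the textbook significance test for a correlation, which assumes each observation is independent of the others, on two kinds of data: pure noise, where that assumption holds, and random walks, where it does not. In both cases the two series are generated independently, so any "significant" result is a false alarm. The names `looks_significant`, `false_positive_rate`, and the constant `T_CRIT` are just ones I picked for this example.

```python
import math
import random

# Approximate two-sided 5% critical value of Student's t with 48 degrees
# of freedom (n = 50 points). Taken from standard tables; an assumption
# tied to n = 50 below.
T_CRIT = 2.0106


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def looks_significant(xs, ys):
    """Textbook t-test for a correlation; assumes independent observations."""
    r = pearson_r(xs, ys)
    t = r * math.sqrt((len(xs) - 2) / (1 - r * r))
    return abs(t) > T_CRIT


def noise(rng, n):
    """Independent draws: the model behind the test is correct here."""
    return [rng.gauss(0, 1) for _ in range(n)]


def random_walk(rng, n):
    """Each value builds on the previous one: the model is wrong here."""
    total, out = 0.0, []
    for _ in range(n):
        total += rng.gauss(0, 1)
        out.append(total)
    return out


def false_positive_rate(make_series, trials=1000, n=50, seed=0):
    """Fraction of unrelated series pairs that the test calls significant."""
    rng = random.Random(seed)
    hits = sum(
        looks_significant(make_series(rng, n), make_series(rng, n))
        for _ in range(trials)
    )
    return hits / trials


rate_noise = false_positive_rate(noise)
rate_walk = false_positive_rate(random_walk)
print(f"false positives, independent noise: {rate_noise:.1%}")
print(f"false positives, random walks:      {rate_walk:.1%}")
```

With independent noise the false alarm rate should come out close to the promised 5%. With random walks it should come out far higher, even though the formula and the 0.05 cutoff are exactly the same. The calculation did not change; the assumption behind it stopped being true.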

If you’re an undergraduate, they probably just skip the modeling step because it would require them to teach you the math and statistics behind it. Instead, they just give you some formulas based on a model they already had in mind. It’s not just undergraduates; even actual researchers can be amazingly bad at statistics. It’s hard to tell how many researchers are bad at statistics and how many are committing outright fraud, but it’s a problem in science.

So it’s important to make a model generic enough that it’s unlikely to be wrong, but specific enough that you can analyze it. On one end, you have Fisher-type tests, which are mathematically simple and whose maths has been understood since the time of Gauss, but which are very simplistic and need you to make a lot of assumptions. On the other end, you have all these newfangled deep learning networks, where nobody really knows how they work, but which are a lot more generic.

Once you have a model, you can mathematically analyze it to see what kind of data you would expect to get. The p-value is then the probability, under that model with no real effect, of getting data at least as extreme as yours. If you haven’t done this, it’s not possible to quantitatively say whether something is significant. In fact, it is quite possible to get amazingly useless results because the analysis is done poorly. This is a huge problem in research, especially in fields like nutritional science, psychology, and sociology.
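Here is a minimal Python sketch of that idea, using a permutation test. This is a technique I am swapping in for illustration, not the formula you were likely handed in class. The model is "the two variables are unrelated, so any pairing of x values with y values was equally likely." Shuffling one variable shows what correlations that model produces, and the p-value is the fraction of shuffles that look at least as strong as the real data. The data and the name `permutation_p_value` are made up for the example.

```python
import math
import random


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def permutation_p_value(xs, ys, n_shuffles=2000, seed=0):
    """Estimate how often an 'unrelated' model gives |r| >= the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # break any real link between x and y
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # The +1 counts the observed pairing itself, so the estimate is never 0.
    return (hits + 1) / (n_shuffles + 1)


# Made-up data: y_related depends on x, y_unrelated does not.
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(40)]
y_related = [2 * xi + rng.gauss(0, 1) for xi in x]
y_unrelated = [rng.gauss(0, 1) for _ in range(40)]

p_related = permutation_p_value(x, y_related)
p_unrelated = permutation_p_value(x, y_unrelated)
print(f"p-value when y depends on x: {p_related:.4f}")
print(f"p-value when y is unrelated: {p_unrelated:.4f}")
```

Notice the code never needs to know whether x and y are sociological or mathematical quantities. It only asks how surprising the numbers would be if the "unrelated" model were true, which is why the same test can be applied to such different variables, and also why it is only as trustworthy as that model.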
