eli5 Error bars in experimental biology


I’m having a hard time understanding the difference between statistical significance, inferential and descriptive statistics, standard error and confidence intervals. Can someone dumb it down for me please


2 Answers

Anonymous 0 Comments

Error bars in experimental biology represent the variation or uncertainty in data points, providing a visual representation of the data’s reliability.

Understanding statistical significance, inferential and descriptive statistics, standard error, and confidence intervals helps interpret the data’s validity within a study.

Anonymous 0 Comments

**Descriptive statistics** answer questions such as “what is this data like?” For example, you could calculate the mean of a set of measurements (called the “sample mean”, since it’s the mean of the sample you collected), or the standard deviation (“sample standard deviation”). These tell you about your data, which is just a tiny reflection of everything that’s out there.

Eg, if you collect a lot of frogs from different regions, and [measure how far they can jump](https://www.journals.uchicago.edu/doi/full/10.1086/527494), then calculate (descriptive) statistics, the things you calculate are about the frogs you *collected*. However, what you’re *really* interested in, perhaps, is the frogs still out there, jumping around in the real world.
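If it helps to see the arithmetic, here’s a tiny Python sketch of the descriptive part (the jump distances are made up purely for illustration): it just computes the sample mean and sample standard deviation of the frogs you actually caught.

```python
# Made-up jump distances (cm) for the frogs actually caught in region A
jumps_region_a = [112, 98, 130, 121, 105, 117]

n = len(jumps_region_a)
sample_mean = sum(jumps_region_a) / n
# Sample standard deviation divides by n - 1 (Bessel's correction)
sample_sd = (sum((x - sample_mean) ** 2 for x in jumps_region_a) / (n - 1)) ** 0.5

print(f"sample mean: {sample_mean:.1f} cm, sample SD: {sample_sd:.1f} cm")
```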

**Inferential statistics** answer questions such as “what is the real world like?” You have some *samples* of different frogs, and there *seem* to be differences between them, but do these differences imply (ie, allow you to *infer*) that there are real differences in the wild?

Inferential statistics include things like “p-values”, “test statistics”, and “confidence intervals”, which are mathematical tools that help pin down precisely what your data is saying, and how much uncertainty you still have to accept.

**Statistical significance** is related to inferential statistics. You might decide, in advance, how willing you are to be wrong when you say “there’s a real difference between the frogs”. Then, you do some math to calculate the relevant inferential statistics, and you can then boldly conclude either “these samples are *statistically significantly different* from each other” or “there’s *no statistically significant* difference between the jumping ability of these frogs”. Whether it’s statistically significant or not basically translates to “if there *really* was *no* difference, would I have to be crazy unlucky to get data showing a difference this big?” The answer is either yes or no, because you’ve decided in advance exactly what counts as “crazy unlucky” – that’s the “significance level”, typically 5%, but sometimes 1% (or even less in specific fields of study).
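To make that yes/no call concrete, here’s a minimal sketch using a two-sample t-test in Python (the data and the choice of scipy’s `ttest_ind` are illustrative assumptions, not something from the answer above):

```python
from scipy import stats

# Made-up jump distances (cm) for frogs sampled from two regions
region_a = [112, 98, 130, 121, 105, 117]
region_b = [101, 95, 108, 99, 104, 97]

# Two-sample t-test: "if there really was no difference, how unlucky would I
# have to be to see samples that differ this much?"
t_stat, p_value = stats.ttest_ind(region_a, region_b)

alpha = 0.05  # the "crazy unlucky" threshold, chosen in advance
if p_value < alpha:
    print(f"p = {p_value:.3f}: statistically significant at the 5% level")
else:
    print(f"p = {p_value:.3f}: no statistically significant difference at the 5% level")
```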

**Confidence intervals** are another inferential statistic. Again, you pick how unlucky you need to be to make a mistake, eg, 5%. Then you do some calculations, and instead of a simple Yes/No answer (“The frogs have/do not have statistically significantly different jumping lengths”), you get more quantitative:

>”With 95% confidence, we can say that frogs from region A jump between 5cm and 23cm longer than those from region B. This is a statistically significant difference at the 5% level”.

Assuming you’ve done your data collection and maths correctly, there’s only a 5% chance that the true difference for the whole population lies *outside* that range – that’s why it’s a 95% confidence interval.
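Sticking with the same made-up numbers, here’s a sketch of how such a 95% confidence interval for the difference in mean jump length could be computed (assuming the standard equal-variance two-sample t procedure). This is also where the “standard error” from the question shows up: it’s the uncertainty attached to the difference in sample means.

```python
from scipy import stats

# Same made-up samples as above (cm)
region_a = [112, 98, 130, 121, 105, 117]
region_b = [101, 95, 108, 99, 104, 97]

n1, n2 = len(region_a), len(region_b)
mean_a, mean_b = sum(region_a) / n1, sum(region_b) / n2
mean_diff = mean_a - mean_b

# Pooled sample variance, then the standard error of the difference in means
var_a = sum((x - mean_a) ** 2 for x in region_a) / (n1 - 1)
var_b = sum((x - mean_b) ** 2 for x in region_b) / (n2 - 1)
pooled_var = ((n1 - 1) * var_a + (n2 - 1) * var_b) / (n1 + n2 - 2)
se_diff = (pooled_var * (1 / n1 + 1 / n2)) ** 0.5

# 95% CI: difference in means plus/minus the t critical value times the standard error
t_crit = stats.t.ppf(0.975, df=n1 + n2 - 2)  # 0.975 because 2.5% sits in each tail
print(f"95% CI for the difference: {mean_diff - t_crit * se_diff:.1f} cm "
      f"to {mean_diff + t_crit * se_diff:.1f} cm")
```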

If you wanted to be less likely to be wrong, you might choose a significance level of 1%, that is, a 99% confidence interval, and then your report might read:

>”With 99% confidence, we can say that frogs from region A jump anywhere from 2cm shorter to 30cm longer than those from region B. This difference is not statistically significant at the 1% level.”