What is a standard deviation?


So I understand the concept of a distribution. I just don’t understand the concept of a standard deviation from the mean of that distribution. How can we tell what is 1 standard deviation away as opposed to 2 standard deviations?

I’d be very thankful for an explanation, thank you :).


5 Answers

Anonymous

Standard deviation is a way to talk about how much more or less of something you’re likely to get the next time, and how often you’ll get a lot more or a lot less.

Let’s say you get a big bag of M&M’s at the store and count how many M&M’s are inside (before you eat any). Because they are small and there are a lot of them, it makes sense that some bags might have a few more than others, but usually it’s about the same number.

If you got all of the bags at the store, you could count all of the M&M’s and put the same number into each bag to “fix” them. This number is called the _mean_ and is a kind of _average_, though often when people say “average,” the mean is the one they… mean. And if a few M&M’s are left over, the mean carries a fraction on top of the whole number that went into each bag, to account for the extras.
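Here’s that “fixing” idea as a minimal Python sketch; the bag counts are made up just for illustration:

```python
# Made-up counts of M&M's in six bags, invented for illustration
bags = [54, 50, 45, 52, 48, 51]

# The mean spreads the total evenly across the bags, fractions and all
mean = sum(bags) / len(bags)
print(mean)  # 50.0
```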

Now that we know what the mean is, we can talk about how much any of the bags _varied_ from the mean. Maybe one bag was 10 M&M’s short of the mean, and another was 5 over. If we add up all of the differences, we get a total difference of 0, because it all averages out; that’s what the mean does. So we’d like a way to talk about the differences in which being over and being under both add to the amount of difference, and (importantly) in which a bigger difference adds more. After all, you’ll care more about buying one bag that’s 20 short than two bags that are 10 short each, because if one bag was 20 short, maybe the next will be 20 short, too!

So what statisticians do is _square_ the differences by multiplying each difference by itself. Both +20 and –20 become +400 (since negative numbers have positive squares), while two 10’s become +100 + 100 = +200, so two 10’s make for less _variance_ than one 20.
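Continuing the sketch with the same made-up counts, the squaring step looks like this:

```python
bags = [54, 50, 45, 52, 48, 51]  # same made-up counts as above
mean = sum(bags) / len(bags)     # 50.0

# Squared differences: being 2 over and 2 under both add +4, and one
# big miss counts for more than two small ones
squared_diffs = [(x - mean) ** 2 for x in bags]

# Averaging the squares gives the variance; we divide by n because we
# pretended to have every bag in the store
variance = sum(squared_diffs) / len(bags)
print(variance)  # about 8.33 M&M's-squared
```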

This variance tells us how much variation there is in the number of M&M’s in the bags. But because we multiplied, for example, 20 M&M’s by 20 M&M’s, the variance comes out in M&M’s-squared, and we don’t eat squared M&M’s. So we take the square root of the variance to return to the unit we care about, M&M’s. That’s the _standard deviation_: standard because we’ve fixed the unit.
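And the square-root step, again with the made-up counts:

```python
import math

bags = [54, 50, 45, 52, 48, 51]  # same made-up counts as above
mean = sum(bags) / len(bags)
variance = sum((x - mean) ** 2 for x in bags) / len(bags)

# The square root takes us from M&M's-squared back to plain M&M's
std_dev = math.sqrt(variance)
print(std_dev)  # about 2.89 M&M's
```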

Now for each bag we can standardize the difference we measured by comparing the _error_, how many M&M’s the bag is over or under the mean, against the standard deviation. The unit (M&M’s) cancels out and we get a simple number without units, so we can compare deviations across different samples, like maybe a different size of M&M’s bag, or Reese’s Pieces.
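Statisticians call that unitless number a _z-score_. Here’s the comparison as a sketch, using the same made-up numbers:

```python
bags = [54, 50, 45, 52, 48, 51]  # same made-up counts as above
mean = sum(bags) / len(bags)                                       # 50.0
std_dev = (sum((x - mean) ** 2 for x in bags) / len(bags)) ** 0.5  # ~2.89

# A bag that is 5 M&M's short of the mean
bag = 45
z = (bag - mean) / std_dev
print(z)  # about -1.73: this bag is 1.73 standard deviations below the mean
```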

That’s useful in itself, but the standard deviation has mathematical properties that are even more useful. For example, Chebyshev’s Theorem says that the number of M&M’s in a bag should be within two standard deviations of the mean at least 75% of the time, no matter the distribution. We don’t know how many M&M’s will be in the next bag we buy, but we have an idea of the range to _expect_, unless something strange happened at the factory. And if our data seems to have certain properties, we might be able to describe it as a “normally distributed variable”; if that assumption is okay, we can make more accurate predictions about how often a bag’s count will fall in a certain range. For example, if the M&M’s bags’ candy count is approximately normal, we expect about 68% of bags to be within one standard deviation of the mean, and about 95% to be within two.
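Both guarantees are easy to check in a quick sketch; Chebyshev’s bound holds for any distribution, while the 68%/95% figures assume the counts are roughly normal:

```python
# Chebyshev: for ANY distribution, at least 1 - 1/k^2 of values fall
# within k standard deviations of the mean
for k in (2, 3):
    print(k, 1 - 1 / k**2)  # 2 -> 0.75 (75%), 3 -> about 0.889 (88.9%)

# If the counts are roughly normal, the bounds tighten a lot:
# about 68% within 1 standard deviation, about 95% within 2
mean, std_dev = 50.0, 2.89  # the made-up values from the sketches above
print(mean - 2 * std_dev, mean + 2 * std_dev)  # ~95% of bags land in this range
```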

One thing we need to be careful about is that we don’t know the true standard deviation of the M&M bag machine. Often we _can’t_ know, so statistics is about _estimating_ (guessing at) these unknowns. When we calculate the standard deviation from a sample, we usually make it a little bit bigger because we don’t know everything, and the amount added grows when we have fewer samples, since we’ve studied less. Because of that, you’ll see σ used for the usually unknown true standard deviation, but _s_ for the one calculated from a sample: _s_ is the correct value for the sample, but only an estimate of σ. And with more math, we can figure out how near σ is likely to be to _s_. How far you go depends on how much information you need.
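That “little bit bigger” adjustment usually means dividing by n − 1 instead of n when averaging the squared differences (Bessel’s correction). Python’s statistics module has both versions; a sketch with the same made-up counts:

```python
import statistics

bags = [54, 50, 45, 52, 48, 51]  # same made-up counts, now treated as a sample

# pstdev divides by n: for when you truly have the whole population
pop_sd = statistics.pstdev(bags)   # about 2.89

# stdev divides by n - 1: the "made a little bigger" version, used when
# the bags are only a sample of what the factory produces
s = statistics.stdev(bags)         # about 3.16

print(pop_sd, s)  # s > pop_sd, and the gap shrinks as the sample grows
```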
