When should one of mean, mode, and median be used over the others?

17 Answers

Anonymous 0 Comments

Mean is the average of a set of numbers.

Mode is the most common value in a set of numbers.

Median is the value that is right in the middle of the set, when the set is arranged in a sequence.

Which you use greatly depends on exactly what you want to know, given what each tells you.

So, for example, if I wanted to know _about_ what it costs to purchase blueberries in the grocery store, I would use the mean to get the average value.

If I wanted to know what I was most likely to pay, I would use the mode to see what value occurs most often.

If my data set had outliers (like a really pricey store that was 3x more expensive than everywhere else) I might use the median to see the value right in the middle.
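To make the blueberry example concrete, here is a minimal sketch using Python's standard `statistics` module. The prices are made up for illustration (they are not from any real store), with one store deliberately priced at about 3x the rest.

```python
from statistics import mean, median, mode

# Hypothetical blueberry prices (dollars per pint) at seven stores.
# The last store is the pricey outlier, roughly 3x everyone else.
prices = [3.50, 4.00, 4.00, 4.00, 4.50, 5.00, 12.00]

avg_price = mean(prices)      # pulled upward by the 12.00 outlier
middle_price = median(prices) # middle value once the prices are sorted
usual_price = mode(prices)    # the price that shows up most often

print(f"mean:   {avg_price:.2f}")
print(f"median: {middle_price:.2f}")
print(f"mode:   {usual_price:.2f}")
```

With these numbers the mean lands above 5 dollars even though six of the seven stores charge 5 or less, while the median and mode both sit at 4.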

Anonymous 0 Comments

If I’m trying to figure out my average cost for buying a set of products, mean is how I do it. That’s probably the best example for that. When you’re looking at REALLY big sets of numbers and individual outliers are not a problem, mean works great, for things like investments or large purchases.

If it’s something where the individual experience matters, median works great. What are the average winnings of a lottery player? If I ask for mean, the number might be 3 dollars for a 5 dollar ticket. If you look at median it’s 0, since significantly more than half of tickets win nothing. A pertinent one from my job: the mean amount of experience nurses there have in the specialty is something like 4 years. The median is 8 months, because there’s a handful of people with 10 to 20 plus years propping up the average, so the typical patient gets a nurse with less than a year of experience.

Mode works similarly to median in that you’re trying to grasp the typical individual experience, but it’s rare that it’s a better tool. It works great for sets with really high variability to try to establish a pattern or find bottlenecks in a process. Looking at personal annualized incomes, for example, there’s a near infinite number of values for them, with the median in the 10s of thousands and a mean closer to 100s, but the mode is 0. Also a fun one for personally reported heights by men- the median might be 5 foot 9, the mean around there too, but 6 foot might be the mode, since there’s a LOT of rounding up as you approach that height and get close to that number.

Anonymous 0 Comments

In addition to the distinctions that Ansuz07 made, I would offer that mean, median, and mode work with different levels of measurement. With *interval* and *ratio* levels of measurement (actual numbers, like an age or a dollar value), you can use all three. But sometimes that dataset is *ordinal*, meaning you can rank it but it doesn’t have specific numbers. Imagine that you’re describing the educational level of people in your neighborhood:

* 10% don’t have a high school diploma
* 45% have only a high school diploma
* 30% have a bachelor’s degree
* 10% have a master’s degree
* 5% have a doctorate degree

In this case, there’s no way to calculate a mean, but knowing the median is a high school diploma tells you something about the “average” education level of the area.
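For ranked categories like these, one way to find the median is to walk up the ranking and stop at the first category where the running share reaches half. A rough sketch, using the percentages from the list above; the helper name `ordinal_median` is just something made up for this example.

```python
# Education levels in ranked order, with the shares (in percent) from the list above.
levels = [
    ("no high school diploma", 10),
    ("high school diploma", 45),
    ("bachelor's degree", 30),
    ("master's degree", 10),
    ("doctorate", 5),
]

def ordinal_median(ranked_shares):
    """Return the first category whose cumulative share reaches half the total."""
    total = sum(share for _, share in ranked_shares)
    running = 0
    for label, share in ranked_shares:
        running += share
        if running * 2 >= total:
            return label
    return None  # only reached for an empty list

median_level = ordinal_median(levels)
print(median_level)  # high school diploma
```

Note that nothing here ever adds or averages the categories themselves, only their shares, which is why this works for ordinal data where a mean does not.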

Sometimes data is *nominal,* meaning that you can’t even rank it, so the only measure of central tendency that you have is the mode.

Finally, I should note that (arithmetic) mean, median, and mode aren’t the only measures of central tendency. They’re just the most common. There are also weighted means, harmonic means, geometric means, trimeans, and lots of others. It gets even more complicated when the data is multi-dimensional, such as with coordinates. But I guess I’m getting far afield from your question.

Anonymous 0 Comments

Mean is all the values added together divided by the total number of values.

Median is the halfway value, half the sample has higher values, half the sample has lower ones.

Mode is the most common value.

Mean is best used when there aren’t excessive outlier values, and/or those values occur somewhat equally on each side, and the data is close to something called the “normal distribution”. Mean is useful because there are a lot of statistical analyses that can be readily done on a data distribution if you know the mean and some other components of the distribution.

Median is good for when you have major outliers on one side that seriously skew the results in a way that distorts the mean to an extent that its usefulness is degraded. Income is a good example of this, since there is a serious skew of income towards the upper end. A handful of people earning tens/hundreds of millions of dollars a year drags the mean up quite a bit, but that tiny portion of the population’s income is mostly irrelevant for understanding what’s happening to the population as a whole.

Mode is a bit of an odd one and is best used when there’s either discontinuous data or the data inherently doesn’t average/can’t be rationally ordered. For example: “what is the average car color”.
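The car color case can be sketched in a couple of lines. The list of colors below is invented for illustration; the point is only that `statistics.mode` works on labels that cannot be added up or sorted in any meaningful way.

```python
from collections import Counter
from statistics import mode

# Hypothetical sample of car colors seen in a parking lot.
colors = ["white", "black", "white", "red", "grey", "white", "black"]

most_common_color = mode(colors)  # the label that appears most often
counts = Counter(colors)          # how often each color appears

print(most_common_color)  # white
print(counts["white"], counts["black"])
```

Trying to take a mean of `colors` would simply fail, since there is no sensible way to add "white" and "red".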

Anonymous 0 Comments

These are all ways of determining what the “average” of something is.

The mean adds up all the values and divides by how many there are. Useful when you want to know the average of something reasonably continuous where you need to know roughly what to expect based on the data you have. Things like how long it takes to travel a particular route to work or how much rainfall you expect to have each month are usually looked at as the mean. The catch is that it can be influenced by a limited number of outliers (really big or really small values).

Median is where the “average” answer is the one in the middle of the data, and it’s still better suited to continuous data. It’s more useful where you have extreme values at one end of the scale that can make the mean look higher than is actually common. Pay data is a good example where this is used. No one earns less than zero money in their job and a few people earn massive amounts. If you take the mean, it can make it look like the average worker has a lot more money than most people actually do. The median pay is the amount that half the population earns more than and half earn less than, which is usually a lot more useful.

The mode is the most common answer. This is mostly only useful when the answers are discrete – that is, you can be one or the other but not in between. With continuous data, 1.99, 2 and 2.01 are all different answers unless you group them first. On the other hand, calculating whether the average pet is a cat or a dog is impossible with a mean – if 1/3 of pet owners have a cat, 1/3 have a dog and 1/3 have both then the mean is half cat, half dog which doesn’t make any sense. Also good for things like what colour car do people prefer or anything else where you have distinct answers and can’t end up in between.
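The point about grouping continuous data before taking a mode can be shown with a small sketch. The readings are made up, and rounding to the nearest whole number is just one possible way to group them.

```python
from statistics import multimode

# Hypothetical continuous measurements: every raw value is distinct.
readings = [1.99, 2.0, 2.01, 2.98, 3.0, 1.98, 4.6]

# Without grouping, each value occurs exactly once, so all of them tie as "modes".
raw_modes = multimode(readings)

# Group by rounding to the nearest whole number, then look for the mode again.
grouped = [round(x) for x in readings]
grouped_modes = multimode(grouped)

print(len(raw_modes))  # 7, one per reading
print(grouped_modes)   # [2]
```

How coarsely you group is a judgment call: bins that are too wide hide the pattern, and bins that are too narrow put you back where every value is its own mode.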

Anonymous 0 Comments

The other, infinitely snarkier answer, is that it depends on the lie you’re trying to support.

Anonymous 0 Comments

If you have 5 dogs and one of them has 3 legs, the average (mean) number of legs is 19/5=3.8 legs.

But none of the dogs actually has 3.8 legs. Most people would say that an ordinary “average” normal dog has 4 legs. The mode and the median are both 4 legs. Median and mode are better at telling you how many legs an average dog has.

The mode is the number that comes up most, so that’s easy. The median is the middle number if you line the numbers all up. In this case: 3,4,4,4,4. It’s still 4.
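The dog example is small enough to check directly with Python's `statistics` module:

```python
from statistics import mean, median, mode

# Legs per dog: one three-legged dog and four ordinary ones.
legs = [3, 4, 4, 4, 4]

mean_legs = mean(legs)      # 19 / 5
median_legs = median(legs)  # middle of 3, 4, 4, 4, 4
mode_legs = mode(legs)      # most frequent value

print(mean_legs)    # 3.8
print(median_legs)  # 4
print(mode_legs)    # 4
```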

Anonymous 0 Comments

When you have a normal distribution, all three of those metrics are the same value.

If the distribution is left or right-skewed, the median will move away from the mean.

Anonymous 0 Comments

mean is the most intuitive, and you can usually use it if you know the data to be normally distributed. things like adult height, weight, etc.

if the data is highly skewed, you want to use the median. like say if you want to know the average salary in a group of people but one of them is an NBA player, that one guy’s humongous salary will have a huge influence on the average. so you want to look at the value in the middle when lined up in order, that is a better indicator of “average” value.
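here's a quick sketch of the NBA player effect. the salaries are invented round numbers, purely to show how far one huge value drags the mean while the median barely moves.

```python
from statistics import mean, median

# Hypothetical salaries for five ordinary people.
salaries = [40_000, 45_000, 50_000, 55_000, 60_000]

# Same group, plus one (equally hypothetical) NBA player.
with_player = salaries + [10_000_000]

mean_before, median_before = mean(salaries), median(salaries)
mean_after, median_after = mean(with_player), median(with_player)

print(mean_before, median_before)  # both 50,000: symmetric data, mean equals median
print(round(mean_after), median_after)
```

before the player joins, mean and median agree. afterwards the mean jumps to roughly 1.7 million while the median only shifts to 52,500.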

mode is more for when you don’t really care about the average but more want to know the most common values. Typically for things with pretty low values. A good example is number of cars owned by a household. What’s more useful, knowing the average number of cars is 2.47, or knowing the most common number of cars in a family is 2?

Anonymous 0 Comments

The method of averaging needs to be relevant and representative of the data you have.

Let’s say you’re comparing incomes in a group of 10 people: 0, 0, 5k, 20k, 50k, 60k, 80k, 100k, 120k, 1m.

The mode is 0 – which doesn’t make any sense.

The mean is 1,435,000 / 10 = 143.5k – also doesn’t make sense as only one person earns the mean or above.

The median is 55k – which makes the most sense – half of the people earn above and half below this number.

Often, interested groups will misrepresent the data by deliberately selecting a misleading average.

For example, an article trying to demonstrate prosperity would say “the average income is $143.5k!” Which, while true, is misleading.
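As a quick check of the arithmetic, the ten incomes from the example can be run through Python's `statistics` module. The values are exactly the ones listed above, written out in full.

```python
from statistics import mean, median, mode

# The ten incomes from the example: 0, 0, 5k, 20k, 50k, 60k, 80k, 100k, 120k, 1m.
incomes = [0, 0, 5_000, 20_000, 50_000, 60_000, 80_000, 100_000, 120_000, 1_000_000]

total = sum(incomes)             # 1,435,000
mean_income = mean(incomes)      # total / 10 = 143,500
median_income = median(incomes)  # halfway between 50k and 60k = 55,000
mode_income = mode(incomes)      # 0 is the only value that repeats

# How many people actually earn the mean or more?
at_or_above_mean = sum(1 for x in incomes if x >= mean_income)

print(total, mean_income, median_income, mode_income, at_or_above_mean)
```

Only the single 1m earner is at or above the mean, which is why quoting the mean here paints a rosier picture than the median does.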