It all depends on what you want to achieve with your analysis.
Each tool has some very interesting properties that you should keep in mind:
(Caveat: “set” here means a data set, where values can repeat, not a mathematical set, where they can’t.)
How many of mean, mode, or median can there be for a set of data?
Mode: Multiple
Mean: Only 1
Median: Only 1
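As a quick check of the counts above, here is a minimal sketch using Python's standard `statistics` module. The data values are made up for illustration.

```python
from statistics import mean, median, multimode

# Illustrative data: 2 and 3 both occur twice, so there are two modes.
data = [1, 2, 2, 3, 3, 7]

modes = multimode(data)   # every most-common value, as a list
print(modes)              # [2, 3]
print(median(data))       # 2.5 (a single value)
print(mean(data))         # 3 (a single value)
```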
Is the calculated mean, mode, or median always one of the values in the data set?
Mode: Always
Mean: Sometimes
Median: Always when the count is odd. When the count is even, only if the two middle values are equal, so sometimes when values can repeat and never when they can’t.
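The three median cases can be checked directly. This is a small sketch with made-up lists, again using Python's `statistics` module.

```python
from statistics import mean, median, multimode

odd = [1, 3, 8]                 # odd count
even_distinct = [1, 3, 8, 10]   # even count, no repeated values
even_repeat = [1, 3, 3, 10]     # even count, the two middle values are equal

print(median(odd) in odd)                      # True: median is 3
print(median(even_distinct) in even_distinct)  # False: median is 5.5
print(median(even_repeat) in even_repeat)      # True: median is 3.0

print(mean(odd) in odd)                        # False: mean is 4
# Every mode is, by definition, a value that occurs in the data.
print(all(m in even_repeat for m in multimode(even_repeat)))  # True
```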
How much information about all the data values in a set does the mean, mode, or median provide?
Mode: Very little. The only information is that all the other values are not the mode(s) and that no other value occurs as often as the mode(s).
Median: ~Half of the values are less than the median and ~Half of the values are greater than the median.
Mean: The differences between all the values and the mean sum to zero. Put another way, the total amount by which the values above the mean exceed it is the same as the total amount by which the values below the mean fall short of it.
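The balancing property of the mean is easy to verify on a small example. The numbers below are made up and chosen so the mean is a whole number. With non-integer data, floating-point rounding can leave a tiny non-zero remainder.

```python
from statistics import mean

data = [2, 4, 4, 10]          # illustrative values, mean is 5
m = mean(data)

deviations = [x - m for x in data]            # [-3, -1, -1, 5]
above = sum(d for d in deviations if d > 0)   # total excess above the mean
below = sum(d for d in deviations if d < 0)   # total shortfall below the mean

print(sum(deviations))   # 0
print(above, below)      # 5 -5
```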
How can the mean, median, or mode be affected if I add a new value to the set?
Mode: Either no change, or creates an additional mode, or culls multiple modes into a single mode.
Median: Likely a small change. The size of the change generally doesn’t depend on how far the added value is from the median. What matters is which side of the median it falls on.
Mean: Likely a small change, but adding a very low or very high value changes the mean more than adding a value close to it. The change also depends heavily on how many values are already in the set: adding a value to a small set can cause a bigger change than adding the same value to a large set.
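To see the difference between the median and the mean here, a rough sketch follows. The `shift` helper and the data are hypothetical, chosen only to illustrate the point.

```python
from statistics import mean, median

small = [10, 11, 12, 13, 14]   # 5 values, mean 12, median 12
large = small * 20             # 100 values, same mean and median

def shift(data, new_value):
    """Return (change in mean, change in median) after adding new_value."""
    grown = data + [new_value]
    return mean(grown) - mean(data), median(grown) - median(data)

# Adding a nearby high value vs. an extreme high value to the small set:
print(shift(small, 15))     # mean and median both move by 0.5
print(shift(small, 1000))   # mean jumps by about 164.7, median still moves by 0.5

# The same extreme value added to the large set moves the mean far less.
print(shift(large, 1000))   # mean moves by about 9.8, median does not move
```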
How many of the previous values do I need to remember to calculate the new mean, median, or mode when I add a new value to the set?
Mode: I need to remember all values
Median: I need to remember all values
Mean: I only need to remember the current mean and how many values were in the set. The values themselves can be forgotten.
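For the mean, the update can be done from just the previous mean and the previous count. Here is a minimal sketch. The function name `update_mean` is made up for illustration.

```python
def update_mean(old_mean, old_count, new_value):
    """Return (new_mean, new_count) without needing the earlier values."""
    new_count = old_count + 1
    # Move the old mean toward the new value by 1/new_count of the gap.
    new_mean = old_mean + (new_value - old_mean) / new_count
    return new_mean, new_count

# The values [2, 4, 6] have mean 4 and count 3. Now add 8.
print(update_mean(4, 3, 8))   # (5.0, 4), the same as the mean of [2, 4, 6, 8]
```

The mode and median have no exact shortcut like this in general, because a new value can change which value is most frequent or which value sits in the middle.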
As for when to use each, keeping in mind some of the properties I talked about already, here are some general guidelines:
Mode: Useful for determining which value is most popular. Which restaurant has the most votes among your group of friends before a night out? Who won the election?
Median: If I want to split a set into two half-size sets by value, the median is the split point. If I would normally use the mean as the average, but the data set has one extremely low or high value, the median will be much closer to all the other values than the mean. This is typically why median income is used as a stat instead of mean income: the presence of CEOs with very large compensation packages makes mean income much higher than most people’s income.
Mean: Useful when trying to find a single value that all of the values in a set are “close” to. What one test score best represents the scores the whole class received? If I want to replace all of my workers with robots that have constant output, what output per robot do I need to match the total output of my human workers?
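The three guidelines above can be tried out on toy data. All of the votes, incomes, and output figures below are invented for illustration.

```python
from statistics import mean, median, mode

# Mode: which restaurant got the most votes?
votes = ["Thai", "Pizza", "Thai", "Sushi", "Thai", "Pizza"]
print(mode(votes))        # Thai

# Median vs. mean: one very large income drags the mean far above the rest.
incomes = [40_000, 45_000, 50_000, 55_000, 60_000, 5_000_000]
print(median(incomes))    # 52500.0
print(mean(incomes))      # 875000

# Mean: robots that each produce the mean output match the workers' total.
worker_output = [8, 10, 12]
robot_output = mean(worker_output)
print(robot_output * len(worker_output) == sum(worker_output))  # True
```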