# The reason we don’t use absolute values for calculating variance

I know the reason for using squared values is so the values don’t get cancelled out. Why not absolute values? I get it when calculating error metrics that squaring can punish the model more for bigger errors, but I don’t get the reason here.

It's partly a matter of convention, but the practical reason is that the absolute value function is piecewise defined and not differentiable at 0.

Squaring, on the other hand, is continuous, differentiable everywhere, and simple to work with.
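To make the differentiability point concrete, here is a minimal sketch that compares the slope just to the left and just to the right of 0 for both functions. The helper name `one_sided_slopes` and the step size `h` are made up for illustration.

```python
def one_sided_slopes(f, x, h=1e-6):
    """Return the (left, right) finite-difference slopes of f at x."""
    left = (f(x) - f(x - h)) / h
    right = (f(x + h) - f(x)) / h
    return left, right

# For x^2 the two slopes agree (both are essentially 0), so a derivative exists at 0.
square_left, square_right = one_sided_slopes(lambda t: t * t, 0.0)

# For |x| the slopes disagree (-1 on the left, +1 on the right), so no derivative at 0.
abs_left, abs_right = one_sided_slopes(abs, 0.0)

print(square_left, square_right)
print(abs_left, abs_right)
```

The mismatch for `abs` is the "kink" at 0 that makes calculus-based derivations awkward.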

There is “Mean absolute deviation” (MAD) which is exactly what you are describing.

I can think of two reasons why variance has become the standard:

For statistical purposes, it is often desirable that variance puts more weight on extreme observations than MAD does.

From a mathematical point of view, the square function is preferred because it is differentiable everywhere (abs() isn't differentiable at 0). This makes variance easier to handle in more complicated mathematical applications.
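As a rough illustration of the first reason, the sketch below computes the (population) variance and the mean absolute deviation for a small made-up data set, then replaces one value with an outlier. The data and function names are assumptions chosen for the example.

```python
def mean(xs):
    return sum(xs) / len(xs)

def variance(xs):
    """Population variance: average squared distance from the mean."""
    m = mean(xs)
    return mean([(x - m) ** 2 for x in xs])

def mean_abs_dev(xs):
    """Mean absolute deviation: average absolute distance from the mean."""
    m = mean(xs)
    return mean([abs(x - m) for x in xs])

base = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 15]  # same data, but the last point is extreme

# How much each measure grows once the outlier is introduced.
var_ratio = variance(with_outlier) / variance(base)
mad_ratio = mean_abs_dev(with_outlier) / mean_abs_dev(base)

print(variance(base), variance(with_outlier), var_ratio)
print(mean_abs_dev(base), mean_abs_dev(with_outlier), mad_ratio)
```

On this data the variance grows by a factor of 13 while MAD grows by a bit more than 3, which is the "higher weight on extreme observations" effect.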

There are different measures of volatility. You can consider the mean absolute error or other measures instead of the variance. Variance has a **definition** (involving squares, as you know). I am not sure if there is an underlying reason (which is what you are asking about).

We can make a pattern:

Mean is the average value of x

Variance is the average value of x^2

Skew is the average value of x^3*

Kurtosis is the average value of x^4*

All of these values can describe statistically useful information

*The formulas are more complicated than this, basically accounting for the earlier moments. For example, variance is actually (the average of x^2) minus (the average of x)^2.
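The pattern and its footnote can be checked numerically. This is a small sketch on made-up data; `raw_moment` and `central_moment` are illustrative helper names, and skewness and kurtosis are shown in their common standardized form (central moment divided by a power of the variance), which is one convention among several.

```python
def raw_moment(xs, k):
    """Average value of x^k."""
    return sum(x ** k for x in xs) / len(xs)

def central_moment(xs, k):
    """Average value of (x - mean)^k, i.e. the moment after removing the mean."""
    m = raw_moment(xs, 1)
    return sum((x - m) ** k for x in xs) / len(xs)

data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = raw_moment(data, 1)
variance = central_moment(data, 2)

# The footnote's shortcut: average of x^2 minus (average of x)^2.
shortcut = raw_moment(data, 2) - mean ** 2

# Standardized third and fourth moments.
skewness = central_moment(data, 3) / variance ** 1.5
kurtosis = central_moment(data, 4) / variance ** 2

print(mean, variance, shortcut)  # 5.0 4.0 4.0
print(skewness, kurtosis)
```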

One reason for using variance, instead of some differently-defined measure of how spread out a data set or probability distribution is, is that if you are adding together random variables that vary in uncorrelated ways (which is a common thing in a lot of real-world applications) then the variance of this sum is just the sum of the individual variances. It makes some calculations simpler.
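The additivity property can be verified exactly on a toy case: two independent fair dice. The sketch below enumerates all 36 equally likely outcomes and uses exact fractions, so no simulation noise is involved. The dice setup is an assumption picked for illustration.

```python
from fractions import Fraction
from itertools import product

def mean(values):
    return sum(values, Fraction(0)) / len(values)

def variance(values):
    m = mean(values)
    return mean([(v - m) ** 2 for v in values])

def mean_abs_dev(values):
    m = mean(values)
    return mean([abs(v - m) for v in values])

# One fair die, and the sum of two independent fair dice (all 36 equally likely pairs).
die = [Fraction(face) for face in range(1, 7)]
two_dice = [a + b for a, b in product(die, die)]

print(variance(die), variance(two_dice))          # 35/12 and 35/6: variances add
print(mean_abs_dev(die), mean_abs_dev(two_dice))  # 3/2 and 35/18: MAD does not add
```

The variance of the sum is exactly twice the variance of one die, while the mean absolute deviation of the sum (35/18) is not twice that of one die (which would be 3).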

My stats professor told us absolute values can be used but the math is harder. I see that’s elaborated on already.

One place where absolute error is used is electrical engineering, in the design of digital filters. The standard method uses Chebyshev approximation of the second kind followed by the Remez exchange algorithm. This finds the best-fitting polynomial in terms of absolute error (it minimizes the largest absolute error rather than a sum of squares).