Average absolute deviation
The average absolute deviation, or mean absolute deviation, of a data set is the average of the absolute deviations from a central point. It is a summary statistic of statistical dispersion or variability. In the general form, the central point can be a mean, median, mode, or the result of any other measure of central tendency or any random data point related to the given data set. The absolute values of the differences between the data points and their central tendency are totaled and divided by the number of data points.
Measures of dispersion
Several measures of statistical dispersion are defined in terms of the absolute deviation. The term "average absolute deviation" does not uniquely identify a measure of statistical dispersion, as there are several measures that can be used to measure absolute deviations, and there are several measures of central tendency that can be used as well. Thus, to uniquely identify the absolute deviation it is necessary to specify both the measure of deviation and the measure of central tendency. The statistical literature has not yet adopted a standard notation: both the mean absolute deviation around the mean and the median absolute deviation around the median have been denoted by their initials "MAD", which may lead to confusion, since in general their values may differ considerably.
Mean absolute deviation around a central point
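The mean absolute deviation of a set $\{x_1, x_2, \ldots, x_n\}$ is
$$\frac{1}{n}\sum_{i=1}^{n} \lvert x_i - m(X) \rvert.$$
The choice of measure of central tendency, $m(X)$, has a marked effect on the value of the mean deviation. For example, for the data set {2, 2, 3, 4, 14}:
Measure of central tendency $m(X)$ | Mean absolute deviation |
Mean = 5 | $(\lvert 2-5 \rvert + \lvert 2-5 \rvert + \lvert 3-5 \rvert + \lvert 4-5 \rvert + \lvert 14-5 \rvert)/5 = 3.6$ |
Median = 3 | $(\lvert 2-3 \rvert + \lvert 2-3 \rvert + \lvert 3-3 \rvert + \lvert 4-3 \rvert + \lvert 14-3 \rvert)/5 = 2.8$ |
Mode = 2 | $(\lvert 2-2 \rvert + \lvert 2-2 \rvert + \lvert 3-2 \rvert + \lvert 4-2 \rvert + \lvert 14-2 \rvert)/5 = 3.0$ |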
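The effect of the central point can be checked numerically. The following is a minimal sketch, assuming the example data set is {2, 2, 3, 4, 14} (consistent with the stated mean of 5 and the outlier 14 mentioned later); the helper name `mean_abs_dev` is introduced here for illustration only.

```python
from statistics import mean, median, mode

def mean_abs_dev(data, center):
    """Average of |x - center| over all data points."""
    return sum(abs(x - center) for x in data) / len(data)

data = [2, 2, 3, 4, 14]  # assumed example data set

# Same formula, three different central points.
by_center = {
    "mean": mean_abs_dev(data, mean(data)),      # center 5 -> 3.6
    "median": mean_abs_dev(data, median(data)),  # center 3 -> 2.8
    "mode": mean_abs_dev(data, mode(data)),      # center 2 -> 3.0
}
```

The median gives the smallest of the three values, which matches the minimizing property of the median discussed under Minimization.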
Mean absolute deviation around the mean
The mean absolute deviation, also referred to as the "mean deviation" or sometimes "average absolute deviation", is the mean of the data's absolute deviations around the data's mean: the average distance from the mean. "Average absolute deviation" can refer either to this usage or to the general form with respect to a specified central point. MAD has been proposed as a replacement for the standard deviation on the grounds that it corresponds better to real life. Because the MAD is a simpler measure of variability than the standard deviation, it can be useful in school teaching.
As a measure of forecast accuracy, MAD is very closely related to the mean squared error (MSE), which is simply the average squared error of the forecasts. Although the two measures are closely related, MAD is more commonly used because it is both easier to compute and easier to understand.
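The relationship between the two accuracy measures can be illustrated with a short sketch. It assumes the forecasting usage in which MAD is the mean of the absolute forecast errors; the observation and forecast values below are hypothetical and chosen only for illustration.

```python
# Hypothetical observations and forecasts (illustrative values only).
actual = [102, 98, 105, 110, 95]
forecast = [100, 100, 100, 108, 99]

# Forecast errors: 2, -2, 5, 2, -4
errors = [a - f for a, f in zip(actual, forecast)]

# MAD averages the absolute errors; MSE averages the squared errors.
mad = sum(abs(e) for e in errors) / len(errors)
mse = sum(e * e for e in errors) / len(errors)
```

MAD stays in the units of the data, while MSE is in squared units and gives more weight to the single largest error.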
Mean absolute deviation around the median
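Mean absolute deviation around the median offers a direct measure of the scale of a random variable around its median:
$$D_{\text{med}} = E\lvert X - \operatorname{median}(X) \rvert.$$
Computed from a sample, this is the maximum likelihood estimator of the scale parameter $b$ of the Laplace distribution. For the normal distribution we have $D_{\text{med}} = \sigma\sqrt{2/\pi}$. Since the median minimizes the average absolute distance, we have $D_{\text{med}} \le D_{\text{mean}}$, where $D_{\text{mean}} = E\lvert X - E[X] \rvert$ is the mean absolute deviation around the mean. By using the general dispersion function, Habib defined MAD about median as
$$D_{\text{med}} = E\lvert X - \operatorname{median}(X) \rvert = 2\operatorname{Cov}(X, I_O),$$
where the indicator function is
$$I_O = \begin{cases} 1 & \text{if } X > \operatorname{median}(X), \\ 0 & \text{otherwise.} \end{cases}$$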
This representation allows for obtaining MAD median correlation coefficients.
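These relations can be checked on simulated data. The sketch below assumes the covariance representation reads "mean absolute deviation around the median equals twice the covariance between X and the indicator that X exceeds its median", and uses a seeded standard normal sample of even size; the variable names are illustrative only.

```python
import random
from math import pi, sqrt
from statistics import fmean, median

random.seed(0)
xs = [random.gauss(0.0, 1.0) for _ in range(10_000)]  # sigma = 1, even n

med = median(xs)
mu = fmean(xs)

# Mean absolute deviation around the median and around the mean.
d_med = fmean([abs(x - med) for x in xs])
d_mean = fmean([abs(x - mu) for x in xs])

# Indicator of "x lies above the sample median".
indicator = [1.0 if x > med else 0.0 for x in xs]

# Covariance with divisor n (population-style), to match the expectation form.
cov = fmean([x * i for x, i in zip(xs, indicator)]) - mu * fmean(indicator)

# For a normal distribution the theoretical value is sigma * sqrt(2 / pi).
normal_value = sqrt(2 / pi)
```

With an even number of distinct values, exactly half of the sample lies above the sample median, so `d_med` and `2 * cov` agree up to floating-point rounding, while `d_med` is only approximately equal to `normal_value`.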
Median absolute deviation around a central point
Median absolute deviation around the mean
In principle the mean could be taken as the central point for the median absolute deviation, but more often the median value is taken instead.
Median absolute deviation around the median
The median absolute deviation is the median of the absolute deviations from the median. It is a robust estimator of dispersion. For the example {2, 2, 3, 4, 14}: 3 is the median, so the absolute deviations from the median are {1, 1, 0, 1, 11} (reordered as {0, 1, 1, 1, 11}) with a median of 1, in this case unaffected by the value of the outlier 14, so the median absolute deviation is 1.
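The robustness to the outlier can be seen in a short sketch. It assumes the example data set {2, 2, 3, 4, 14}; the function name `median_abs_dev` is introduced here for illustration.

```python
from statistics import median

def median_abs_dev(data):
    """Median of the absolute deviations from the data's median."""
    center = median(data)
    return median([abs(x - center) for x in data])

data = [2, 2, 3, 4, 14]          # assumed example data set
extreme = [2, 2, 3, 4, 1400]     # same data with a far larger outlier

mad_data = median_abs_dev(data)        # 1
mad_extreme = median_abs_dev(extreme)  # still 1
```

Making the outlier a hundred times larger leaves the result unchanged, whereas any mean-based deviation would grow with it.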
Maximum absolute deviation
The maximum absolute deviation around an arbitrary point is the maximum of the absolute deviations of a sample from that point. While not strictly a measure of central tendency, the maximum absolute deviation can be found using the formula for the average absolute deviation as above with $m(X) = \max(X)$, where $\max(X)$ is the sample maximum.
Minimization
The measures of statistical dispersion derived from absolute deviation characterize various measures of central tendency as minimizing dispersion. The median is the measure of central tendency most associated with the absolute deviation. Some location parameters can be compared as follows:
- L2 norm statistics: the mean minimizes the mean squared error
- L1 norm statistics: the median minimizes the average absolute deviation
- L∞ norm statistics: the mid-range minimizes the maximum absolute deviation
- trimmed L∞ norm statistics: for example, the midhinge, which minimizes the median absolute deviation of the whole distribution, also minimizes the maximum absolute deviation of the distribution after the top and bottom 25% have been trimmed off
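The first three entries of this comparison can be illustrated by a brute-force search over candidate central points. This is a sketch only, assuming the data set {2, 2, 3, 4, 14} and a grid with step 0.01; the function names are illustrative.

```python
from statistics import mean, median

data = [2, 2, 3, 4, 14]  # assumed example data set

def mean_sq_err(c):
    """L2 criterion: mean squared deviation from c."""
    return sum((x - c) ** 2 for x in data) / len(data)

def mean_abs_dev(c):
    """L1 criterion: mean absolute deviation from c."""
    return sum(abs(x - c) for x in data) / len(data)

def max_abs_dev(c):
    """L-infinity criterion: maximum absolute deviation from c."""
    return max(abs(x - c) for x in data)

# Candidate central points 0.00, 0.01, ..., 15.00.
grid = [i / 100 for i in range(1501)]

best_l2 = min(grid, key=mean_sq_err)      # coincides with the mean, 5.0
best_l1 = min(grid, key=mean_abs_dev)     # coincides with the median, 3.0
best_linf = min(grid, key=max_abs_dev)    # coincides with the mid-range, 8.0

mid_range = (min(data) + max(data)) / 2
```

Each criterion picks out a different location parameter for the same data, which is why the choice of central point and the choice of deviation measure go together.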
Estimation
In order for the absolute deviation to be an unbiased estimator, the expected value of the sample absolute deviation must equal the population absolute deviation. However, it does not. For the population 1, 2, 3, both the population absolute deviation about the median and the population absolute deviation about the mean are 2/3. Averaged over all samples of size 3 that can be drawn from the population (with replacement), the sample absolute deviation about the mean is 44/81, while the sample absolute deviation about the median is 4/9. Therefore, the absolute deviation is a biased estimator.
However, this argument is based on the notion of mean-unbiasedness. Each measure of location has its own form of unbiasedness. The relevant form of unbiasedness here is median unbiasedness.
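The figures in the population 1, 2, 3 example can be reproduced by exhaustive enumeration. The sketch below assumes that "all samples of size 3" means the 27 equally likely ordered samples drawn with replacement, and uses exact rational arithmetic; the helper name `abs_dev` is illustrative.

```python
from fractions import Fraction
from itertools import product
from statistics import median

population = [1, 2, 3]

def abs_dev(values, center):
    """Exact mean absolute deviation of values around center."""
    return sum(abs(Fraction(x) - center) for x in values) / len(values)

# Population value: the mean and the median are both 2.
population_dev = abs_dev(population, 2)

# All 27 ordered samples of size 3, drawn with replacement (assumption).
samples = list(product(population, repeat=3))

# Average the sample statistic over every possible sample.
about_mean = sum(
    abs_dev(s, Fraction(sum(s), len(s))) for s in samples
) / len(samples)
about_median = sum(abs_dev(s, median(s)) for s in samples) / len(samples)
```

Under these assumptions `population_dev` is 2/3, `about_mean` is 44/81, and `about_median` is 4/9, so both sample statistics underestimate the population value on average.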