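What is not so well known is that, for a finite data set $x_1, \ldots, x_n$, the median is the constant $c$ that minimizes $\sum_{i=1}^{n} |x_i - c|$, while the mean is the constant $c$ that minimizes $\sum_{i=1}^{n} (x_i - c)^2$.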
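Here is a quick sketch of why both facts hold, writing the data as $x_1, \ldots, x_n$ and the candidate constant as $c$ (this notation is my assumption for illustration). For the squared deviations, setting the derivative to zero gives the mean:

$$\frac{d}{dc}\sum_{i=1}^{n} (x_i - c)^2 = -2\sum_{i=1}^{n} (x_i - c) = 0 \quad\Longleftrightarrow\quad c = \frac{1}{n}\sum_{i=1}^{n} x_i .$$

For the absolute deviations, at any $c$ that is not itself a data point,

$$\frac{d}{dc}\sum_{i=1}^{n} |x_i - c| = \#\{i : x_i < c\} - \#\{i : x_i > c\},$$

which is negative while fewer than half the points lie below $c$ and positive once more than half do, so the sum bottoms out at a median. When $n$ is even, every point between the two middle values does equally well.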
This suggests a fairly obvious, if not at all intuitive, generalization of the median and mean of a finite data set.
Let’s define $m_k$ to be the constant $c$ that minimizes $\sum_{i=1}^{n} |x_i - c|^k$.
The median of $x_1, \ldots, x_n$ is just $m_1$ and the mean is $m_2$.
So what about $m_3$ and $m_4$?
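To make the definition concrete, here is a minimal Python sketch of one way to compute such a minimizer numerically. It is not necessarily how the plots in this post were produced. The names `power_deviation_sum` and `m_k` are hypothetical, and the sketch assumes $k \ge 1$, so that the sum of $k$-th powers of absolute deviations is convex in the constant and a plain ternary search between the smallest and largest data values is enough.

```python
def power_deviation_sum(data, c, k):
    """Sum of |x - c|**k over the data set (the quantity being minimized)."""
    return sum(abs(x - c) ** k for x in data)


def m_k(data, k, iterations=200):
    """Approximate a constant c minimizing sum(|x - c|**k) by ternary search.

    Assumes k >= 1, so the objective is convex in c and a minimizer lies
    between min(data) and max(data).
    """
    if k < 1:
        raise ValueError("this sketch assumes k >= 1 (convex objective)")
    lo, hi = min(data), max(data)
    for _ in range(iterations):
        third = (hi - lo) / 3.0
        left, right = lo + third, hi - third
        if power_deviation_sum(data, left, k) < power_deviation_sum(data, right, k):
            hi = right  # a minimizer lies in [lo, right]
        else:
            lo = left   # a minimizer lies in [left, hi]
    return (lo + hi) / 2.0
```

For example, on the data set `[1, 2, 3, 4, 100]`, `m_k(data, 1)` comes out close to the median 3 and `m_k(data, 2)` close to the mean 22, up to the numerical tolerance of the search. For $k = 1$ with an even number of points the minimizer is not unique, and the search simply returns one of the minimizing values.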
As I’m writing this I have little to no intuitive feel for what these “odd-beast” measures of central tendency are actually indicating.
Some things are sort of obvious without much thought. For example, if $x_i = a$ for all $i$ then $m_k = a$ for all $k$, since every term of the sum vanishes at $c = a$.
How does $m_k$ vary with increasing $k$?
To illustrate, let’s take a data set of 100 uniformly random numbers between 0 and 100.
Then a plot of $m_k$ against $k$ looks as follows:
For a set of 100 numbers randomly chosen from a standard normal distribution we get the following typical plot of $m_k$ against $k$:
By randomly choosing a set of 100 numbers between 0 and 100 from a U-shaped distribution we get quite a different phenomenon:
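If you want to try experiments like the three above yourself, the following self-contained Python sketch generates comparable data sets and tabulates the minimizer for a range of exponents. Several things here are my assumptions, not details from the experiments above: the exponent range 1 to 10, the fixed seed, the choice of a scaled Beta(0.5, 0.5) distribution as the "U-shaped" one, and the hypothetical helper names `m_k`, `sample_sets` and `m_k_profile`.

```python
import random


def m_k(data, k, iterations=200):
    """Ternary search for a constant c minimizing sum(|x - c|**k), for k >= 1."""
    lo, hi = min(data), max(data)
    for _ in range(iterations):
        third = (hi - lo) / 3.0
        left, right = lo + third, hi - third
        f_left = sum(abs(x - left) ** k for x in data)
        f_right = sum(abs(x - right) ** k for x in data)
        if f_left < f_right:
            hi = right
        else:
            lo = left
    return (lo + hi) / 2.0


def sample_sets(n=100, seed=0):
    """Three data sets of n points each; the distributions are illustrative."""
    rng = random.Random(seed)
    return {
        "uniform on [0, 100]": [rng.uniform(0, 100) for _ in range(n)],
        "standard normal": [rng.gauss(0, 1) for _ in range(n)],
        # Assumed U-shaped choice: Beta(0.5, 0.5) scaled to [0, 100].
        "U-shaped on [0, 100]": [100 * rng.betavariate(0.5, 0.5) for _ in range(n)],
    }


def m_k_profile(data, ks):
    """List of (k, m_k) pairs for the given exponents."""
    return [(k, m_k(data, k)) for k in ks]


if __name__ == "__main__":
    for name, data in sample_sets().items():
        print(name)
        for k, value in m_k_profile(data, range(1, 11)):
            print(f"  k={k:2d}  m_k={value:.4f}")
```

The printed pairs can be fed to any plotting tool. Different seeds give different samples, so the exact values will not match the plots shown here.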
I have no idea (yet!) how $m_k$ varies theoretically with $k$ for different distributions, nor any idea (yet!) whether these odd beasts deserve the name of measures of central tendency.
But, whatever, I’m fascinated by them.
If you’ve seen anything like this before please let me know.