-1
$\begingroup$

Problem:

A certain set of values is known to be normally distributed with $\sigma^2 = 1$. However, its mean is not known. The following three sample values are taken: $0, 1, 10$. We want the best possible estimate of the mean.

Answer:

First recall the general form for the density function for the normal distribution is: $$ N(x) = \dfrac{e^{-\dfrac{ (x-\mu)^2 }{2 \sigma^2}} }{ \sigma \sqrt{2 \pi } } $$

However, in this case, since $\sigma^2 = 1$ we have: $$ N(x) = \dfrac{e^{-\dfrac{ (x-\mu)^2 }{2 }} }{ \sqrt{2 \pi } } $$

The best estimate for the mean is a value $\mu$ such that the following expression is maximized: $$ N(0)N(1)N(10) $$

We define $f(\mu)$ to be: $$ f(\mu) = \left( e^{-\dfrac{ (0-\mu)^2 }{2 }} \right) \left( e^{-\dfrac{ (1-\mu)^2 }{2 }} \right) \left( e^{-\dfrac{ (10-\mu)^2 }{2 }} \right) $$

Now we simplify and then differentiate with respect to $\mu$.

\begin{align*}
f(\mu) &= \left( e^{-\dfrac{ (\mu)^2 }{2 }} \right) \left( e^{-\dfrac{ (\mu-1)^2 }{2 }} \right) \left( e^{-\dfrac{ (\mu-10)^2 }{2 }} \right) \\
f( \mu ) &= e^{ \dfrac{ -\mu^2 - \mu^2 + 2\mu - 1 -\mu^2 + 20\mu - 100 }{2} } \\
f( \mu ) &= e^{\dfrac{ -3\mu^2 + 22\mu - 101 }{2}}
\end{align*}

Now to maximize $f( \mu )$ we maximize $g( \mu ) = -3\mu^2 + 22\mu - 101$.

\begin{align*}
g'( \mu ) &= -6\mu + 22 \\
-6\mu + 22 &= 0 \\
-3\mu + 11 &= 0
\end{align*}

Hence our best estimate for the population mean is: $$ \mu = \dfrac{ 11 }{3} $$

For comparison purposes, I want to compute $\bar{Y}$. I would expect my answer to be less than $\bar{Y}$.

\begin{align*}
\bar{Y} &= \dfrac{ 0 + 1 + 10 }{3} \\
\bar{Y} &= \dfrac{ 11}{3}
\end{align*}

Therefore, I am thinking my answer might be wrong.

$\endgroup$

1 Answer

2
$\begingroup$

The relevant question is, why do you expect your estimate $\hat \mu$ to be less than the sample mean $\bar Y$? What reasoning do you provide for this belief, especially in the context of your calculations, which are correct?

When $\sigma$ is known, the MLE of a normal distribution is its sample mean. In fact, even when $\sigma$ is unknown and must be jointly estimated with $\mu$, the joint MLE $(\hat \mu, \hat \sigma)$ also obeys $\hat \mu = \bar Y$. The proof is left as an exercise for the reader, although it is also easily found in various online resources if desired.
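As a rough numerical sanity check of the known-$\sigma$ case, the sketch below maximizes the normal log-likelihood for the sample $(0, 1, 10)$ with $\sigma = 1$ and compares the maximizer with the sample mean. The helper names `log_likelihood` and `argmax_ternary` are made up for this illustration, and the ternary search is just one convenient way to maximize a concave function; it is not part of the original question or answer.

```python
import math

def log_likelihood(mu, sample, sigma=1.0):
    """Log-likelihood of i.i.d. normal observations with known sigma."""
    n = len(sample)
    const = -n * math.log(sigma * math.sqrt(2 * math.pi))
    return const - sum((x - mu) ** 2 for x in sample) / (2 * sigma ** 2)

def argmax_ternary(f, lo, hi, tol=1e-9):
    """Maximize a concave function f on [lo, hi] by ternary search."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            lo = m1  # the maximizer lies to the right of m1
        else:
            hi = m2  # the maximizer lies to the left of m2
    return (lo + hi) / 2

sample = [0.0, 1.0, 10.0]
mu_hat = argmax_ternary(lambda mu: log_likelihood(mu, sample),
                        min(sample), max(sample))
sample_mean = sum(sample) / len(sample)

# The two values should agree up to floating-point search error.
print(mu_hat, sample_mean)
```

Working with the log-likelihood rather than the product $N(0)N(1)N(10)$ avoids multiplying very small numbers; both have the same maximizer because the logarithm is increasing.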

What is not true is $$\hat \sigma^2 = s^2,$$ where $$s^2 = \frac{1}{n-1} \sum_{i=1}^n (Y_i - \bar Y)^2$$ is the unbiased or Bessel-corrected sample variance. The MLE of the variance parameter uses a factor of $\frac{1}{n}$ rather than $\frac{1}{n-1}$. Perhaps this is what you were thinking of instead: the MLE of the variance is smaller than the sample variance. But this does not apply to the normal distribution's mean.
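To make the $\frac{1}{n}$ versus $\frac{1}{n-1}$ distinction concrete, here is a small sketch using Python's standard `statistics` module on the same three observations. It treats $\sigma$ as unknown purely for illustration (in the question itself $\sigma^2 = 1$ is given), and the variable names are my own.

```python
import statistics

sample = [0.0, 1.0, 10.0]
n = len(sample)

# MLE of the variance: divides the sum of squared deviations by n.
var_mle = statistics.pvariance(sample)

# Bessel-corrected sample variance s^2: divides by n - 1.
var_unbiased = statistics.variance(sample)

# The two differ only by the factor (n - 1) / n, so the MLE is the smaller one.
print(var_mle, var_unbiased, (n - 1) / n * var_unbiased)
```

Both estimators use the same center, $\bar Y = \frac{11}{3}$; only the divisor changes.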

$\endgroup$
6
  • $\begingroup$ I was told that the MLE is a better estimate of the mean than $\bar{Y}$ by a friend who has a degree in statistics. In addition, given that $10$ is so far away from the mean it seems to me that its value should pull the value way up. Maybe, I should have said that I expect $\mu > \bar{Y}$. However, that is not right either. $\endgroup$ Commented Nov 10 at 5:11
  • 1
    $\begingroup$ @Bob Is it possible that you misunderstood your friend's comment? Certainly you could ask for clarification, both in the general case for sampling distributions with finite mean, and in the specific case of this problem. Regarding your statement "given that $10$ is so far away from the mean...," I must point out that you don't know what the true mean is. That the variance is $1$ does suggest that it is unlikely to see such a value for the estimate $\hat \mu = \frac{11}{3}$, but that is what you have observed. $\endgroup$ Commented Nov 10 at 5:25
  • 1
    $\begingroup$ (cont.) Put another way, although the joint likelihood of $(0,1,10)$ given $\mu = \hat \mu$ is indeed very small, it is nevertheless the largest among all choices of $\mu$, thus it is the maximum likelihood estimate. Your own calculation proves this. $\endgroup$ Commented Nov 10 at 5:27
  • 1
    $\begingroup$ For a normal distribution, the MLE of $\mu$ is $\bar X$, but not for other distributions. E.g., for pdfs of the form $k_1\exp(-(x-\mu)^4/(k_2\sigma^4))$, with mean $\mu$ and standard deviation $\sigma$, I get $\hat\mu=1.49122$ for $(X_1,X_2,X_3)=(0,1,3)$ and $\sigma=1$. I suspect that whether the MLE is more or less than $\bar X$ depends on the kurtosis, but I might be completely wrong about that. $\endgroup$ Commented Nov 10 at 10:03
  • 2
    $\begingroup$ @MichaelHartley For exponential distributions, the MLE of the mean is the mean of the observations, i.e. $\bar X$, even though the excess kurtosis is positive. $\endgroup$ Commented Nov 10 at 19:01
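The quartic-exponent example from the comments can be checked with a short sketch. It assumes $k_2 > 0$, so that maximizing the likelihood of a density $k_1\exp(-(x-\mu)^4/(k_2\sigma^4))$ is the same as minimizing $\sum_i (x_i - \mu)^4$, whose stationarity condition is $\sum_i (\mu - x_i)^3 = 0$. The function names `score` and `bisect_root` are hypothetical helpers written for this illustration.

```python
def score(mu, sample):
    """Derivative of sum((x - mu)**4) with respect to mu, up to a factor of 4."""
    return sum((mu - x) ** 3 for x in sample)

def bisect_root(f, lo, hi, tol=1e-12):
    """Find a root of an increasing function f with f(lo) < 0 < f(hi)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid  # root lies to the right of mid
        else:
            hi = mid  # root lies at or to the left of mid
    return (lo + hi) / 2

sample = [0.0, 1.0, 3.0]
mu_hat = bisect_root(lambda mu: score(mu, sample), min(sample), max(sample))
sample_mean = sum(sample) / len(sample)

# For this density the MLE of mu does not coincide with the sample mean.
print(mu_hat, sample_mean)
```

The constants $k_1$, $k_2$ and $\sigma$ do not affect the location of the maximizer, which is why they do not appear in the code.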
