
Consider the population $R^2$:

\begin{equation} \rho^2 = 1- \frac{\sigma^{2}_u}{\sigma^{2}_y} \end{equation}

This is the proportion of the variation in $y$ in the population that is explained by the independent variables.

Suppose I want to find unbiased estimators for $\sigma^{2}_u$ and $\sigma^{2}_y$. The usual $R^2$ does not deliver this, because it estimates $\sigma^{2}_u$ and $\sigma^{2}_y$ with $RSS/n$ and $TSS/n$, respectively, both of which are biased. Here $RSS$ denotes the sum of squared residuals and $TSS$ the total sum of squares.

To obtain the adjusted R-squared $\bar{R}^2$, I use unbiased estimators for $\sigma^{2}_u$ and $\sigma^{2}_y$, which are $\frac{RSS}{n-k-1}$ and $\frac{TSS}{n-1}$, respectively. Thus,

\begin{equation} \bar{R}^2 = 1- \frac{RSS/(n-k-1)}{TSS/(n-1)} \end{equation}
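
For concreteness, here is a minimal R sketch (with arbitrary simulated data, so the coefficients and sample size are purely illustrative) showing that these formulas reproduce the $R^2$ and $\bar{R}^2$ reported by summary() of an lm() fit:

set.seed(1)
n <- 50; k <- 2                       # illustrative sample size and number of regressors
x1 <- rnorm(n); x2 <- rnorm(n)
y  <- 1 + 0.5*x1 - 0.3*x2 + rnorm(n)

fit <- lm(y ~ x1 + x2)
RSS <- sum(resid(fit)^2)              # sum of squared residuals
TSS <- sum((y - mean(y))^2)           # total sum of squares

1 - (RSS/n)/(TSS/n)                   # ordinary R-squared
summary(fit)$r.squared                # same value
1 - (RSS/(n - k - 1))/(TSS/(n - 1))   # adjusted R-squared
summary(fit)$adj.r.squared            # same value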

Why is $\bar{R}^2$ not unbiased? Why is the ratio of two unbiased estimators not unbiased? Can you provide some intuition?

  • One small point: it's the amount of variance explained by the model. If you fit OLS regression, that would be by the linear relationship; but it's possible that some other model with the same variables would do better (e.g. try fitting OLS to something cyclical). Commented Jan 7, 2024 at 11:31
  • Apply Jensen's inequality to the function $(x,y)\to y/x$ defined on the first quadrant. Commented Jan 7, 2024 at 20:17

1 Answer


Why is the ratio of two unbiased estimators not unbiased?

Consider two uncorrelated random variables, $X$ and $Y$, each with finite expectation.

We immediately have $E(XY)=E(X)\,E(Y)$.

But in general $E(1/Y) \neq 1/E(Y)$. So even if $X$ and $Y$ were independent, in which case $E(X/Y)=E(X)\,E(1/Y)$, you would not in general have $E(X/Y)=E(X)\cdot 1/E(Y)$.

In short, ratios of independent unbiased estimators with positive denominator are not, in general, unbiased for the ratio of the quantities they estimate.

Indeed, if we're dealing with a strictly positive $Y$ then, special cases aside, $Y$ and its inverse will be negatively correlated, which implies that $E(Y)\,E(1/Y)>1$. We have

\begin{equation} \text{Cov}\!\left(Y, \frac{1}{Y}\right) = E\!\left(Y\cdot\frac{1}{Y}\right) - E(Y)\,E\!\left(\frac{1}{Y}\right) = 1 - E(Y)\,E\!\left(\frac{1}{Y}\right), \end{equation}

so a negative covariance implies $E(Y)\,E\left(\frac{1}{Y}\right)>1$, i.e. $1/E(Y) < E\left(\frac{1}{Y}\right)$. Hence, in that case (and with $E(X)>0$), independence gives $E(X)/E(Y) < E(X/Y)$.

It may help your intuition a little to simulate random numbers from a variety of distributions with support on the positive half-line, finite mean and non-zero variance, and see that the mean of the reciprocal is larger than the reciprocal of the mean.

Here's a quick example with a couple of distributions in R, but you could do something similar in most stats programs, or even in a spreadsheet:

> x = runif(100000, .1, 1.1)   # uniform on (0.1, 1.1): strictly positive support
> mean(1/x)                    # mean of the reciprocal
[1] 2.396524
> 1/mean(x)                    # reciprocal of the mean -- smaller
[1] 1.666545

> x = rgamma(100000, 3, 1)     # gamma(shape = 3, rate = 1): also strictly positive
> mean(1/x)
[1] 0.4984561
> 1/mean(x)
[1] 0.3331215

Try it with your favourite dozen weird distributions. (See if you can break it.)
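
The same kind of simulation can be aimed directly at $\bar{R}^2$. Here is a rough sketch in R (the slope, error variance and sample size are arbitrary choices of mine): it repeatedly draws small samples from a model whose population $\rho^2$ is known and averages the adjusted R-squared, so you can see how far the Monte Carlo average sits from $\rho^2$.

set.seed(42)
reps <- 50000
n    <- 10
rho2 <- 1 - 4/5                      # population R^2: sigma_u^2 = 4, sigma_y^2 = 1 + 4 = 5
adj  <- replicate(reps, {
  x <- rnorm(n)                      # single regressor
  y <- 1 + x + rnorm(n, sd = 2)      # slope 1, error variance 4
  summary(lm(y ~ x))$adj.r.squared
})
mean(adj)                            # compare with rho2 = 0.2
rho2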

While so far this is simply motivation for the direction of the bias (I haven't proved that $Y$ and its inverse are negatively correlated, nor stated the conditions under which that holds), a more formal argument can be constructed; for example, the result follows from Jensen's inequality.
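
To make that Jensen step explicit (a sketch, for a non-degenerate strictly positive $Y$): since $g(y)=1/y$ is strictly convex on $(0,\infty)$,

\begin{equation} E\!\left(\frac{1}{Y}\right) > \frac{1}{E(Y)}, \end{equation}

and multiplying both sides by $E(X)>0$, with $X$ independent of $Y$, gives $E(X/Y)=E(X)\,E(1/Y) > E(X)/E(Y)$ without needing the covariance argument.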

