
I want to know whether we can approximate the covariance matrix of a random vector by making use of a probability limit.

Define the linear regression model in matrix form as $$ \mathbf{Y} = \mathbf{X} \beta + \varepsilon, $$ where the errors have conditional variance $\operatorname{Var}(\varepsilon \mid \mathbf{X}) = \sigma^2 \mathbf{I}$.

I am interested in approximating $E[\text{Cov}[\hat \beta|\mathbf{X}]]$, the expected conditional covariance of the OLS estimator $\hat \beta$, defined by

$$ E[\text{Cov}[\hat \beta|\mathbf{X}]] = E\bigg[\frac{\sigma^2}{n} \bigg(\frac{\mathbf{X}^T\mathbf{X}}{n}\bigg)^{-1}\bigg] = \frac{\sigma^2}{n} E\bigg[\bigg(\frac{\mathbf{X}^T\mathbf{X}}{n}\bigg)^{-1}\bigg]. $$

The probability limit of $\mathbf{X}^T\mathbf{X}/n$ is $$ \text{plim}_{n\to \infty} \bigg(\frac{\mathbf{X}^T\mathbf{X}}{n}\bigg) = Q, $$ where $Q$ is a constant positive definite matrix (see Econometric Analysis by William Greene, eq. 4-19). So the probability limit of the inverse $(\mathbf{X}^T\mathbf{X}/n)^{-1}$ is $$ \text{plim}_{n\to \infty} \bigg(\frac{\mathbf{X}^T\mathbf{X}}{n}\bigg)^{-1} = Q^{-1}. $$

For large $n$, I am interested in approximating $E[\text{Cov}[\hat \beta|\mathbf{X}]]$ by using the probability limit, that is, saying something like $$ E[\text{Cov}[\hat \beta|\mathbf{X}]] \approx \frac{\sigma^2}{n} Q^{-1}, \quad \quad \text{or} \quad \quad E[\text{Cov}[\hat \beta|\mathbf{X}]] \sim \frac{\sigma^2}{n} Q^{-1}. $$ I have various questions regarding the validity of doing this.

What kind of error are we making if we do this? Is there a way to account for the error? Is this a situation where the approximation 'holds with high probability'? If we can indeed make this approximation, how do we state it rigorously (precisely what does $\approx$ or $\sim$ signify)?
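For concreteness, here is a minimal Monte Carlo sketch of the comparison I have in mind, assuming a hypothetical mean-zero Gaussian design with a known second-moment matrix $Q$ (the dimensions and values are made up purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, sigma2 = 200, 3, 1.0          # hypothetical sample size, regressors, error variance
Q = np.array([[1.0, 0.3, 0.0],      # assumed E[x x'] of the design (made up for illustration)
              [0.3, 1.0, 0.2],
              [0.0, 0.2, 1.0]])

# Monte Carlo estimate of E[Cov(beta_hat | X)] = (sigma^2 / n) * E[(X'X / n)^{-1}]
reps = 2000
acc = np.zeros((k, k))
for _ in range(reps):
    X = rng.multivariate_normal(np.zeros(k), Q, size=n)
    acc += np.linalg.inv(X.T @ X / n)
exact_side = (sigma2 / n) * acc / reps

# the plim-based approximation (sigma^2 / n) * Q^{-1}
approx_side = (sigma2 / n) * np.linalg.inv(Q)

print(np.abs(exact_side - approx_side).max())  # small relative to the entries of approx_side
```

Numerically the two sides are close, but I would like to know what the discrepancy is in general and how it scales with $n$.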


1 Answer

In "standard linear regression" with strict exogeneity, $E(\varepsilon \mid \mathbf X) = 0$, the OP wants to approximate (pursuing a theoretical result) the unconditional variance of $\hat \beta$ by using the probability limit of the the moment matrix.

By the Law of Total Variance and the fact that $E(\hat \beta \mid \mathbf X) = \beta$ (a constant, so the variance of the conditional mean contributes nothing), we have that the unconditional variance is

$${\rm V}(\hat \beta) = \sigma^2 \cdot E\Big[(\mathbf X' \mathbf X)^{-1}\Big] = \frac{\sigma^2 }{n}\cdot E\Big[(n^{-1}\mathbf X' \mathbf X)^{-1}\Big]$$

We approximate this by

$${\rm V}(\hat \beta) \approx \frac{\sigma^2}{n} \cdot Q^{-1},$$

where

$$Q = {\rm plim}\left(n^{-1}\mathbf X' \mathbf X\right) = E(\mathbf x \mathbf x')$$

and $\mathbf x$ is the column vector whose transpose is a typical row of $\mathbf X$. We use $\mathbf x$ because in the limit the matrix $\mathbf X$ has infinite row dimension, so it would be inappropriate for $\mathbf X$ itself to appear as the result of a limiting expression.

In words, instead of the expected value of the inverse, we use the inverse of the expected value.
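To see the size of this substitution in a case where everything is available in closed form: with a single standard normal regressor, $n^{-1}\mathbf X'\mathbf X \sim \chi^2_n/n$, so $E[(n^{-1}\mathbf X'\mathbf X)^{-1}] = n/(n-2)$ exactly, while the inverse of the expectation is exactly $1$. A minimal numerical check of this (a sketch under that scalar Gaussian assumption):

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 50, 200_000

# scalar design x_i ~ N(0,1): n^{-1} X'X is distributed as chi2_n / n
m = rng.chisquare(n, size=reps) / n

print(np.mean(1.0 / m))   # expected value of the inverse: approx n/(n-2) = 1.0417
print(n / (n - 2))        # exact value of E[(n^{-1} X'X)^{-1}]
print(1.0 / np.mean(m))   # inverse of the expected value: approx 1
```

The two quantities differ by $2/(n-2)$, which already hints at the order of the approximation error derived next.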

The approximation error is

$$\delta(n) =(\sigma^2 /n) \cdot \Big[E[(n^{-1}\mathbf X'\mathbf X)^{-1}] - [E(\mathbf x\mathbf x')]^{-1}\Big].$$

We have that $(n^{-1}\mathbf X'\mathbf X)^{-1} \longrightarrow_p [E(\mathbf x\mathbf x')]^{-1}$, so, granting enough uniform integrability for convergence in probability to carry over to convergence of expectations,

$$E[(n^{-1}\mathbf X'\mathbf X)^{-1}] - [E(\mathbf x\mathbf x')]^{-1} \longrightarrow [E(\mathbf x\mathbf x')]^{-1} - [E(\mathbf x\mathbf x')]^{-1} = 0, $$

so this expression is $o(1)$. Also, $(\sigma^2/n) = O(1/n)$. Therefore,

$$\delta(n) = O(1/n)\cdot o(1) = o(1/n).$$

So the approximation error goes to zero faster than $1/n$ does.
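Continuing the scalar Gaussian example above, the gap $E[(n^{-1}\mathbf X'\mathbf X)^{-1}] - 1 = 2/(n-2)$ is available exactly, so there $\delta(n) = (\sigma^2/n)\cdot 2/(n-2) = O(1/n^2)$, consistent with (indeed stronger than) the general $o(1/n)$ bound. A quick tabulation, reusing that assumed design:

```python
sigma2 = 1.0
for n in (10, 100, 1_000, 10_000):
    gap = 2.0 / (n - 2)          # exact E[(n^{-1} X'X)^{-1}] - [E(n^{-1} X'X)]^{-1}, scalar case
    delta = (sigma2 / n) * gap   # the approximation error delta(n)
    print(n, delta, n * delta)   # n * delta(n) -> 0: the error vanishes faster than 1/n
```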

UPDATE
Can we improve on the $o_p(1)$ (in probability) / $o(1)$ (in expectation) rate of convergence of

$$E[(n^{-1}\mathbf X'\mathbf X)^{-1}] - [E(\mathbf x\mathbf x')]^{-1}\,?$$

Apparently, the OP needs that. Let's see.

The OP mentioned a remark in Bruce Hansen's Econometrics textbook about the OLS estimator having a faster convergence rate than $o_p(1)$. Hansen derives this after obtaining the scaling rate needed for the asymptotic distribution: since $\hat \beta_n - \beta$ is $O_p(n^{-1/2})$, multiplying it by something growing faster than unity ($n^0$) but slower than $n^{1/2}$ will not stop it from converging to zero in probability.

To fix ideas, we are examining the rate of convergence of

$$E(h_n) - c,\;\;\; c\; {\rm =\;constant}, \;\;\; h_n = O_p(1), \;h_n - c \to_p 0.$$

Now, to apply the Hansen approach, we would need to be able to say something about the distribution (if it exists) of $$n^{\delta} (h_n - c).$$

If we can prove that, for some $\delta > 0$, the above converges in distribution, then we can apply the logic of Hansen and argue that for any $\gamma$ with $0<\gamma < \delta$ we have

$$n^{\gamma}(h_n - c) \to_p 0$$

and so, again granting the uniform integrability needed to move from probability to expectation,

$$(h_n - c) = o_p(n^{-\gamma}) \implies E(h_n -c) = o(n^{-\gamma}).$$
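To illustrate this logic numerically, take again the scalar Gaussian case: $h_n = (n^{-1}\sum_i x_i^2)^{-1}$ with $c = 1$, where the delta method gives $\sqrt n\,(h_n - 1) \Rightarrow N(0,2)$, so $\delta = 1/2$ works and any $\gamma < 1/2$ should drive $n^{\gamma}(h_n - 1)$ to zero in probability. A sketch under these assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
reps = 20_000

for n in (100, 1_000, 10_000, 100_000):
    # h_n = (n^{-1} sum_i x_i^2)^{-1} with x_i ~ N(0,1), i.e. n / chi2_n
    h = n / rng.chisquare(n, size=reps)
    for gamma in (0.4, 0.5):
        spread = np.std(n**gamma * (h - 1.0))
        # shrinks toward 0 for gamma = 0.4 (slowly, at rate n^{-0.1});
        # stabilizes near sqrt(2) for gamma = 0.5, the scaling with a nondegenerate limit
        print(n, gamma, round(spread, 3))
```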

  • This doesn't correspond to what I asked, though. I am asking about the effect on $\text{Cov}[\hat \beta|\mathbf{X}]$ of replacing $(\mathbf{X}^T\mathbf{X}/n)^{-1}$ with its probability limit $Q^{-1}$. My question asks: is such an approximation valid? What level of error is incurred, and can we account for the error? Commented Dec 4, 2020 at 16:56
  • I am only interested in the standard regression case when $E[\varepsilon|\mathbf{X}] = 0$. Your post doesn't mention anything about the probability limit $Q^{-1}$, which is the key point of my question. Commented Dec 4, 2020 at 19:23
  • @sonicboom That's Bruce Hansen's econometrics textbook. Let me have a look. Commented Dec 10, 2020 at 17:21
  • I have updated my answer to show what we need in order to apply the Hansen approach in your case. I suggest you delete all these comments; the essence has by now been incorporated into my post. I am deleting my comments. Commented Dec 10, 2020 at 18:45
  • @sonicboom No, that definitely won't work. But if you write out the $\mathbf X'\mathbf X$ matrix explicitly, it is comprised of sample means. Moreover, if (as I would guess) the first column of $\mathbf X$ is a constant, then you can write $\mathbf X$ and $\mathbf X'\mathbf X$ in blocks and apply block-matrix inversion to obtain an explicit expression for the inverse. You will find that it includes sample means, and so, multiplied by $\sqrt{n}$, it should lead to a distribution. This means that here too you end up with the Hansen result: you have room to improve the rate of convergence from $o_p(1)$ up to $o_p(1/n^{\delta}),\; \delta <1/2$. Commented Dec 10, 2020 at 21:40
