
Unbiased Variance Estimator

Let $x_1, \ldots, x_N$ be iid samples from $X$, and let $Y(N)$ denote the $N$-mean estimator given by

$$ Y(N) = \frac{1}{N} \sum_{i=1}^N x_i $$

Let $v(N)$ denote the unbiased $N$-variance estimator

$$ v(N) = \frac{1}{N-1} \sum_{i=1}^N (x_i - Y(N))^2 $$
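In NumPy terms, $v(N)$ is the `ddof=1` sample variance; a quick simulation (using an Exp(1) population, a choice made here purely for illustration) confirms unbiasedness:

```python
import numpy as np

rng = np.random.default_rng(0)
N, trials = 10, 200_000
# Each row is an iid sample of size N from Exp(1), whose true variance is 1.
x = rng.exponential(scale=1.0, size=(trials, N))
v = x.var(axis=1, ddof=1)  # ddof=1 gives the unbiased estimator v(N)
print(v.mean())            # close to the true variance 1.0
```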

Unbiased Variance Estimator with reused samples

I draw $d$ additional samples $x_{N+1}, \ldots, x_{N+d}$, iid from $X$. Similarly, let $Y(N+d)$ denote the $(N+d)$-mean estimator (which reuses the first $N$ samples).

$$ Y(N+d) = \frac{1}{N+d} \sum_{i=1}^{(N+d)} x_i$$

Let $v(N+d)$ denote the unbiased $(N+d)$-variance estimator with reused samples,

$$v(N+d) = \frac{1}{N+d-1} \sum_{i=1}^{(N+d)} (x_i - Y(N+d))^2$$

Covariance of the Variance Estimators with shared samples

My question is, what is the Covariance of these two estimators? $$\mathrm{Cov}[ v(N), v(N+d) ] = ??? $$

From E. Benhamou, “A few properties of sample variance,” equation 10, I know that the variance of the estimator is given by $$\mathrm{Var}[ v(N) ] = \frac{1}{N}\left( m_4 - \frac{N-3}{N-1} m_2^2 \right)$$ where $m_2$ and $m_4$ are the second and fourth central moments of $X$. I tried to use the fact that $v$ is a U-statistic, but I ended up with a quadruple summation that I wasn't sure how to simplify. Is this a standard result? Can I pull a citation from somewhere, or is there something that I can draw from?
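Benhamou's formula is easy to spot-check numerically; here is a Monte Carlo sketch using a Uniform(0,1) population (for which $m_2 = 1/12$ and $m_4 = 1/80$; the distribution is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(1)
N, trials = 8, 400_000
x = rng.uniform(size=(trials, N))
v = x.var(axis=1, ddof=1)

m2, m4 = 1 / 12, 1 / 80  # central moments of Uniform(0,1)
predicted = (m4 - (N - 3) / (N - 1) * m2**2) / N
print(v.var(), predicted)  # the two values agree to Monte Carlo accuracy
```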

Comments:
  • it's not difficult but very calculational.
  • @NN2 Is there a standard reference that I can use for this particular result? I would rather not spend a massive amount of time deriving something which is well known.
  • I don't have one. I calculated something like this myself in my work several years ago. You need to break down $Y(N)$ and rewrite $v(N)$ in terms of $(x_i)_i$ only.
  • I think the approach with U-statistics can work: we can decompose the centered versions of $v(N)$ and $v(N+d)$ as sums of pairwise orthogonal random variables. I will try to find time to write it down in the coming days.
  • @NN2 thank you for the response. I took the time and did the full breakdown.

1 Answer

We start from the pooled sum-of-squares identity: \begin{equation} \begin{aligned} \left( N+d-1 \right)v(N+d) = & \left( N-1 \right)v(N) +\sum_{i=N+1}^{N+d} \left( x_i - \frac{1}{d}\sum_{j=N+1}^{N+d} x_{j} \right)^2 \\ &+\frac{d N}{N+d}\left( Y(N)-\frac{1}{d}\sum_{j=N+1}^{N+d} x_{j} \right)^2 . \end{aligned} \end{equation}
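Since this decomposition is a purely algebraic identity, it can be checked on arbitrary data; a minimal sketch (the sizes and distribution below are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
N, d = 6, 4
x = rng.normal(size=N + d)
old, new = x[:N], x[N:]

lhs = (N + d - 1) * x.var(ddof=1)
rhs = ((N - 1) * old.var(ddof=1)
       + ((new - new.mean()) ** 2).sum()
       + d * N / (N + d) * (old.mean() - new.mean()) ** 2)
print(np.isclose(lhs, rhs))  # True for any data: the decomposition is exact
```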

Taking the covariance with $v(N)$ and using bilinearity (the middle term involves only $x_{N+1},\ldots,x_{N+d}$, so it is independent of $v(N)$ and drops out): \begin{equation} \begin{aligned} \left( N+d-1 \right)\mathrm{Cov}[v(N+d),v(N)] = \left( N-1 \right)\mathrm{Var}[v(N)] +\frac{dN}{N+d}\mathrm{Cov}\left[ \left( Y(N)-\frac{1}{d}\sum_{j=N+1}^{N+d} x_{j} \right)^2, v(N) \right]. \end{aligned} \end{equation}

Expanding the square and using independence again (the new-sample mean $\frac{1}{d}\sum_{j=N+1}^{N+d} x_j$ is independent of $(Y(N), v(N))$ and has expectation $\mu$): \begin{equation} \begin{aligned} \left( N+d-1 \right)\mathrm{Cov}[v(N+d),v(N)] = \left( N-1 \right)\mathrm{Var}[v(N)] \\+\frac{dN}{N+d}\left( \mathrm{Cov}[Y^2(N),v(N)] - 2\mu\,\mathrm{Cov}[Y(N),v(N)] \right). \end{aligned} \end{equation}

We isolate the covariance terms: \begin{equation} \begin{aligned} \mathrm{Cov}[Y^2(N),v(N)] &= \mathbb{E}[Y^2(N)v(N)] - \mathbb{E}[Y^2(N)]\mathbb{E}[v(N)] \\ &= \mathbb{E}[Y^2(N)v(N)] - \left( \frac{\sigma^2}{N}+\mu^2 \right)\sigma^2, \\ \mathrm{Cov}[Y(N),v(N)] &= \mathbb{E}[Y(N)v(N)] - \mathbb{E}[Y(N)]\mathbb{E}[v(N)] \\ &= \mathbb{E}[Y(N)v(N)] - \mu\sigma^2 . \end{aligned} \end{equation}

Substitute into the covariance expression: \begin{equation} \begin{aligned} \left( N+d-1 \right)\mathrm{Cov}[v(N+d),v(N)] &= \left( N-1 \right)\mathrm{Var}[v(N)] \\ &\quad+\frac{dN}{N+d} \left( \mathbb{E}[Y^2(N)v(N)]-2\mu \mathbb{E}[Y(N)v(N)] -\frac{\sigma^4}{N}+\mu^2\sigma^2 \right). \end{aligned} \end{equation}

Observe that \begin{equation} \mathbb{E}[Y^2(N)v(N)]-2\mu \mathbb{E}[Y(N)v(N)] = \mathbb{E}[(Y(N)-\mu)^2 v(N)] - \mu^2\sigma^2. \end{equation}

Thus \begin{equation} \begin{aligned} \left( N+d-1 \right)\mathrm{Cov}[v(N+d),v(N)] = \left( N-1 \right)\mathrm{Var}[v(N)] +\frac{dN}{N+d}\left( \mathbb{E}[(Y(N)-\mu)^2v(N)] - \frac{\sigma^4}{N} \right). \end{aligned} \end{equation}

Using the variance–mean decomposition: \begin{equation} (N-1)v(N) = \sum_{i=1}^N (x_i-Y(N))^2 = \sum_{i=1}^N ((x_i-\mu)-(Y(N)-\mu))^2, \end{equation} so \begin{equation} v(N)=\frac{1}{N-1}\sum_{i=1}^N (x_i-\mu)^2 -\frac{N}{N-1}(Y(N)-\mu)^2. \end{equation}
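This decomposition is likewise an algebraic identity, valid for any constant $\mu$, which a quick numerical check confirms (the values below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(6)
N, mu = 7, 3.0
x = rng.normal(loc=mu, size=N)  # the identity holds for any data and any constant mu

v = x.var(ddof=1)
rhs = ((x - mu) ** 2).sum() / (N - 1) - N / (N - 1) * (x.mean() - mu) ** 2
print(np.isclose(v, rhs))  # True: the decomposition is exact
```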

Therefore, \begin{equation} \mathbb{E}[(Y(N)-\mu)^2 v(N)] = \frac{1}{N-1}\mathbb{E}\left[(Y(N)-\mu)^2\sum_{i=1}^N (x_i-\mu)^2\right] -\frac{N}{N-1}\mathbb{E}[(Y(N)-\mu)^4]. \end{equation}

Compute the first expectation term: \begin{equation} \begin{aligned} \frac{1}{N-1}\mathbb{E}\left[(Y(N)-\mu)^2\sum_{i=1}^N (x_i-\mu)^2\right] &= \frac{1}{N^2(N-1)} \mathbb{E}\left[ \left( \sum_{j=1}^N (x_j-\mu) \right)^2 \sum_{i=1}^N (x_i-\mu)^2 \right]. \end{aligned} \end{equation}

Expanding the quadratic (all index sums run over ordered tuples of distinct indices): \begin{equation} \begin{split} \left( \sum_{j=1}^N (x_j-\mu) \right)^2\sum_{i=1}^N (x_i-\mu)^2 &= \sum_{i=1}^N (x_i-\mu)^4 + \sum_{i\neq j}(x_i-\mu)^2(x_j-\mu)^2 \\ &\quad + 2\sum_{i\neq j}(x_i-\mu)^3(x_j-\mu) + \sum_{\substack{i\neq j \\ j\neq k \\ i\neq k}}(x_i-\mu)^2(x_j-\mu)(x_k-\mu). \end{split} \end{equation}

Since $\mathbb{E}[x_i-\mu]=0$ and the samples are independent, every term containing an isolated factor $(x_j-\mu)$ vanishes in expectation, so \begin{equation} \mathbb{E}\left[\left( \sum_{j=1}^N (x_j-\mu) \right)^2\sum_{i=1}^N (x_i-\mu)^2\right] = N\mu_4 + N(N-1)\sigma^4 . \end{equation}

Next, expand the quartic term; by the same argument only the terms without an isolated factor $(x_i-\mu)$ survive expectation: \begin{equation} (Y(N)-\mu)^4 = \frac{1}{N^4}\left(\sum_{i=1}^N(x_i-\mu)\right)^4 = \frac{1}{N^4}\sum_{i=1}^N(x_i-\mu)^4 +\frac{6}{N^4}\sum_{i<j}(x_i-\mu)^2(x_j-\mu)^2 + (\text{mean-zero terms}). \end{equation}

Hence, \begin{equation} \mathbb{E}[(Y(N)-\mu)^4] = \frac{1}{N^3}\left( \mu_4 +3(N-1)\sigma^4 \right). \end{equation}
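This moment formula can be spot-checked by simulation; a sketch with a Uniform(0,1) population ($\mu = 1/2$, $\sigma^2 = 1/12$, $\mu_4 = 1/80$, an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(3)
N, trials = 5, 1_000_000
ybar = rng.uniform(size=(trials, N)).mean(axis=1)  # Y(N) for each trial

mu, s2, m4 = 0.5, 1 / 12, 1 / 80
predicted = (m4 + 3 * (N - 1) * s2**2) / N**3
print(((ybar - mu) ** 4).mean(), predicted)  # agree to Monte Carlo accuracy
```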

Thus, carrying along the factor $\frac{1}{N^2(N-1)}$ from the first expectation, \begin{equation} \begin{split} \mathbb{E}[(Y(N)-\mu)^2v(N)] &= \frac{1}{N^2(N-1)}\left( N\mu_4 + N(N-1)\sigma^4 \right) -\frac{N}{N-1}\cdot\frac{\mu_4 +3(N-1)\sigma^4}{N^3} \\ &= \frac{\mu_4}{N(N-1)} + \frac{\sigma^4}{N} -\frac{\mu_4}{N^2(N-1)} - \frac{3\sigma^4}{N^2} \\ &= \frac{\mu_4 + (N-3)\sigma^4}{N^2}. \end{split} \end{equation}

(Sanity check: for normal data $\mu_4 = 3\sigma^4$ and this reduces to $\sigma^4/N$, as it must, since $Y(N)$ and $v(N)$ are then independent.)

Hence \begin{equation} \mathbb{E}[(Y(N)-\mu)^2 v(N)] - \frac{\sigma^4}{N} = \frac{\mu_4 - 3\sigma^4}{N^2}, \end{equation} and therefore \begin{equation} \begin{aligned} \left( N+d-1 \right)\mathrm{Cov}[v(N+d),v(N)] &= \left( N-1 \right)\mathrm{Var}[v(N)] +\frac{d}{N(N+d)}\left( \mu_4 - 3\sigma^4 \right). \end{aligned} \end{equation}

Finally, using $\mathrm{Var}[v(N)] = \frac{1}{N}\left( \mu_4 - \frac{N-3}{N-1}\sigma^4 \right)$: \begin{equation} \begin{aligned} \left( N+d-1 \right)\mathrm{Cov}[v(N+d),v(N)] &= \frac{N-1}{N}\left( \mu_4 - \frac{N-3}{N-1}\sigma^4 \right) +\frac{d}{N(N+d)}\left( \mu_4 - 3\sigma^4 \right) \\ &= \frac{N+d-1}{N+d}\,\mu_4 - \frac{N+d-3}{N+d}\,\sigma^4 , \end{aligned} \end{equation} so \begin{equation} \mathrm{Cov}[v(N+d),v(N)] = \frac{1}{N+d}\left( \mu_4 - \frac{N+d-3}{N+d-1}\,\sigma^4 \right) = \mathrm{Var}[v(N+d)]. \end{equation}

That is, the covariance equals the variance of the larger-sample estimator. This is as expected: $v(N+d)$ is the conditional expectation of $v(N)$ given the order statistics of the full sample (both are U-statistics with kernel $h(x,y)=\tfrac{1}{2}(x-y)^2$), so the law of total covariance gives $\mathrm{Cov}[v(N),v(N+d)]=\mathrm{Var}[v(N+d)]$ directly.
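The nested-sample structure can also be verified by simulation: because the first $N$ draws enter both estimators, the empirical covariance of $v(N)$ and $v(N+d)$ should match $\mathrm{Var}[v(N+d)]$. A Monte Carlo sketch with a non-normal population (Uniform(0,1), so that $\mu_4 \neq 3\sigma^4$; the sizes are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(5)
N, d, trials = 6, 4, 400_000
x = rng.uniform(size=(trials, N + d))   # Uniform(0,1): m2 = 1/12, m4 = 1/80
v_small = x[:, :N].var(axis=1, ddof=1)  # v(N), reusing the first N columns
v_big = x.var(axis=1, ddof=1)           # v(N+d) on all N+d columns

cov = np.cov(v_small, v_big)[0, 1]
m2, m4 = 1 / 12, 1 / 80
predicted = (m4 - (N + d - 3) / (N + d - 1) * m2**2) / (N + d)  # Var[v(N+d)]
print(cov, predicted)  # agree to Monte Carlo accuracy
```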
