$\begingroup$

I would like to obtain the expectation and variance of the squared Pearson sample correlation ($\operatorname{E}(R_{lk}^2)$ and $V(R_{lk}^2)$) between two random variables $l$ and $k$ following a bivariate normal distribution.

I have been trying to trace back the results of a paper, stating that:

\begin{align}\operatorname{E}(\tilde{R}_{lk}^2) \approx R_{lk}^2 + \frac{(1 - \tilde{R}_{lk}^2)}{N} \tag{Eq. 1a}\end{align}

and thus that:

\begin{align}\operatorname{E}(R_{lk}^2) \approx \tilde{R}_{lk}^2 - \frac{(1 - \tilde{R}_{lk}^2)}{N} \tag{Eq. 1b}\end{align}

where $R_{lk}^2$ is the true squared correlation, $\tilde{R}_{lk}^2$ is the observed squared sample correlation, and $N$ is the sample size. They claim that this result can be obtained with the delta-method. My understanding of the delta-method is that it gives an approximate expression for the variance of a function of a random variable, such that:

\begin{align}Var(f(X)) \approx {({f'(\mu)})^2}Var(X) \tag{Eq. 2a}\end{align}
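To make Eq. 2a concrete, here is a minimal simulation sketch. It is not taken from the paper: the function names, the choice of a normal $X$, and the values of `mu` and `sigma` are illustrative assumptions. It compares the simulated variance of $f(X)=X^2$ with the delta-method value ${(f'(\mu))}^2 Var(X)$.

```python
import random


def variance(values):
    """Unbiased sample variance, written out to keep the sketch dependency-free."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def delta_method_check(mu=0.5, sigma=0.05, n_draws=200_000, seed=1):
    """Return (simulated Var(X**2), delta-method approximation) for X ~ N(mu, sigma**2)."""
    rng = random.Random(seed)
    squares = [rng.gauss(mu, sigma) ** 2 for _ in range(n_draws)]
    simulated = variance(squares)
    # Delta method with f(x) = x**2, so f'(mu) = 2 * mu (assumed illustrative choice of f)
    approx = (2 * mu) ** 2 * sigma ** 2
    return simulated, approx


simulated, approx = delta_method_check()
print(f"simulated Var(X^2) = {simulated:.6f}")
print(f"delta-method value = {approx:.6f}")
```

The approximation should be close only when $Var(X)$ is small relative to the curvature of $f$ around $\mu$, which is why a small `sigma` is used here.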

Any idea how to get these results?

Derive Expected Squared Correlation Using Delta Method

Using the delta-method, I came up with the solution below, but it differs from the first two equations I showed. In applying the delta-method, take $X=\tilde{R}$, $f(X)=\tilde{R}^2$, $f(\mu)=R_{lk}^2$ and $f'(\mu)=2R_{lk}$, where $R_{lk}$ is the true Pearson correlation.

\begin{align}Var(\tilde{R}_{lk}^2) = {({f'(R_{lk})})^2}Var(\tilde{R}_{lk}) \tag{Eq. 2b}\end{align}

Since $Var(\tilde{R}_{lk})$ can be interpreted as the squared standard error of the sample correlation (Derivation of the standard error for Pearson's correlation coefficient):

\begin{align}Var(\tilde{R}_{lk}) = \frac{(1 - \tilde{R}_{lk}^2)}{N - 2} \tag{Eq. 3}\end{align}

we do get:

\begin{align}Var(\tilde{R}_{lk}^2) = 4R_{lk}^2\frac{(1 - \tilde{R}_{lk}^2)}{N - 2} \tag{Eq. 4}\end{align}

Noting that $Var(\tilde{R}_{lk}^2) = \operatorname{E}(\tilde{R}_{lk}^2)-\operatorname{E}(\tilde{R}_{lk})^2$, and that $\operatorname{E}(\tilde{R}_{lk})=0$, we obtain:

\begin{align}\operatorname{E}(\tilde{R}_{lk}^2) = 4R_{lk}^2\frac{(1 - \tilde{R}_{lk}^2)}{N - 2} \tag{Eq. 5a}\end{align}

Derive Expected Squared Correlation Using Taylor Expansion

If I use the Taylor expansion instead, I get the same result as the paper. The general formula for the Taylor expansion applied to this case is (as in Expected value of inverse?):

\begin{align}\operatorname{E}(\tilde{R}_{lk}^2) = R_{lk}^2 + 2 \cdot \frac{1}{2} Var(\tilde{R}_{lk}) \tag{Eq. 6}\end{align}

and again, using the squared standard error $Var(\tilde{R}_{lk})$ as expressed above, we get:

\begin{align}\operatorname{E}(\tilde{R}_{lk}^2) = R_{lk}^2 + 2 \cdot \frac{1}{2} \frac{(1 - \tilde{R}_{lk}^2)}{N - 2} \tag{Eq. 6a}\end{align}

which, after rearranging, gives:

\begin{align}\operatorname{E}(R_{lk}^2) = \tilde{R}_{lk}^2-\frac{(1 - \tilde{R}_{lk}^2)}{N - 2} \tag{Eq. 6b}\end{align}

Does it seem correct to you? Note that I do get the same result if I use the law of total variance (not shown here).
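One way to probe the expectation numerically is a small Monte Carlo sketch like the one below. It is only an illustration under stated assumptions: the helper names, the values of `rho`, `n` and `reps`, and the construction $Y = \rho X + \sqrt{1-\rho^2}\,Z$ for drawing bivariate normal pairs are my choices, not the paper's. It prints the simulated mean of $\tilde{R}_{lk}^2$ next to two candidate values: Eq. 6a with the true $R_{lk}^2$ substituted for $\tilde{R}_{lk}^2$ on the right-hand side (an assumption, since Eq. 6a is written with the sample value), and the same expansion using the variance from Eq. 8.

```python
import math
import random


def sample_corr(xs, ys):
    """Pearson sample correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def mean_squared_corr(rho, n, reps=5_000, seed=1):
    """Monte Carlo estimate of E(r**2) for bivariate normal samples of size n."""
    rng = random.Random(seed)
    c = math.sqrt(1.0 - rho * rho)
    total = 0.0
    for _ in range(reps):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Y = rho * X + sqrt(1 - rho**2) * Z has correlation rho with X
        ys = [rho * x + c * rng.gauss(0.0, 1.0) for x in xs]
        total += sample_corr(xs, ys) ** 2
    return total / reps


rho, n = 0.5, 50  # illustrative values, not from the paper
mc = mean_squared_corr(rho, n)
eq6a_true_rho = rho**2 + (1 - rho**2) / (n - 2)   # Eq. 6a with true rho plugged in
taylor_eq8 = rho**2 + (1 - rho**2) ** 2 / n        # same expansion with Eq. 8 variance
print(f"Monte Carlo E(r^2)       = {mc:.4f}")
print(f"Eq. 6a (true rho)        = {eq6a_true_rho:.4f}")
print(f"Taylor with Eq. 8 Var(r) = {taylor_eq8:.4f}")
```

Running it for several values of `rho` and `n` shows which candidate tracks the simulation more closely; for $\rho = 0$ the exact value $\operatorname{E}(r^2) = 1/(N-1)$ is available as a reference point.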

Derive the Variance of the Squared Correlation Using Delta Method

Regarding the variance of the sample squared correlation $Var(\tilde{R}_{lk}^2)$, I can apply the delta-method:

\begin{align}Var(\tilde{R}_{lk}^2) = {({f'(R_{lk})})^2}Var(\tilde{R}_{lk}) \tag{Eq. 7}\end{align}

with the variance of the sample correlation coefficient $\tilde{R}_{lk}$ written in terms of the true correlation $R_{lk}$:

\begin{align}Var(\tilde{R}_{lk}) = \frac{(1 - {R}_{lk}^2)^2}{N} \tag{Eq. 8}\end{align}

and get

\begin{align}Var(\tilde{R}_{lk}^2) = {4R_{lk}^2} \frac{(1 - {R}_{lk}^2)^2}{N} \tag{Eq. 9}\end{align}

Ultimately, we could replace $R_{lk}^2$ in Eq. 9 with the expression in Eq. 6b and obtain a formula for the variance of the squared sample correlation in terms of the squared sample correlation ($\tilde{R}_{lk}^2$) instead of the true squared correlation ($R_{lk}^2$).
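Eq. 9 can be compared against simulation in the same spirit. The sketch below is an illustration only: the helper names and the values of `rho`, `n` and `reps` are assumptions, and bivariate normal pairs are drawn as $Y = \rho X + \sqrt{1-\rho^2}\,Z$. It estimates $Var(\tilde{R}_{lk}^2)$ over repeated samples and prints it next to the delta-method value from Eq. 9 evaluated at the true correlation.

```python
import math
import random


def sample_corr(xs, ys):
    """Pearson sample correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_squared_corrs(rho, n, reps, seed=1):
    """Return a list of r**2 values from `reps` bivariate normal samples of size n."""
    rng = random.Random(seed)
    c = math.sqrt(1.0 - rho * rho)
    out = []
    for _ in range(reps):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        ys = [rho * x + c * rng.gauss(0.0, 1.0) for x in xs]
        out.append(sample_corr(xs, ys) ** 2)
    return out


rho, n = 0.5, 100  # illustrative values, not from the paper
r2 = simulate_squared_corrs(rho, n, reps=4_000)
mean_r2 = sum(r2) / len(r2)
simulated = sum((v - mean_r2) ** 2 for v in r2) / (len(r2) - 1)
eq9 = 4 * rho**2 * (1 - rho**2) ** 2 / n  # Eq. 9 at the true correlation
print(f"simulated Var(r^2) = {simulated:.6f}")
print(f"Eq. 9 value        = {eq9:.6f}")
```

Because the delta method is a first-order approximation, agreement should improve as `n` grows, and it degrades near $\rho = 0$, where $f'(\rho) = 2\rho$ vanishes and Eq. 9 predicts zero variance.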

$\endgroup$
  • $\begingroup$ You lost me early on, because I do not understand how "two draws each" give you anything other than a sample correlation of $1,$ $-1,$ or undefined. Could you describe your random variables more clearly and explicitly? $\endgroup$ Commented May 11, 2024 at 0:18
  • $\begingroup$ Just edited my question with an explicit description of the two random variables. $\endgroup$ Commented May 11, 2024 at 0:35
  • $\begingroup$ 1. I don't see how the two binomial variates are correlated at all. 2. You have the expectation of a random variable (observed squared sample correlation) being a function of the random variable itself, which doesn't make sense. $\endgroup$ Commented May 11, 2024 at 0:38
  • $\begingroup$ I suppose that this corresponds to some adjustment, similar to what is done for the Pearson correlation: en.wikipedia.org/wiki/Pearson_correlation_coefficient at the section "Adjusted correlation coefficient". $\endgroup$ Commented May 11, 2024 at 0:42
  • $\begingroup$ The standard error of the sample correlation coefficient is approximately $\sqrt{(1-r^2)/N}$, so the variance is just $(1-r^2)/N$, not $(1-r^2)^2/N$ as you have in eq. 8. Note that the answer you linked to below as a source got -1 votes and starts out with "I do not know the answer but for me there is an error in the formula...", not encouraging! Look at the Wikipedia page en.wikipedia.org/wiki/Pearson_correlation_coefficient under "Standard Error" for more information. Note also that your equation 6b is not correct. $\endgroup$ Commented May 14, 2024 at 22:30
