I have 7 variables $A_i$, $i\in\{-3,-2,-1,0,1,2,3\}$. $A_{-1}$ and $A_1$ are identically distributed, as are $A_{-2}$ and $A_2$, and $A_{-3}$ and $A_{3}$. These variables cannot be assumed to be independent, and they are strictly positive. The distribution of the $A_i$ is unknown: it results from complex many-body interactions among thousands of atoms.

I have experimental estimates of the expected values and of the variance-covariance matrix of the $A_i$ variables, that is, I have estimates of $\operatorname{cov}(A_i,A_j) \ \forall\{i,j\}$.

Let $A_T=\sum_i A_i$ and $A_T^j=A_T-A_j$.

I want to compute estimates of the covariances between the normalized $A_i$ as a function of the estimates I already have. That is I want to compute

$$\operatorname{cov}\left(\frac{A_i}{A_T},\frac{A_j}{A_T}\right)$$

To do so I am trying to use the delta method (please have a look at this link for an example). However, I am not sure that the answer in that post is right.

In any case, mimicking the answer in the link above, I define $$g(A_i,A_j,A_T)=\left(g_1,g_2\right)=\left(\frac{A_i}{A_T},\frac{A_j}{A_T}\right)$$ and try to write $\nabla g$.

The first term of this gradient, for example, is given by $$\frac{\partial g_1}{\partial A_i}=\frac{1\times A_T-\frac{\partial A_T}{\partial A_i}\times A_i }{A_T^2}$$ To compute the value of $\frac{\partial A_T}{\partial A_i}$ I argue as follows:

$\sum_i\frac{\partial A_i}{\partial A_T}=\frac{\partial A_T}{\partial A_T}=1$, so by symmetry (even though the $A_i$ variables are neither independent nor exchangeable) $\frac{\partial A_i}{\partial A_T}=\frac{1}{7}$. Therefore $\frac{\partial A_T}{\partial A_i}=\frac{\partial A_i}{\partial A_i}+\sum_{j\neq i}\frac{\partial A_j}{\partial A_i}=1+\sum_{j\neq i}\frac{\partial A_j}{\partial A_T}\frac{\partial A_T}{\partial A_i}=1+\frac{6}{7}\frac{\partial A_T}{\partial A_i}$, which gives $\frac{\partial A_T}{\partial A_i}=7$ and eventually

$$\frac{\partial g_1}{\partial A_i}=\frac{A_T-7 A_i }{A_T^2}$$

In particular, $\frac{\partial A_i}{\partial A_j}=1$ according to the arguments above, which is weird.

All of this feels wrong. I am also not sure what meaning to give to derivatives with respect to random variables, which does not seem to be a problem in the post above.

Can anyone confirm or refute the validity of this derivation? Can you give me a method to compute the covariance of normalized variables, even approximately?


1 Answer


The answer given in the original post is correct. Your mistake is that you are not computing the gradient correctly:

To make the notation clearer, denote $g : \vec x=(x,y,z) \mapsto \left(\frac{x}{z},\frac{y}{z}\right) = (g_1(\vec x),g_2(\vec x))$.

With this notation, you can see that $$\frac{\partial g_1}{\partial x} = \frac 1 z,\quad \frac{\partial g_1}{\partial z} = \frac{-x}{z^2},\quad \frac{\partial g_2}{\partial y} = \frac 1 z,\quad \frac{\partial g_2}{\partial z} =\frac{-y}{z^2}$$

Therefore the Jacobian of $g$ at the point $\vec x = (x,y,z)$ is given by $$\nabla g(x,y,z) = \begin{pmatrix} \frac{1}{z} & 0 & -\frac{x}{z^2}\\ 0 & \frac{1}{z}& -\frac{y}{z^2} \end{pmatrix} $$

To apply the delta method, you just plug the (random) vector $\vec X =(A_i,A_j,A_T)$ into $\nabla g$, which gives $$\nabla g(A_i,A_j,A_T) = \begin{pmatrix} \frac{1}{A_T} & 0 & -\frac{A_i}{A_T^2}\\ 0 & \frac{1}{A_T}& -\frac{A_j}{A_T^2} \end{pmatrix} $$

Now that you know $\nabla g(A_i,A_j,A_T)$, you can approximate your covariance matrix of interest $\operatorname{Cov}[g(A_i,A_j,A_T)]$ as $$\operatorname{Cov}[g(A_i,A_j,A_T)]\approx \nabla g(A_i,A_j,A_T)\operatorname{Cov}[(A_i,A_j,A_T)] \nabla g(A_i,A_j,A_T)^T,$$ which is just linear algebra.
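As a concrete illustration, here is a minimal numpy sketch of this sandwich formula. The mean vector and the $3\times 3$ covariance matrix below are made-up placeholder values, not numbers from the question:

```python
import numpy as np

# Placeholder inputs (assumptions, not values from the question):
# empirical means of (A_i, A_j, A_T) and their 3x3 covariance matrix.
mu = np.array([2.0, 3.0, 20.0])          # estimates of E[A_i], E[A_j], E[A_T]
sigma = np.array([[0.40, 0.10, 0.60],    # estimate of Cov[(A_i, A_j, A_T)]
                  [0.10, 0.50, 0.70],
                  [0.60, 0.70, 2.00]])

def delta_cov(mu, sigma):
    """Delta-method covariance of g(x, y, z) = (x/z, y/z), evaluated at mu."""
    x, y, z = mu
    jac = np.array([[1.0 / z, 0.0,     -x / z**2],
                    [0.0,     1.0 / z, -y / z**2]])
    return jac @ sigma @ jac.T  # the "sandwich" formula

print(delta_cov(mu, sigma))  # 2x2 approximate covariance of (A_i/A_T, A_j/A_T)
```

Note that the Jacobian is evaluated at the estimated means, as discussed in the comments below.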

  • Thank you @StratosFair, so you're saying that the fact that $A_i$, $A_j$ and $A_T$ are not independent does not matter, right? Commented Mar 29, 2022 at 14:26
  • Yes, it doesn't matter. If they were independent then $\mathrm{Cov}[(A_i,A_j,A_T)]$ would be diagonal and $\mathrm{Cov}[g(A_i,A_j,A_T)]$ would be even easier to calculate, but it is not a problem if that is not the case. Commented Mar 29, 2022 at 14:29
  • Last question: the elements of the Jacobian are in fact the expected values of the random variables? Am I right? @StratosFair Commented Mar 30, 2022 at 17:14
  • Yes, the "$A_i$" should actually refer to the empirical mean of your observations of $A_i$. In fact, to properly apply the delta method, the r.v.'s $A_i,A_j,A_T$ should be converging to some $A_i^*,A_j^*,A_T^*$ that you can determine; these are the values you plug into $\nabla g$. If you don't know such $A_i^*,A_j^*,A_T^*$, you can still suppose that your empirical means are close enough and plug them in instead. Commented Mar 30, 2022 at 18:45
  • :O so for example, if I follow you, for $\frac{1}{A_T}$ I should use $\frac{1}{A_T^*}$ and not $\left(\frac{1}{A_T}\right)^*$? @StratosFair Commented Mar 31, 2022 at 9:14
