I have 7 variables $A_i$, $i\in\{-3,-2,-1,0,1,2,3\}$. $A_{-1}$ and $A_1$ are identically distributed, as are $A_{-2}$ and $A_2$, and $A_{-3}$ and $A_{3}$. These variables cannot be assumed to be independent, and they are strictly positive. The distribution of the $A_i$ is unknown: it results from complex many-body interactions among thousands of atoms.

I have experimental estimates of the expected values and of the variance-covariance matrix of the $A_i$ variables, that is, I have estimates of $\operatorname{cov}(A_i,A_j) \ \forall\{i,j\}$.

Let $A_T=\sum_i A_i$ and $A_T^j=A_T-A_j$.

I want to compute estimates of the covariances between the normalized $A_i$ as a function of the estimates I already have. That is I want to compute

$$\operatorname{cov}\left(\frac{A_i}{A_T},\frac{A_j}{A_T}\right)$$

To do so I am trying to use the delta method (please have a look at this link for an example). However, I am not sure that the answer in that post is right.

In any case, mimicking the answer in the link above, I define $$g(A_i,A_j,A_T)=\left(g_1,g_2\right)=\left(\frac{A_i}{A_T},\frac{A_j}{A_T}\right)$$ and try to write $\nabla g$.

The first term of this gradient, for example, is given by $$\frac{\partial g_1}{\partial A_i}=\frac{1\times A_T-\frac{\partial A_T}{\partial A_i}\times A_i }{A_T^2}$$ To compute the value of $\frac{\partial A_T}{\partial A_i}$ I argue as follows:

$\sum_i\frac{\partial A_i}{\partial A_T}=\frac{\partial A_T}{\partial A_T}=1$, so by symmetry (even though the $A_i$ variables are neither independent nor exchangeable) $\frac{\partial A_i}{\partial A_T}=\frac{1}{7}$. Therefore $\frac{\partial A_T}{\partial A_i}=\frac{\partial A_i}{\partial A_i}+\sum_{j\neq i}\frac{\partial A_j}{\partial A_i}=1+\sum_{j\neq i}\frac{\partial A_j}{\partial A_T}\frac{\partial A_T}{\partial A_i}=1+\frac{6}{7}\frac{\partial A_T}{\partial A_i}$, which gives $\frac{\partial A_T}{\partial A_i}=7$ and eventually

$$\frac{\partial g_1}{\partial A_i}=\frac{A_T-7 A_i }{A_T^2}$$

In particular, $\frac{\partial A_i}{\partial A_j}=1$ according to the arguments above, which is weird.

All of this feels wrong. I am also not sure what meaning to give to derivatives with respect to random variables, which does not seem to be a problem in the post above.

Can anyone confirm or refute the validity of this derivation? Can you give me a method to compute the covariance of normalized variables, even approximately?


1 Answer


The answer given in the original post is correct. Your mistake is that you are not computing the gradient correctly:

To make the notation clearer, denote $g : \vec x=(x,y,z) \mapsto \left(\frac{x}{z},\frac{y}{z}\right) = (g_1(\vec x),g_2(\vec x))$.

With this notation, you can see that $$\frac{\partial g_1}{\partial x} = \frac 1 z,\quad \frac{\partial g_1}{\partial z} = \frac{-x}{z^2},\quad \frac{\partial g_2}{\partial y} = \frac 1 z,\quad \frac{\partial g_2}{\partial z} =\frac{-y}{z^2}$$

Therefore the Jacobian of $g$ at the point $\vec x = (x,y,z)$ is given by $$\nabla g(x,y,z) = \begin{pmatrix} \frac{1}{z} & 0 & -\frac{x}{z^2}\\ 0 & \frac{1}{z}& -\frac{y}{z^2} \end{pmatrix} $$

To apply the delta method, you just plug the (random) vector $\vec X =(A_i,A_j,A_T)$ into $\nabla g$, which gives $$\nabla g(A_i,A_j,A_T) = \begin{pmatrix} \frac{1}{A_T} & 0 & -\frac{A_i}{A_T^2}\\ 0 & \frac{1}{A_T}& -\frac{A_j}{A_T^2} \end{pmatrix} $$

Now that you know $\nabla g(A_i,A_j,A_T)$, you can approximate your covariance matrix of interest $\operatorname{Cov}[g(A_i,A_j,A_T)]$ as $$\operatorname{Cov}[g(A_i,A_j,A_T)]\approx \nabla g(A_i,A_j,A_T)\operatorname{Cov}[(A_i,A_j,A_T)] \nabla g(A_i,A_j,A_T)^T,$$ which is just linear algebra.
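As a concrete illustration, here is a minimal numpy sketch of this sandwich formula. The mean vector and the $3\times 3$ covariance matrix below are made-up placeholder values, not numbers from the question:

```python
import numpy as np

# Placeholder inputs (assumptions, not values from the question):
# empirical means of (A_i, A_j, A_T) and their 3x3 covariance matrix.
mu = np.array([2.0, 3.0, 20.0])          # estimates of E[A_i], E[A_j], E[A_T]
sigma = np.array([[0.40, 0.10, 0.60],    # estimate of Cov[(A_i, A_j, A_T)]
                  [0.10, 0.50, 0.70],
                  [0.60, 0.70, 2.00]])

def delta_cov(mu, sigma):
    """Delta-method covariance of g(x, y, z) = (x/z, y/z), evaluated at mu."""
    x, y, z = mu
    jac = np.array([[1.0 / z, 0.0,     -x / z**2],
                    [0.0,     1.0 / z, -y / z**2]])
    return jac @ sigma @ jac.T  # the "sandwich" formula

print(delta_cov(mu, sigma))  # 2x2 approximate covariance of (A_i/A_T, A_j/A_T)
```

Note that the Jacobian is evaluated at the estimated means, as discussed in the comments below.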

  • Thank you @StratosFair, so you're saying that the fact that $A_i$, $A_j$ and $A_T$ are not independent does not matter, right? Commented Mar 29, 2022 at 14:26
  • Yes, it doesn't matter. If they were independent then $\mathrm{Cov}[(A_i,A_j,A_T)]$ would be diagonal and $\mathrm{Cov}[g(A_i,A_j,A_T)]$ would be even easier to calculate, but it is not a problem if that is not the case. Commented Mar 29, 2022 at 14:29
  • Last question: the elements of the Jacobian are in fact the expected values of the random variables? Am I right? @StratosFair Commented Mar 30, 2022 at 17:14
  • Yes, the "$A_i$" should actually refer to the empirical mean of your observations of $A_i$. In fact, to properly apply the delta method, the r.v.'s $A_i,A_j,A_T$ should be converging to some $A_i^*,A_j^*,A_T^*$ that you can determine; these are the values you plug into $\nabla g$. If you don't know such $A_i^*,A_j^*,A_T^*$, you can still suppose that your empirical means are close enough and plug them in instead. Commented Mar 30, 2022 at 18:45
  • :O so for example, if I follow you, for $\frac{1}{A_T}$ I should use $\frac{1}{A_T^*}$ and not $\left(\frac{1}{A_T}\right)^*$? @StratosFair Commented Mar 31, 2022 at 9:14
