$\begingroup$

Suppose that $\mathbf{x}$ is normally distributed, $\mathbf{x} \sim \mathcal{N}(\boldsymbol \mu, \boldsymbol \Sigma)$. Under the transformation $\mathbf{h} = \boldsymbol\Sigma^{-1} \mathbf{x}$, we recover the canonical parametrization (also called the "information form") $$\mathbf{h} \sim \mathcal{N}(\boldsymbol \Sigma^{-1} \boldsymbol \mu, \boldsymbol\Sigma^{-1})$$ Considering the partition of $\mathbf{x}$ $$ \begin{bmatrix} \mathbf{x_1} \\ \mathbf{x_2} \end{bmatrix} \sim \mathcal{N}\left( \begin{bmatrix}\boldsymbol \mu_1 \\ \boldsymbol \mu_2\end{bmatrix}, \begin{bmatrix} \boldsymbol \Sigma_{11} & \boldsymbol \Sigma_{12} \\ \boldsymbol \Sigma_{21} & \boldsymbol \Sigma_{22} \end{bmatrix} \right) $$ a standard result (see below for references) is that the $\mathbf{x_1}$ marginal of $\mathbf{h}$ has distribution $$ \mathcal{N}\left( \boldsymbol\eta_1 - \boldsymbol \Lambda_{12}\boldsymbol\Lambda_{22}^{-1}\boldsymbol \eta_2, \boldsymbol \Lambda_{11} - \boldsymbol \Lambda_{12} \boldsymbol \Lambda_{22}^{-1} \boldsymbol \Lambda_{21} \right) $$ where we have defined $$ \begin{bmatrix}\boldsymbol \eta_1 \\ \boldsymbol \eta_2\end{bmatrix} = \boldsymbol \Sigma^{-1}\begin{bmatrix} \boldsymbol \mu_1 \\ \boldsymbol \mu_2 \end{bmatrix} \hspace{3em} \begin{bmatrix} \boldsymbol \Lambda_{11} & \boldsymbol \Lambda_{12} \\ \boldsymbol \Lambda_{21} & \boldsymbol \Lambda_{22} \end{bmatrix} = \begin{bmatrix} \boldsymbol \Sigma_{11} & \boldsymbol \Sigma_{12} \\ \boldsymbol \Sigma_{21} & \boldsymbol \Sigma_{22} \end{bmatrix}^{-1} $$

It turns out that this is equivalent to starting with the marginal distribution $$ \mathbf{x_1} \sim \mathcal{N}\left(\boldsymbol \mu_1, \boldsymbol \Sigma_{11} \right) $$ and applying the same transformation, $\boldsymbol \Sigma_{11}^{-1}\mathbf{x_1} \sim \mathcal{N}\left(\boldsymbol \Sigma_{11}^{-1} \boldsymbol \mu_1, \boldsymbol \Sigma_{11}^{-1} \right) $. In particular, we have \begin{gather*} \boldsymbol \Sigma_{11}^{-1} \boldsymbol \mu_1 = \boldsymbol\eta_1 - \boldsymbol \Lambda_{12}\boldsymbol\Lambda_{22}^{-1}\boldsymbol \eta_2 \\ \boldsymbol \Sigma_{11}^{-1} = \boldsymbol \Lambda_{11} - \boldsymbol \Lambda_{12} \boldsymbol \Lambda_{22}^{-1} \boldsymbol \Lambda_{21} \end{gather*}
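As a sanity check on the two identities above, here is a minimal sketch in exact arithmetic. It assumes the simplest possible partition, a 2-D Gaussian where $\mathbf{x_1}$ and $\mathbf{x_2}$ are both scalars; the function name `canonical_marginal` and the numeric values are made up for illustration.

```python
from fractions import Fraction

def canonical_marginal(mu, sigma):
    """Canonical parameters of the x_1 marginal, computed from (eta, Lambda).

    Assumes scalar blocks: mu = (mu1, mu2), sigma = ((s11, s12), (s21, s22)).
    """
    (s11, s12), (s21, s22) = sigma
    det = s11 * s22 - s12 * s21
    # Precision matrix Lambda = Sigma^{-1}, via the 2x2 inverse formula.
    l11, l12, l21, l22 = s22 / det, -s12 / det, -s21 / det, s11 / det
    # Canonical mean eta = Lambda mu.
    eta1 = l11 * mu[0] + l12 * mu[1]
    eta2 = l21 * mu[0] + l22 * mu[1]
    # Marginalize x_2 out in canonical form (Schur complement of Lambda_22).
    return eta1 - l12 / l22 * eta2, l11 - l12 / l22 * l21

# Illustrative values only.
mu = (Fraction(1), Fraction(-2))
sigma = ((Fraction(2), Fraction(1)), (Fraction(1), Fraction(3)))
eta1_marg, lam11_marg = canonical_marginal(mu, sigma)

# Other route: marginalize in moment form first, then convert.
assert eta1_marg == mu[0] / sigma[0][0]   # Sigma_11^{-1} mu_1
assert lam11_marg == 1 / sigma[0][0]      # Sigma_11^{-1}
```

Using `Fraction` keeps the comparison exact, so the two routes can be compared with `==` rather than a floating-point tolerance.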

My question:

Are there any other classes of transformations that enjoy the same equivalence? In particular, the commutativity of transformation and marginalization operations here doesn't seem trivial.

Off the top of my head, another class that works is diagonal matrices, $\mathbf{h} = \text{diag}(\alpha_1, \ldots, \alpha_n)\mathbf{x}$.
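The diagonal case can be checked directly on the moments. The sketch below assumes $\mathbf{x_1}$ is the first `k` coordinates of $\mathbf{x}$; the helper names and the numbers are hypothetical.

```python
def scale_then_marginalize(mu, sigma, alpha, k):
    """Moments of the first k coordinates of h = diag(alpha) x."""
    n = len(mu)
    mean_h = [alpha[i] * mu[i] for i in range(n)]
    cov_h = [[alpha[i] * sigma[i][j] * alpha[j] for j in range(n)]
             for i in range(n)]
    # Marginalizing a Gaussian in moment form just selects the sub-blocks.
    return mean_h[:k], [row[:k] for row in cov_h[:k]]

def marginalize_then_scale(mu, sigma, alpha, k):
    """Moments of diag(alpha[:k]) x_1, with x_1 the first k coordinates of x."""
    mean = [alpha[i] * mu[i] for i in range(k)]
    cov = [[alpha[i] * sigma[i][j] * alpha[j] for j in range(k)]
           for i in range(k)]
    return mean, cov

# Illustrative values only.
mu = [1, -2, 3]
sigma = [[4, 1, 0], [1, 3, 1], [0, 1, 2]]
alpha = [2, -1, 5]

assert scale_then_marginalize(mu, sigma, alpha, 2) == \
    marginalize_then_scale(mu, sigma, alpha, 2)
```

Here the two orders agree because a diagonal matrix never mixes $\mathbf{x_2}$ into the first block.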

References:

[1]: https://www.seas.upenn.edu/~cis520/papers/Bishop_2.3.pdf (pg. 89)

[2]: https://people.eecs.berkeley.edu/~jordan/courses/260-spring10/other-readings/chapter13.pdf (pg. 6)

$\endgroup$
