Suppose that $\mathbf{x}$ is normally distributed, $\mathbf{x} \sim \mathcal{N}(\boldsymbol \mu, \boldsymbol \Sigma)$. Under the transformation $\mathbf{h} = \boldsymbol\Sigma^{-1} \mathbf{x}$, we recover the canonical parametrization (also called the "information form")
$$\mathbf{h} \sim \mathcal{N}(\boldsymbol \Sigma^{-1} \boldsymbol \mu, \boldsymbol\Sigma^{-1})$$
Considering the partition of $\mathbf{x}$
$$ \begin{bmatrix} \mathbf{x_1} \\ \mathbf{x_2} \end{bmatrix} \sim \mathcal{N}\left( \begin{bmatrix}\boldsymbol \mu_1 \\ \boldsymbol \mu_2\end{bmatrix}, \begin{bmatrix} \boldsymbol \Sigma_{11} & \boldsymbol \Sigma_{12} \\ \boldsymbol \Sigma_{21} & \boldsymbol \Sigma_{22} \end{bmatrix} \right) $$
a standard result (see the references below) is that the $\mathbf{x_1}$ marginal of $\mathbf{h}$ has distribution
$$ \mathcal{N}\left( \boldsymbol\eta_1 - \boldsymbol \Lambda_{12}\boldsymbol\Lambda_{22}^{-1}\boldsymbol \eta_2, \; \boldsymbol \Lambda_{11} - \boldsymbol \Lambda_{12} \boldsymbol \Lambda_{22}^{-1} \boldsymbol \Lambda_{21} \right) $$
where we have defined
$$ \begin{bmatrix}\boldsymbol \eta_1 \\ \boldsymbol \eta_2\end{bmatrix} = \boldsymbol \Sigma^{-1}\begin{bmatrix} \boldsymbol \mu_1 \\ \boldsymbol \mu_2 \end{bmatrix} \qquad \begin{bmatrix} \boldsymbol \Lambda_{11} & \boldsymbol \Lambda_{12} \\ \boldsymbol \Lambda_{21} & \boldsymbol \Lambda_{22} \end{bmatrix} = \begin{bmatrix} \boldsymbol \Sigma_{11} & \boldsymbol \Sigma_{12} \\ \boldsymbol \Sigma_{21} & \boldsymbol \Sigma_{22} \end{bmatrix}^{-1} $$
It turns out that this is equivalent to starting with the marginal distribution
$$ \mathbf{x_1} \sim \mathcal{N}\left(\boldsymbol \mu_1, \boldsymbol \Sigma_{11} \right) $$
and applying the same transformation, $\boldsymbol \Sigma_{11}^{-1}\mathbf{x_1} \sim \mathcal{N}\left(\boldsymbol \Sigma_{11}^{-1} \boldsymbol \mu_1, \boldsymbol \Sigma_{11}^{-1} \right)$. In particular, we have
\begin{gather*} \boldsymbol \Sigma_{11}^{-1} \boldsymbol \mu_1 = \boldsymbol\eta_1 - \boldsymbol \Lambda_{12}\boldsymbol\Lambda_{22}^{-1}\boldsymbol \eta_2 \\ \boldsymbol \Sigma_{11}^{-1} = \boldsymbol \Lambda_{11} - \boldsymbol \Lambda_{12} \boldsymbol \Lambda_{22}^{-1} \boldsymbol \Lambda_{21} \end{gather*}
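For anyone who wants to sanity-check the two identities numerically, here is a minimal NumPy sketch. The block sizes, the random seed, and all variable names are arbitrary choices made for illustration. It compares the Schur-complement route (marginalizing in information form) against inverting $\boldsymbol\Sigma_{11}$ directly.

```python
import numpy as np

rng = np.random.default_rng(0)

n1, n2 = 2, 3  # block sizes, chosen arbitrarily for illustration
n = n1 + n2

# Random symmetric positive-definite covariance and a random mean.
A = rng.standard_normal((n, n))
Sigma = A @ A.T + n * np.eye(n)
mu = rng.standard_normal(n)

# Information-form parameters of the joint: Lam = Sigma^{-1}, eta = Sigma^{-1} mu.
Lam = np.linalg.inv(Sigma)
eta = Lam @ mu

# Partition into blocks matching (x_1, x_2).
L11, L12 = Lam[:n1, :n1], Lam[:n1, n1:]
L21, L22 = Lam[n1:, :n1], Lam[n1:, n1:]
eta1, eta2 = eta[:n1], eta[n1:]

# Route 1: marginalize in information form (Schur complement of Lam_22).
eta_marg = eta1 - L12 @ np.linalg.solve(L22, eta2)
Lam_marg = L11 - L12 @ np.linalg.solve(L22, L21)

# Route 2: marginalize in moment form first, then convert.
Lam_direct = np.linalg.inv(Sigma[:n1, :n1])
eta_direct = Lam_direct @ mu[:n1]

precision_matches = np.allclose(Lam_marg, Lam_direct)
vector_matches = np.allclose(eta_marg, eta_direct)
print(precision_matches, vector_matches)
```

Both comparisons should agree up to floating-point tolerance. This is only a numerical check on one random instance, not a proof.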
My question:
Are there any other classes of transformations that enjoy the same equivalence? In particular, the commutativity of transformation and marginalization operations here doesn't seem trivial.
Off the top of my head, another example that works is diagonal matrices $\mathbf{h} = \operatorname{diag}(\alpha_1, \dots, \alpha_n)\mathbf{x}$.
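A quick sketch of why the diagonal case goes through, assuming the transformation is read as a linear map of the random vector and writing the diagonal matrix as $\operatorname{diag}(\mathbf{D}_1, \mathbf{D}_2)$ conformally with the partition of $\mathbf{x}$ (the symbols $\mathbf{D}_1, \mathbf{D}_2$ are notation introduced just for this sketch):
$$ \begin{bmatrix} \mathbf{D}_1 \mathbf{x_1} \\ \mathbf{D}_2 \mathbf{x_2} \end{bmatrix} \sim \mathcal{N}\left( \begin{bmatrix} \mathbf{D}_1 \boldsymbol \mu_1 \\ \mathbf{D}_2 \boldsymbol \mu_2 \end{bmatrix}, \begin{bmatrix} \mathbf{D}_1 \boldsymbol \Sigma_{11} \mathbf{D}_1 & \mathbf{D}_1 \boldsymbol \Sigma_{12} \mathbf{D}_2 \\ \mathbf{D}_2 \boldsymbol \Sigma_{21} \mathbf{D}_1 & \mathbf{D}_2 \boldsymbol \Sigma_{22} \mathbf{D}_2 \end{bmatrix} \right) $$
Reading off the first block gives $\mathcal{N}\left(\mathbf{D}_1 \boldsymbol \mu_1, \mathbf{D}_1 \boldsymbol \Sigma_{11} \mathbf{D}_1\right)$, which is exactly what one gets by applying $\mathbf{D}_1$ directly to the marginal $\mathbf{x_1} \sim \mathcal{N}(\boldsymbol \mu_1, \boldsymbol \Sigma_{11})$.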
References:
[1]: https://www.seas.upenn.edu/~cis520/papers/Bishop_2.3.pdf (pg. 89)
[2]: https://people.eecs.berkeley.edu/~jordan/courses/260-spring10/other-readings/chapter13.pdf (pg. 6)