Using the Cholesky Decomposition of $ \Sigma$$ _i$.

Instead of computing $ \Sigma$$ _i$ directly, we store the Cholesky decomposition of $ \Sigma$$ _i$ computed using the QR decomposition. Consider a matrix of column vectors $ {\bf X}=[{\bf x}_1,{\bf x}_2, \ldots,{\bf x}_N]$. These columns correspond to the vectors $ ({\bf z}_k -$   $ \mu$$ _{i})$ in Table 13.1. A covariance estimate is obtained by forming the matrix

$\displaystyle {\bf\Sigma}=\frac{{\bf X} {\bf W} {\bf X}^\prime}{\displaystyle \sum_k w_k}
= \frac{1}{S}\tilde{\bf X} \tilde{\bf X}^\prime,$

where $ {\bf W}$ is the diagonal matrix formed from data weights $ w_k$, $ S=\sum_k w_k$, and $ \tilde{\bf X} = {\bf X} {\bf W}^{1/2}$. It may be verified that this is the same as computing the elements of $ {\bf\Sigma}$ as follows:

$\displaystyle {\bf\Sigma}_{ij}=\frac{1}{S} \sum_{k=1}^N \; x_{ki} \; x_{kj} \; w_k.
$

But note that if you take the QR decomposition $ \tilde{\bf X}^\prime = {\bf Q R}$, that

$\displaystyle {\bf\Sigma} = \frac{1}{S}\tilde{\bf X} \tilde{\bf X}^\prime =
\f...
...R}^\prime {\bf Q}^\prime {\bf Q} {\bf R} =
\frac{1}{S}{\bf R}^\prime {\bf R}.
$

Thus, we see that the QR decomposition of $ \tilde{\bf X}^\prime$ is related to the Cholesky factor of $ {\bf\Sigma}$. There is no reason to ever compute $ {\bf\Sigma}$ explicitly. Computing $ {\bf\Sigma}$ requires twice the number of bits of precision as $ {\bf R}$. A quadratic form can be computed using $ {\bf R}$ as follows:

$\displaystyle {\bf z}^\prime {\bf\Sigma}^{-1} {\bf z} = \Vert{\bf y}\Vert^2
$

where

$\displaystyle {\bf y} = {\bf z}^\prime {\bf R}^{-1}.
$

This convention is used in the software ( software/gmix_step.m). More precisely, the matrix tmpidx stores $ {\bf X}^\prime$ where the rows of $ {\bf X}^\prime$ are $ ({\bf z}_k -$   $ \mu$$ _{i})$. The QR decomposition of tmpidx is $ {\bf R}$, which is stored as a parameter. The subroutine for computing $ \log {\cal N}({\bf z}_k,$$ \mu$$ _i,$$ \Sigma$$ _i)$ is lqr_eval.m. This routine inputs $ {\bf z}_1,\ldots,{\bf z}_N$, $ \mu$$ _i$, and $ {\bf R}_i$. The mixture (13.1) is implemented by subroutine lqr_evp.m.

Baggenstoss 2017-05-19