Algorithm Summary

We now briefly summarize the MR-HMM algorithm. The MR-HMM algorithm has three potential goals: (a) calculate $ L({\bf x};\Lambda)$, (b) determine the most likely path and its likelihood value, and (c) update the MR-HMM parameters $ \Lambda$ (re-estimation). We start with a set of initial MR-HMM parameters (initialization is covered in Section 14.2.8). To re-estimate the discrete parameters of the proxy HMM, we first run the standard Baum-Welch algorithm [57] on the proxy HMM to obtain the expanded proxy STM and prior state probabilities, then ``compress'' them to obtain $ {\bf A}$ and $ \pi$. Compressing means reducing the estimated proxy STM, which is $ N_p\times N_p$, to size $ N\times N$ by summing up all the transitions from the end of any partition to the start of another partition according to state. The feature PDF estimates $ p({\bf z}_{k_s,m_s}\vert m_s)$ used in (14.4) can also be re-estimated in a straightforward way using the gamma probabilities [65], analogous to the standard Baum-Welch algorithm [57]. The MR-HMM algorithm proceeds as follows:
  1. Compute the segment likelihood values (see Section 14.2.5),
  2. Use these to compute the proxy HMM segment likelihood values on a base-segment basis (14.3),
  3. Create the proxy HMM STM and initial probabilities from the MR-HMM parameters in the manner described in Section 14.2.3,
  4. Compute the forward procedure for the proxy HMM as described for the traditional HMM (eqs. (12-20) of [57]), producing the forward probabilities $ \alpha_{t,i}$,
  5. From $ \alpha_{t,i}$, calculate $ L({\bf x};\Lambda)$ (eq. (20) of [57]). Or, if only classifying or segmenting data, compute the best trellis path $ {\bf q}^*$ using the Viterbi algorithm (Section III.B of [57]). In place of $ L({\bf x};\Lambda)$, we use the best path likelihood

    $\displaystyle L({\bf x}\vert {\bf q}^*) = \log p({\bf x}\vert{\bf q}^*) + \log p({\bf q}^*)$

    using (14.2) and equation (14) in [57] to compute $ p({\bf q}^*)$.
  6. If only $ L({\bf x};\Lambda)$ or the best trellis path and likelihood is desired, exit here. If MR-HMM parameter re-estimation or a posteriori state probabilities are desired, continue:
  7. Compute the backward probabilities $ \beta_{t,i}$ (eqs. (23-25) of [57]),
  8. Use $ \alpha_{t,i}$, $ \beta_{t,i}$, and the proxy STM to compute the a posteriori state probabilities $ \gamma_{t,i}$ (see eq. (27) in [57]),
  9. Compute the a posteriori probability of each sub-class (i.e., bottom of Figure 14.1) by summing $ \gamma_{t,i}$ across all the partitions within the sub-class,
  10. Re-estimate the proxy STM and initial probabilities (eq. (40) in [57]),
  11. Determine the MR-HMM sub-class transition matrix $ {\bf A}_{i,j}$, initial probabilities $ \pi_i$, and segment size probabilities $ \rho_{m,i}$ by averaging the proxy initial probabilities and STM in the obvious way,
  12. Update the segment PDF Gaussian mixtures. This process differs somewhat from the standard HMM because the proxy HMM has an inflated number of states, most of which are just wait-states. The only proxy states to pay attention to are the first wait-state of each partition (see Figure 14.1). A non-zero $ \gamma_{t,i}$ in these positions indicates the probability of a given sub-class, segment size, and time delay. The segment PDFs are updated as in eqs. (52)-(54) of [57], with this subtlety in mind.
Note that considerable (an order of magnitude or more) memory and computational savings can be had by exploiting the highly sparse and structured proxy HMM state transition matrix in the above steps. This is implemented in the CMEX routine software/alphabetacompress.c.
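Both the sparsity just noted and the ``compression'' described at the start of this section can be illustrated with SciPy's sparse matrices. This is only a sketch, not the CMEX implementation: `owner` is an assumed bookkeeping array mapping each proxy state to its MR-HMM sub-class, and `is_end`/`is_start` are assumed boolean masks marking the last and first wait-state of each partition (the actual layout is determined by the partition structure of Figure 14.1):

```python
import numpy as np
from scipy.sparse import coo_matrix, csr_matrix

def compress_stm(A_proxy, owner, is_end, is_start, N):
    """Compress the Np x Np proxy STM to the N x N MR-HMM STM: sum the
    transition mass from partition-end states to partition-start states,
    grouped by the sub-class owning each proxy state, then renormalize."""
    A = np.zeros((N, N))
    Ap = coo_matrix(A_proxy)                # iterate only nonzero entries
    for i, j, v in zip(Ap.row, Ap.col, Ap.data):
        if is_end[i] and is_start[j]:
            A[owner[i], owner[j]] += v
    rows = A.sum(axis=1, keepdims=True)
    return A / np.where(rows > 0.0, rows, 1.0)

def forward_step_sparse(alpha, A_sparse, b):
    # one forward induction step as a sparse mat-vec:
    # O(nnz) work and storage instead of O(Np^2)
    return A_sparse.T.dot(alpha) * b
```

Since each wait-state chain contributes a single deterministic transition per state, the proxy STM has only O(Np) nonzeros, which is where the order-of-magnitude savings come from.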

Baggenstoss 2017-05-19