The standard HMM

Following the notation of Rabiner [65], there are $T$ observation times. At each time $1\leq t \leq T$, there is a discrete state variable $q_t$ that takes one of $N$ values, $q_t\in\{S_1,S_2,\ldots, S_N\}$. Under the Markov assumption, the probability distribution of $q_{t+1}$ depends only on the value of $q_t$. This dependence is described compactly by the state transition probability matrix $A$, whose elements $a_{ij}$ give the probability that $q_{t+1}$ equals $S_j$ given that $q_{t}$ equals $S_i$. The initial state probabilities are denoted $\pi_i$, the probability that $q_1$ equals $S_i$.
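A minimal sketch of how a state sequence arises from $\pi_i$ and $a_{ij}$, using only the Python standard library. The function name `sample_state_path`, the two-state example values, and the 0-based indexing (index `i` stands for $S_{i+1}$) are illustrative assumptions, not part of the notation above.

```python
import random

def sample_state_path(pi, A, T, rng=None):
    """Draw q_1..q_T as 0-based state indices (index i stands for S_{i+1})."""
    rng = rng or random.Random()
    states = list(range(len(pi)))
    # q_1 is drawn from the initial state probabilities pi.
    path = [rng.choices(states, weights=pi, k=1)[0]]
    # Each later state depends only on its predecessor (Markov assumption):
    # row A[i] holds the probabilities a_ij of moving from state i to state j.
    for _ in range(T - 1):
        path.append(rng.choices(states, weights=A[path[-1]], k=1)[0])
    return path

# Hypothetical two-state example; each row of A sums to one.
pi = [0.6, 0.4]
A = [[0.9, 0.1],
     [0.2, 0.8]]
path = sample_state_path(pi, A, T=10, rng=random.Random(0))
```

Passing an explicit `random.Random` instance keeps the draw reproducible without touching the global random state.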

It is called a hidden Markov model because the states $q_t$ are hidden from view; we cannot observe them directly. We can, however, observe the random data $O_t$, which are generated according to a PDF that depends on the state at time $t$. We denote the PDF of $O_t$ under state $S_j$ as $b_{j}(O_t)$.
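As a worked illustration, and assuming (as is standard for an HMM, though not stated explicitly above) that each $O_t$ is conditionally independent of everything else given $q_t$, the joint density of the observations and one particular state sequence $q_t = S_{i_t}$ factors as

$\displaystyle p(O_1,\ldots,O_T,\; q_1=S_{i_1},\ldots,q_T=S_{i_T}) = \pi_{i_1}\, b_{i_1}(O_1) \prod_{t=2}^{T} a_{i_{t-1} i_t}\, b_{i_t}(O_t)$

The index sequence $i_1,\ldots,i_T$ is introduced here only for this illustration. Because the states are hidden, the density of the observations alone is obtained by summing this expression over all $N^T$ possible state sequences.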

The complete set of model parameters that defines the HMM is

$\displaystyle \Lambda = \{\pi_j, a_{ij}, b_j \}$
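
As a reading aid, and assuming the usual conventions for probabilities and densities (these constraints are standard but are not spelled out in the text above), the parameters satisfy

$\displaystyle \sum_{i=1}^{N}\pi_i = 1, \qquad \sum_{j=1}^{N} a_{ij} = 1 \;\; (1\leq i\leq N), \qquad \int b_j(O)\,dO = 1 \;\; (1\leq j\leq N)$

with all $\pi_i$, $a_{ij}$, and $b_j(O)$ nonnegative. In other words, each row of $A$ is itself a probability distribution over the next state.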

The Baum-Welch algorithm calculates a new estimate of $\Lambda$ given an observation sequence ${\bf O}=O_1 O_2\cdots O_T$ and a previous estimate of $\Lambda$. The algorithm has two parts: the forward/backward procedure and the reestimation of the parameters.
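
To make the forward half of the forward/backward procedure concrete, the following is a minimal sketch of a scaled forward recursion that returns $\log p({\bf O}\vert\Lambda)$. It is not the implementation described in this document: the function names, the univariate Gaussian choice for $b_j$, the example values, and the per-step rescaling are all illustrative assumptions. The backward pass and the reestimation step are omitted.

```python
import math

def gaussian_pdf(mean, std):
    """Return b_j(.) as a univariate Gaussian density (an illustrative choice)."""
    def pdf(o):
        z = (o - mean) / std
        return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))
    return pdf

def forward_log_likelihood(pi, A, b, O):
    """Scaled forward recursion; returns log p(O | Lambda)."""
    N = len(pi)
    # Initialization: alpha_1(i) = pi_i * b_i(O_1)
    alpha = [pi[i] * b[i](O[0]) for i in range(N)]
    log_like = 0.0
    for t in range(len(O)):
        if t > 0:
            # Induction: alpha_t(j) = [sum_i alpha_{t-1}(i) * a_ij] * b_j(O_t)
            alpha = [sum(alpha[i] * A[i][j] for i in range(N)) * b[j](O[t])
                     for j in range(N)]
        # Rescale so alpha sums to one (avoids underflow for long sequences);
        # the product of the scale factors c equals p(O | Lambda).
        c = sum(alpha)
        alpha = [a / c for a in alpha]
        log_like += math.log(c)
    return log_like

# Hypothetical two-state model and a short observation sequence.
pi = [0.6, 0.4]
A = [[0.9, 0.1],
     [0.2, 0.8]]
b = [gaussian_pdf(0.0, 1.0), gaussian_pdf(3.0, 1.0)]
O = [0.1, 2.9, 3.2]
ll = forward_log_likelihood(pi, A, b, O)
```

The recursion costs on the order of $N^2 T$ operations, compared with the $N^T$ terms of a direct sum over all state sequences.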


