Experimental Procedure

We evaluated both likelihood function types on each data set: (a) a straight HMM on the un-augmented features, and (b) the DAF-HMM corrected by $K_{\mbox{\tiny $T$}}$. For each data set and likelihood function type, we measured the mean log-likelihood and the classification error rate. Let

$\displaystyle \mu_m=\frac{1}{N_m} \sum_{k=1}^{N_m} \frac{\log L_m({\bf X}_k)}{T_k},$

where $L_m({\bf X})$ is a likelihood function for class $m$, $N_m$ is the number of testing samples for class $m$, and $T_k$ is the length of the feature stream for sample $k$. We evaluated each likelihood function only on data from its own class. We assume that the $N_m$ testing samples have been separated from the training data used to train $L_m({\bf X})$. To separate the data, we trained on half of the available samples, then determined $\mu_m$ on the other half. We then switched the halves and averaged the results. We also evaluated the classification error rate in percent for each likelihood function type, using the same data separation.
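The two-half procedure for $\mu_m$ can be sketched in Python as follows. This is only an illustration under stated assumptions, not the code used in the experiments: the names `mean_normalized_loglik`, `two_fold_mu` and `train_gaussian` are hypothetical, the split is taken in sample order, the two half-results are averaged with equal weight, and a single 1-D Gaussian over scalar frames stands in for HMM or DAF-HMM training.

```python
import math
from typing import Callable, List, Sequence

Sample = Sequence[float]                 # one feature stream of T_k scalar frames
LogLik = Callable[[Sample], float]       # returns log L_m(X) for a whole stream


def mean_normalized_loglik(log_lik: LogLik, samples: List[Sample]) -> float:
    """mu_m: average over the samples of log L_m(X_k) / T_k."""
    return sum(log_lik(x) / len(x) for x in samples) / len(samples)


def two_fold_mu(train: Callable[[List[Sample]], LogLik],
                samples: List[Sample]) -> float:
    """Train on one half, score the other half, swap, and average.

    Assumption: the halves are formed in sample order and the two
    results are averaged with equal weight.
    """
    half = len(samples) // 2
    first, second = samples[:half], samples[half:]
    mu_a = mean_normalized_loglik(train(first), second)
    mu_b = mean_normalized_loglik(train(second), first)
    return 0.5 * (mu_a + mu_b)


def train_gaussian(samples: List[Sample]) -> LogLik:
    """Hypothetical stand-in for HMM training: one Gaussian fit to all frames."""
    frames = [v for x in samples for v in x]
    mean = sum(frames) / len(frames)
    var = max(sum((v - mean) ** 2 for v in frames) / len(frames), 1e-6)

    def log_lik(x: Sample) -> float:
        # Frames are treated as independent, so the log-likelihood is a sum.
        return sum(-0.5 * (math.log(2 * math.pi * var) + (v - mean) ** 2 / var)
                   for v in x)

    return log_lik


if __name__ == "__main__":
    # Made-up data for one class: four streams of different lengths.
    class_m = [[0.1, 0.2, 0.0], [0.3, -0.1], [0.0, 0.1, 0.2, 0.1], [-0.2, 0.0]]
    print(two_fold_mu(train_gaussian, class_m))
```

Dividing by `len(x)` before averaging is what makes streams of different lengths $T_k$ comparable. Any trainer that returns a function computing $\log L_m({\bf X})$ could be passed in place of `train_gaussian`.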