Figure 11.5 (left) shows the theoretical AR model log-likelihood on the X-axis and the theoretical MFCC log-likelihood on the Y-axis for 100 samples each of MFCC data (circles) and AR data (dots). For perfect classification, the points from each class should lie on the correct side of the X=Y line; a few errors are visible. The optimal classification error probability, estimated from 80,000 test samples, was 1.68%.
Figure 11.5 (right) shows the same experiment repeated with the projected PDFs based on the AR and MFCC features. The two plots are difficult to distinguish by eye, so a more quantitative comparison requires measuring the error probability over repeated classification trials.
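The decision rule behind the X=Y line can be sketched as follows. This is a hypothetical stand-in using synthetic log-likelihood values, not the actual AR/MFCC models; the means and variances below are invented purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the experiment: log-likelihoods of 80,000 test
# samples from one class, evaluated under both class models. In the real
# experiment these values come from the AR and MFCC models.
n = 80_000
ll_correct = rng.normal(2.0, 1.0, size=n)  # log-likelihood under true class
ll_wrong = rng.normal(0.0, 1.0, size=n)    # log-likelihood under other class

# Minimum-error decision for equal priors: choose the class with the larger
# log-likelihood, i.e. test which side of the X=Y line the point falls on.
err = np.mean(ll_correct < ll_wrong)
print(f"estimated error probability: {100 * err:.2f}%")
```

Plotting one class's log-likelihood pair per sample, as in Figure 11.5, makes this decision boundary visible as the diagonal line.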
Next, we re-ran the experiment over a range of training-sample sizes and measured classification performance. We compared PDF projection with (a) the Neyman-Pearson (optimal) classifier, (b) additive combination of the AR and MFCC feature log-likelihood functions, sometimes called "stacking", and (c) feature concatenation, in which the union of the AR and MFCC features was formed. The same feature-PDF estimation approach was used as for the feature density in PDF projection. We ran the experiments using both MFCC and MFCC-ML features. The results are shown in Figure 11.6, which plots the classification error probability in percent: the left graph uses MFCC features and the right graph MFCC-ML features. After the optimal Neyman-Pearson classifier, PDF projection was best overall, with MFCC-ML slightly better than MFCC. With MFCC features, feature concatenation produced about 35% more errors than PDF projection; with MFCC-ML, that ratio rose to about 50%. Likelihood stacking did much worse than feature concatenation, indicating that feature concatenation exploited the statistical dependence between the ML and AR features.
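The gap between stacking and concatenation can be illustrated with a minimal synthetic sketch. This is not the AR/MFCC experiment itself: the two "feature sets" here are invented scalar Gaussian features, correlated within each class, and the class means differ only in the first feature. Stacking adds the marginal log-likelihood ratios and so ignores the dependence; concatenation uses the joint density and can exploit it.

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

rng = np.random.default_rng(1)

# Hypothetical two-class problem with two correlated scalar "feature sets".
# The class means differ only in the first feature; the second feature helps
# only through its within-class correlation with the first.
n = 20_000
cov = np.array([[1.0, 0.8], [0.8, 1.0]])       # within-class dependence
m0, m1 = np.array([0.0, 0.0]), np.array([1.0, 0.0])
X0 = rng.multivariate_normal(m0, cov, size=n)  # class-0 test samples
X1 = rng.multivariate_normal(m1, cov, size=n)  # class-1 test samples

def stacking_llr(X):
    """Likelihood stacking: add per-feature log-likelihood ratios, using the
    true marginals but implicitly treating the features as independent."""
    return sum(norm.logpdf(X[:, j], loc=m1[j]) - norm.logpdf(X[:, j], loc=m0[j])
               for j in range(2))

def concat_llr(X):
    """Feature concatenation: log-likelihood ratio of the joint densities,
    which captures the dependence between the features."""
    return (multivariate_normal.logpdf(X, m1, cov)
            - multivariate_normal.logpdf(X, m0, cov))

# Average error over both classes (equal priors, threshold at zero).
err_stack = 0.5 * (np.mean(stacking_llr(X0) > 0) + np.mean(stacking_llr(X1) < 0))
err_concat = 0.5 * (np.mean(concat_llr(X0) > 0) + np.mean(concat_llr(X1) < 0))
print(f"stacking:      {100 * err_stack:.1f}% error")
print(f"concatenation: {100 * err_concat:.1f}% error")
```

In this construction the second feature's marginal is identical across classes, so stacking discards it entirely, while the joint model uses its correlation with the first feature to reduce the error; this mirrors, in a toy setting, why concatenation outperformed stacking in Figure 11.6.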