Manual Labeling

First, we will use the manual labeling tool. Run the script with the selections shown below and look at the manual labels that are pre-loaded, or change iseed and try labeling a new data set. Labeling is done by clicking the start and end of each segment, either in the time-series display or in the spectrogram display. Typing 'h' in the labeling tool gives a list of the short-cut keys. The prompt asks for the signal type of each segment, and you can cycle through the segment signal types using 'j' and 'k'. When finished, type 'q' and the labels will be saved to a file; typing 'x' instead exits without changing the labeling. When running again, use manual_label=1 to load the labels from the file.

Following what was presented in Section 14.4.2, we perform manual labeling as follows:

   subclassnames={'noise'  'Pulse1'  'noise'  'Pulse2'};
   NFFT=64;            % FFT size for spectrogram display
   load Idx.mat Idx    % load previously saved manual labels
   save Idx.mat Idx    % save the manual labels for later runs
Note that if the flag manual_label is set to 2, the program will attempt to load pre-existing manual labels from a file, so you don't have to re-label each time you run the program. Obviously, this can only work if the same input data is used as was used for labeling. In software/mrhmm_test1.m, we ensure this by setting the random seeds to a fixed value before running software/mrhmm_testdata.m (see the top of software/mrhmm_test1.m). A set of manual labels created under the selected random seed is shipped with the distribution (Idx.mat).

When we are manually labeling, we typically exercise more control over the structure of the MR-HMM: we can determine the identity of the signal classes and of the states. Because the data consists of noise, two different pulse types, and a gap between the pulses, there are four states. But the gap and the noise are actually the same signal class, so there are only three signal classes. In this case, when calling software/mrhmm_initialize.m, we need to use the additional input variables class_feat_map and state_to_subclass:

   subclassnames={'noise'  'Pulse1'  'Pulse2'};
   class_feat_map={};           % allow all subclasses to access all features
   state_to_subclass=[1 2 1 3];
Note that we have re-specified subclassnames to include only the unique subclass names; above, when calling the labeling tool, we used 'noise' twice for convenience. We can also exert control over the state transitions:
    A_mask=[1 1 1 1;
            0 1 1 1;
            0 0 1 1;
            1 0 0 1];
    Pi_mask=[1 0 0 0];
    beta_end= [ 1 0 0 0];
This ensures that the model starts in the noise state, transitions through the states in the desired order, and ends in noise.
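As a quick sanity check, the transitions permitted by A_mask can be listed directly (a minimal, toolkit-independent sketch; the printout is purely illustrative):

```matlab
A_mask=[1 1 1 1;
        0 1 1 1;
        0 0 1 1;
        1 0 0 1];
for i=1:4
    % each row lists the states reachable from state i in one step
    fprintf('state %d -> states %s\n', i, mat2str(find(A_mask(i,:))));
end
```

Notice that states 2 and 3 cannot transition back to state 1, so the only way back to noise is through state 4: each pulse pair must complete before the model returns to the noise state required by beta_end.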

We next get a set of MR-HMM parameters:

 % get an initial set of parameters
    [hparm,Icls]=mrhmm_initialize(Z,K,Ns,shfts,func_str,inv_str,Idx,ntot,pdf_type, ...
Recall from Section 14.4.2 that the Icls output contains a re-formatted version of the labeling.

Use the following to train with forced alignment:

    hparm.kbonus = 0;
    hparm.mdl_segmax = 1;
    use_viterbi = 0; 
    [hparm,lptot]=mrhmm_iterate(hparm,Z, J,Icls,ntot,num_iter,use_viterbi,iplot,X);

If iplot is set to 1 during training, a visualization of the training process is displayed. As the training runs, you should see a display similar to Figure 14.6.

Figure 14.6: MR-HMM in operation
Let's go over each sub-graph. The top panel shows the spectrogram. The second panel shows the forcing values that are applied to the partial PDF values; these are in the same format as the proxy partitions, as seen in Figure 14.3, center. Whenever a partition (a horizontal band in the center graph of Figure 14.3) is indicated by the labeling, the corresponding horizontal row in the second panel of Figure 14.6 is illuminated. Note that the example of Figure 14.6 has 27 partitions, whereas Figure 14.3 has just 9, so there is no direct correspondence between the figures. The next panel shows the state probabilities. You should end up with a very crisp indication of state, especially because the states are being forced by the labeling. The lower left panel of the figure shows the partition distributions as a function of state. You can see that the Pulse1 (state 2) and Pulse2 (state 4) states each prefer a particular segment size, which is understandable since they have a relatively stable pulse length.

It is also expected that 'noise' (state 1) and 'gap' (state 3), which are both tied to the noise signal class, have different partition distributions. But what might seem odd is that state 1, which is generally on for a long time, has a relatively broad distribution, with a relatively high probability of short segments. It is often the case that more segments fit the data better. This is the old problem of over-fitting, whereby as more parameters are added to a model, the fit becomes better even when using more parameters is not prudent. Poor parameter balancing contributes to the problem, a situation that exists in our example. Note that the specification of Ps above is such that the AR model order is not proportional to the segment size. Therefore, if many smaller segments are used, there is a much larger aggregate number of parameters. This will favor having many smaller segments. There are two ways to counteract this effect.
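To see why a fixed AR order favors many small segments, consider this illustrative arithmetic (the order p=8 and record length N=512 are hypothetical values chosen for the sketch, not taken from the example):

```matlab
% illustrative only: with a fixed AR order per segment, the aggregate
% parameter count grows linearly with the number of segments
p = 8;      % hypothetical AR order, the same for every segment size
N = 512;    % hypothetical total record length
for nseg = [1 4 16 64]
    fprintf('%2d segments of length %3d -> %3d AR parameters in total\n', ...
            nseg, N/nseg, nseg*p);
end
```

Since each added segment contributes p more parameters regardless of its length, splitting into smaller segments can keep improving the raw fit unless a penalty such as kbonus or the MDL segment term pushes back.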

You can re-run the example with hparm.kbonus = -5 or with hparm.mdl_segmax = 500 (see Section 14.4.3), and you will see a reduction in the probability of smaller segments.
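For example, either setting can be applied before repeating the forced-alignment training call shown above (a sketch re-using the same variables; try one option at a time):

```matlab
hparm.kbonus = -5;         % option 1: penalize each additional segment
% hparm.mdl_segmax = 500;  % option 2: change the MDL segment setting (Section 14.4.3)
[hparm,lptot]=mrhmm_iterate(hparm,Z,J,Icls,ntot,num_iter,use_viterbi,iplot,X);
```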

You can also try MFCC features (use_ar=0). For both AR and MFCC features, we recommend

    hparm.kbonus = -2.5;
    hparm.mdl_segmax = 1;

After the forced-alignment training completes, training is continued without alignment by setting the input argument Icls to the empty cell array:

    % train without forced alignment
    [hparm,lptot]=mrhmm_iterate(hparm,Z, J,{},ntot,num_iter,use_viterbi,iplot,X);
If everything works well, the 'crisp' state assignment probabilities will persist. This is an indication of good training.

Baggenstoss 2017-05-19