## Hanning-3 Segmentation

Block segmentation was necessary for strict application of PDF projection to segmented data. In 2012, a means was discovered to circumvent this restriction and still obtain "exact" projected likelihood comparison between differently-segmented branches, but it is only possible for 2/3 overlap, 3/4 overlap, etc. It is not possible for the commonly-used 50% overlap. Since 3/4 overlap and more is rarely used, we use 2/3 overlap and call the method Hanning-3. Consider overlapped segments with segment size $K$ and window time shift $R=K/3$, overlapping by $2K/3$ samples. If we circularly index the length-$N$ data $\mathbf{x}$ such that $x_{n+N}=x_n$, we will obtain exactly $P=3N/K$ segments. Let $\mathbf{y}_m = [w_0\, x_{mR},\; w_1\, x_{mR+1},\; \ldots,\; w_{K-1}\, x_{mR+K-1}]'$ be the $m$-th segment, where $w_k$ are the Hanning weights. The Hanning weights must be periodic with a period of $K$, not $K+1$ as is typically used to avoid any zero weights.

$$ w_k = \sqrt{8/9}\cdot\frac{1}{2}\left(1-\cos\frac{2\pi k}{K}\right) \qquad (12.3)$$

where $0 \le k \le K-1$. The special scaling factor $\sqrt{8/9}$ is non-standard and results in the desired property that follows. Let the complete Hanning-3 segmentation be denoted by $\mathbf{z} = [\mathbf{y}_0',\, \mathbf{y}_1',\, \ldots,\, \mathbf{y}_{P-1}']'$. Note that each $\mathbf{y}_m$ has a dimension of $K$, so $\mathbf{z}$ has a total dimension of $PK = 3N$, three times as large as the dimension of $\mathbf{x}$.
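The effect of the special scaling can be checked numerically. The following sketch (a Python translation for illustration; the book's own software is MATLAB, and the function name here is not the book's) verifies that with shift $R=K/3$, the squared weights covering each sample sum to exactly one, so the segmentation preserves the norm of the data:

```python
import numpy as np

def hanning3_weights(K):
    # Periodic Hanning window of period K, scaled by sqrt(8/9) as in (12.3).
    k = np.arange(K)
    return np.sqrt(8.0 / 9.0) * 0.5 * (1.0 - np.cos(2.0 * np.pi * k / K))

# With shift R = K/3, each data sample falls under exactly three windows,
# at offsets j, j + R, and j + 2R.  The scaling sqrt(8/9) makes the squared
# weights at these three offsets sum to exactly 1.
K = 12
R = K // 3
w = hanning3_weights(K)
for j in range(R):
    print(j, w[j]**2 + w[j + R]**2 + w[j + 2 * R]**2)   # each sum is 1.0
```

Without the scaling, the squared Hanning windows at 2/3 overlap sum to 9/8, which is why the factor $\sqrt{8/9}$ appears.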

To use various Hanning-3 segmentations together in a class-specific classifier, we need to apply the concept of virtual input data. Consider two Hanning-3 segmentations with different segment sizes $K_1$ and $K_2$, denoted by $\mathbf{z}_1$ and $\mathbf{z}_2$. It has been shown [49] that, with weights as defined in (12.3), $\mathbf{z}_1$ and $\mathbf{z}_2$ are related by an orthogonal linear transformation. Specifically, there exists an orthonormal matrix $\mathbf{U}$ such that $\mathbf{z}_2 = \mathbf{U}\mathbf{z}_1$. In Figure 12.2, the output of each segmentation operation is considered as the "virtual input data" of each branch. Each branch has different virtual input data, but they are considered "equivalent". Therefore, the projected likelihood function for $\mathbf{z}_1$ may be compared with the projected likelihood function for $\mathbf{z}_2$.
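This equivalence can be demonstrated numerically. The sketch below (Python, for illustration; the helper that builds the expansion matrix is an assumed construction following the segment definition above) builds two segmentations of the same data and checks that one maps onto the other through the product of the two expansion matrices:

```python
import numpy as np

def hanning3_A(K, N):
    # Expansion matrix A with z = A x for a Hanning-3 segmentation:
    # row m*K + k holds weight w_k in the column of sample (m*R + k) mod N.
    w = np.sqrt(8.0 / 9.0) * 0.5 * (1.0 - np.cos(2.0 * np.pi * np.arange(K) / K))
    R, P = K // 3, 3 * N // K            # shift and number of segments
    A = np.zeros((P * K, N))
    for m in range(P):
        for k in range(K):
            A[m * K + k, (m * R + k) % N] = w[k]   # circular indexing
    return A

N = 24
x = np.random.randn(N)
A1, A2 = hanning3_A(6, N), hanning3_A(12, N)
z1, z2 = A1 @ x, A2 @ x
U = A2 @ A1.T                  # maps segmentation 1 onto segmentation 2
print(np.allclose(U @ z1, z2))          # True
```

Because the columns of each expansion matrix are orthonormal, $\mathbf{A}_{K_2}\mathbf{A}_{K_1}'\,\mathbf{z}_1 = \mathbf{A}_{K_2}\mathbf{A}_{K_1}'\mathbf{A}_{K_1}\mathbf{x} = \mathbf{A}_{K_2}\mathbf{x} = \mathbf{z}_2$.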

Each block "feature calculation" in the figure normally consists of more than one stage, organized as a chain (see Section 2.2.4). The starting reference hypothesis ($H_0$ in equation 2.9) is typically the canonical reference hypothesis, exponential or Gaussian, in which all the elements of the input data are independent. Thus, all the elements of $\mathbf{z}$ are assumed independent under $H_0$. Naturally, this assumption is false. How can it be reasonable to base PDF projection, and the comparison of the various branch projected PDFs, on such an obviously false assumption?

Good question! Let's delve into this question further. Let's start off by saying that the reference hypothesis is there only as a mathematical tool for PDF projection. The PDF projection theorem guarantees that regardless of the reference hypothesis, independence assumption or not, the resulting projected PDF is a valid PDF and is a member of the class of PDFs that generate the corresponding feature PDF. Furthermore, provided there is a corresponding energy statistic, it is the maximum entropy member of the class. The projected PDFs of all the branches are then PDFs on the same data space since, using an orthonormal rotation, all can be converted to PDFs on a common data space.

With that said, what effect does using this "false" independence assumption really have? Let's assume we are striving for sufficiency optimality (Section 2.2.2), in which the projected PDF can be equated to the true PDF. For the sufficiency condition to be met or approximated, the feature must be a sufficient statistic for distinguishing between the true PDF and $H_0$. At the virtual input data, under the "true" PDF of $\mathbf{z}$, which in this case means when real data is used, the individual segments of $\mathbf{z}$ are Hanning-weighted, and adjacent segments are highly correlated since $2K/3$ of the data samples are the same. But, under $H_0$, the samples in a segment of $\mathbf{z}$ are assumed random with identical distributions, and adjacent segments are completely independent. There is generally no hope that the features are able to distinguish these two conditions. The most common first-stage signal processing is the DFT followed by magnitude-squared, then some kind of smoothing, in which virtually all information relating to the Hanning weighting is lost.

The Hanning-3 "expansion", which takes a size-$N$ input vector and increases it to dimension $3N$, is analogous to sampling rate increase through interpolation. Just as in interpolation the original data can be recovered using decimation, $\mathbf{x}$ can be recovered from $\mathbf{z}$ using overlap-add. And, analogous to feature extraction from 3:1 interpolated data, the suitability of the features must be evaluated with respect to the low-pass information, not with respect to the missing high-pass information. Therefore, although using a reference hypothesis that makes the "false" assumption of independent data is very suspicious in the realm of "virtual input data", it can be quite reasonable in the realm of real data.
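The overlap-add recovery can be sketched as follows (Python, for illustration; this is not the book's code). Because each sample's three squared weights sum to one, re-weighting each segment by its window and accumulating the overlapped parts reproduces the original data exactly:

```python
import numpy as np

K, N = 6, 18
w = np.sqrt(8.0 / 9.0) * 0.5 * (1.0 - np.cos(2.0 * np.pi * np.arange(K) / K))
R, P = K // 3, 3 * N // K
x = np.random.randn(N)

# Forward: Hanning-3 segmentation with circular indexing.
segs = [w * x[(m * R + np.arange(K)) % N] for m in range(P)]

# Inverse: weighted overlap-add of the segments.
xr = np.zeros(N)
for m, y in enumerate(segs):
    idx = (m * R + np.arange(K)) % N
    np.add.at(xr, idx, w * y)       # accumulate w_k * y_{m,k} at sample idx
print(np.allclose(xr, x))           # True
```

Each output sample receives $w_k^2 x_n$ from the three windows covering it, and those squared weights sum to one by (12.3).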

The mathematical postulations of Hanning-3 can be tested using the function software/hanning3_wts.m, with syntax [w,W,A]=hanning3_wts(K,N);. The outputs include w, the weight vector of (12.3); W, the matrix of window functions; and A, the linear expansion matrix that creates $\mathbf{z}$ from $\mathbf{x}$. In Figure 12.3, we plotted W as an image for two values of $K$, together with the corresponding numbers of segments. In the figure, you can see the circular indexing.

Let $\mathbf{A}_K$ be the matrix A produced for the given value of $K$. It is easy to verify in either case that

1. the product $\mathbf{A}_K \mathbf{x}$ produces the concatenated segmentation $\mathbf{z}$,
2. $\mathbf{A}_K$ is orthonormal, so that $\mathbf{A}_K' \mathbf{A}_K = \mathbf{I}$,
3. to transform between the segmentations $\mathbf{z}_1$ and $\mathbf{z}_2$, we can use $\mathbf{z}_2 = \mathbf{A}_{K_2} \mathbf{A}_{K_1}' \mathbf{z}_1$,

where $\mathbf{A}_{K_2} \mathbf{A}_{K_1}'$ is an orthonormal transformation since it is the product of two orthonormal transformations.
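These properties can be checked with a rough Python analogue of hanning3_wts.m (an assumed port, not the original; the layout of W as one window per row is a guess):

```python
import numpy as np

def hanning3_wts(K, N):
    # Returns the weight vector w, the matrix of circularly-indexed
    # window functions W, and the expansion matrix A with z = A x.
    k = np.arange(K)
    w = np.sqrt(8.0 / 9.0) * 0.5 * (1.0 - np.cos(2.0 * np.pi * k / K))
    R, P = K // 3, 3 * N // K
    W = np.zeros((P, N))                 # row m: window m laid out over the data
    A = np.zeros((P * K, N))
    for m in range(P):
        idx = (m * R + k) % N            # circular indexing, visible in W
        W[m, idx] = w
        A[np.arange(m * K, m * K + K), idx] = w
    return w, W, A

K1, K2, N = 6, 12, 24
_, _, A1 = hanning3_wts(K1, N)
_, _, A2 = hanning3_wts(K2, N)
x = np.random.randn(N)
print(np.allclose(A1.T @ A1, np.eye(N)))           # property 2: orthonormal
print(np.allclose(A2 @ (A1.T @ (A1 @ x)), A2 @ x)) # property 3: z2 = A_K2 A_K1' z1
```

Property 1 holds by construction: row $mK+k$ of A places weight $w_k$ on sample $(mR+k) \bmod N$, so A x stacks the weighted segments.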

Baggenstoss 2017-05-19