Reference Hypothesis

To obtain the maximum entropy property, the module designer must choose a reference hypothesis and an energy statistic that satisfy the requirements of Section 3.2.2. Since ${\bf x}$ is positive-valued, an energy statistic may be formed from any positively weighted sum of the samples of ${\bf x}$,

$\displaystyle t_w({\bf x})= \sum_{i=1}^N w_i x_i,$ (5.3)

where $w_i>0$. This class of energy statistics has the properties of a norm (see Section 3.2.2). The canonical reference hypothesis corresponding to this energy statistic is then

$\displaystyle p({\bf x}\vert H_0)=\prod_{i=1}^N \; w_i e^{-w_i x_i}.$
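As a minimal sketch of the two formulas above, the following Python fragment evaluates $t_w({\bf x})$ and the logarithm of the reference PDF. Taking the logarithm of the product gives $\log p({\bf x}\vert H_0)=\sum_i \log w_i - t_w({\bf x})$, so the data enter only through the energy statistic. The function names and the sample values of `x` and `w` are hypothetical, chosen only for illustration.

```python
import math

def energy_statistic(x, w):
    """Weighted energy statistic t_w(x) = sum_i w_i * x_i, as in (5.3)."""
    if any(wi <= 0 for wi in w):
        raise ValueError("all weights must be positive")
    return sum(wi * xi for wi, xi in zip(w, x))

def log_reference_pdf(x, w):
    """log p(x|H0) for the product of exponential densities w_i * exp(-w_i * x_i)."""
    if any(xi < 0 for xi in x):
        return -math.inf  # the density is zero outside the positive orthant
    # The data appear only through t_w(x); the first term depends on w alone.
    return sum(math.log(wi) for wi in w) - energy_statistic(x, w)

# Illustrative (made-up) data and weights.
x = [0.5, 2.0, 1.5]
w = [1.0, 0.5, 2.0]

t = energy_statistic(x, w)        # 0.5 + 1.0 + 3.0 = 4.5
log_p = log_reference_pdf(x, w)   # log(1.0 * 0.5 * 2.0) - 4.5, approximately -4.5
```

Working in the log domain avoids underflow of the product when $N$ is large.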

Note that there is no reason to use another reference hypothesis, since all reference hypotheses that depend on the data only through the feature ${\bf z}$ produce the same projected PDF [3]. The only difference lies in the tractability of $p({\bf z}\vert H_0)$, and for this class we provide a solution.

If the matrix ${\bf A}$ is predefined, one must determine an energy statistic that is contained in ${\bf z}$ (i.e., determine the weights in (5.3)). For example, if ${\bf A}$ implements the DCT, then its first column is constant. This suggests the energy statistic

$\displaystyle t_1({\bf x})=\sum_{i=1}^N x_i.$
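The DCT case can be checked numerically. The sketch below assumes that the feature is computed as ${\bf z}={\bf A}^T{\bf x}$ and that the columns of ${\bf A}$ are unnormalized DCT-II basis vectors; both are assumptions made for illustration, and the helper names are hypothetical. Under these assumptions the first column is all ones, so the first element of ${\bf z}$ equals $t_1({\bf x})$ and the energy statistic is contained in ${\bf z}$.

```python
import math

def dct2_matrix(n):
    """N-by-N matrix whose column k is the unnormalized DCT-II basis vector
    cos(pi * (i + 1/2) * k / N), i = 0..N-1 (assumed convention)."""
    return [[math.cos(math.pi * (i + 0.5) * k / n) for k in range(n)]
            for i in range(n)]

def features(a, x):
    """Linear feature z = A^T x (assumed form of the feature transformation)."""
    n_rows, n_cols = len(a), len(a[0])
    return [sum(a[i][k] * x[i] for i in range(n_rows)) for k in range(n_cols)]

# Illustrative (made-up) positive-valued input, e.g. spectral bin powers.
x = [0.5, 2.0, 1.5, 3.0]
A = dct2_matrix(len(x))
z = features(A, x)

t1 = sum(x)  # energy statistic with all weights equal to one
# z[0] and t1 agree because column 0 of A is constant (all ones).
```

With a normalized DCT the first column is a different constant, so the first feature equals $t_1({\bf x})$ up to a known scale factor.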

When in doubt, and if matrix ${\bf A}$ can be modified, it may be reasonable to seek the energy statistic that computes the total energy of the source. Since ${\bf x}$ generally comes from the magnitude-squared output bins of an orthogonal transform, such as the FFT, it may be useful to use the energy statistic

$\displaystyle t({\bf x})=\sum_{i=1}^N x_i/\rho^2_i,$

which computes the total energy of the source (the input of the orthogonal transform). As a special case, if matrix ${\bf A}$ implements the auto-correlation function (ACF), then the statistic

$\displaystyle t({\bf x})=x_1 + 2 \sum_{i=2}^{N-1} x_i + x_N$

is suggested (see Section 5.2.2).
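One way to see why the end bins receive weight 1 and the interior bins weight 2 is Parseval's relation. The sketch below assumes (this is not stated in the text above) that ${\bf x}$ holds the magnitude-squared values of the $N=M/2+1$ non-redundant bins of an unnormalized DFT of a real input of even length $M$. Under that assumption the statistic equals $M$ times the time-domain energy of the input. The function names and the test signal are hypothetical.

```python
import cmath

def half_spectrum_power(y):
    """Magnitude-squared DFT bins 0..M/2 of a real sequence y of even length M
    (unnormalized DFT, assumed convention)."""
    m = len(y)
    if m % 2 != 0:
        raise ValueError("this sketch assumes an even input length")
    n = m // 2 + 1
    return [abs(sum(y[t] * cmath.exp(-2j * cmath.pi * k * t / m)
                    for t in range(m))) ** 2
            for k in range(n)]

def acf_energy_statistic(x):
    """t(x) = x_1 + 2 * sum_{i=2}^{N-1} x_i + x_N (1-based indices as in the text)."""
    return x[0] + 2.0 * sum(x[1:-1]) + x[-1]

# Illustrative (made-up) real input of even length M = 6, giving N = 4 bins.
y = [1.0, -2.0, 0.5, 3.0, -1.0, 0.25]
x = half_spectrum_power(y)

t = acf_energy_statistic(x)
time_energy = sum(v * v for v in y)
# Parseval: t equals len(y) * time_energy up to rounding error.
```

The interior bins are doubled because each one stands in for itself and its conjugate-symmetric partner, while the zero-frequency and Nyquist bins occur only once.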