Energy Statistic (ES)

In the second case above, when ${\bf x}$ is not constrained to a compact set and we are not willing to assume a fixed scale parameter, condition (3.4) must be satisfied. To do this, we need to ensure that $T({\bf x})$ contains an energy statistic [3]. An energy statistic is a statistic, usually scalar, that carries information about the norm (size) of ${\bf x}$. It can be explicitly included as a component of ${\bf z}$, such as a sample variance, but does not need to be known explicitly. What matters is that if ${\bf z}=T({\bf x})$ contains an energy statistic, then (3.4) is satisfied for some norm. A norm $\Vert{\bf x}\Vert$ must satisfy scalability ($\Vert a{\bf x}\Vert=\vert a\vert\,\Vert{\bf x}\Vert$), the triangle inequality ($\Vert{\bf x}+{\bf y}\Vert\leq \Vert{\bf x}\Vert+\Vert{\bf y}\Vert$), and positive definiteness ($\Vert{\bf x}\Vert=0$ only if ${\bf x}={\bf 0}$). Examples of norms are the weighted $p$-norms based on generalized sample moments,

$\displaystyle \Vert{\bf x}\Vert= \left(\sum_{i=1}^N w_i \vert x_i\vert^p\right)^{1/p},$

where $p\ge 1$ and $w_i>0$. The corresponding ES is, for example

$\displaystyle t({\bf x}) = \sum_{i=1}^N w_i \vert x_i\vert^p.$ (3.6)
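
As an informal illustration, the following minimal Python/NumPy sketch (the function names `energy_statistic` and `weighted_p_norm` are ours, not part of any library) evaluates the ES (3.6) and the induced norm, and numerically checks the scalability and triangle-inequality properties listed above.

```python
import numpy as np

def energy_statistic(x, w, p):
    """Weighted moment ES of (3.6): t(x) = sum_i w_i |x_i|^p."""
    return np.sum(w * np.abs(x) ** p)

def weighted_p_norm(x, w, p):
    """Induced norm ||x|| = t(x)^(1/p); a valid norm when p >= 1 and all w_i > 0."""
    return energy_statistic(x, w, p) ** (1.0 / p)

rng = np.random.default_rng(0)
N, p = 8, 3.0
w = rng.uniform(0.5, 2.0, size=N)       # positive weights
x, y = rng.normal(size=N), rng.normal(size=N)
a = -2.5

# Scalability: ||a x|| = |a| ||x||
print(np.isclose(weighted_p_norm(a * x, w, p), abs(a) * weighted_p_norm(x, w, p)))
# Triangle inequality: ||x + y|| <= ||x|| + ||y||
print(weighted_p_norm(x + y, w, p)
      <= weighted_p_norm(x, w, p) + weighted_p_norm(y, w, p))
```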

By ensuring that ${\bf z}$ contains an energy statistic, we guarantee that for any fixed, finite-valued feature value ${\bf z}^*$, the norm $\Vert{\bf x}\Vert$ is fixed, and therefore the manifold ${\cal M}({\bf z}^*)$ in (2.4) is compact. This is necessary to ensure the MaxEnt property [3]. Also necessary for the MaxEnt property is satisfying (3.5) for some function $h(\cdot)$, which means that $p({\bf x}\vert H_0)$ depends on ${\bf x}$ only through $T({\bf x})$. Given that $f({\bf z})$ exists, it is easy to find a reference hypothesis $H_0$ that meets (3.5). One example can be written

$\displaystyle p({\bf x}\vert H_0) = \frac{1}{C} e^{-[f(T({\bf x}))]^p},$ (3.7)

for $p\ge 1$, where $C$ is the appropriate normalizing constant. This form satisfies (3.5) with $h({\bf z})=\frac{1}{C}\,e^{-[f({\bf z})]^p}$.
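
To make the factorization concrete, one can pair (3.7) with the weighted moment statistic (3.6) by taking ${\bf z}=t({\bf x})$ itself as the feature and $f({\bf z})={\bf z}^{1/p}$, so that $[f(T({\bf x}))]^p=t({\bf x})$. The short Python/NumPy sketch below (our own hypothetical names) verifies numerically that the resulting log density, up to the constant $-\log C$, assigns equal values to two distinct vectors with the same ES, as (3.5) requires.

```python
import numpy as np

# Hypothetical pairing of (3.6) and (3.7): take z = t(x) as the feature and
# f(z) = z^(1/p), so that [f(T(x))]^p = t(x).
def t(x, w, p):
    """Energy statistic (3.6)."""
    return np.sum(w * np.abs(x) ** p)

def log_ref_unnorm(x, w, p):
    """Unnormalized log of (3.7): -[f(T(x))]^p = -t(x); the constant -log C is omitted."""
    return -t(x, w, p)

w, p = np.array([1.0, 2.0]), 2.0
x1 = np.array([np.sqrt(2.0), 0.0])      # t(x1) = 2
x2 = np.array([0.0, 1.0])               # t(x2) = 2: a different x with the same ES
print(log_ref_unnorm(x1, w, p), log_ref_unnorm(x2, w, p))   # equal, as (3.5) requires
```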

Example 6   Let ${\cal P}^N$ be the positive orthant of ${\cal R}^N$, such that $x_i>0 \; \forall i$. The statistic

$\displaystyle t_1({\bf x})=\sum_{i=1}^N x_i$ (3.8)

leads to the 1-norm on ${\cal P}^N$ and can be paired with the exponential reference hypothesis

$\displaystyle p({\bf x}\vert H_0)=\prod_{i=1}^N e^{-x_i},$ (3.9)

where $f({\bf z})=t_1({\bf x})$, $p=1$, and $C=1$.
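
As a quick numerical confirmation of Example 6, the sketch below (Python/NumPy; the function name is ours) checks that the log of the exponential reference (3.9) coincides with the log of form (3.7) evaluated with $f({\bf z})=t_1({\bf x})$, $p=1$, and $C=1$.

```python
import numpy as np

def t1(x):
    """Energy statistic (3.8): t_1(x) = sum_i x_i."""
    return np.sum(x)

rng = np.random.default_rng(1)
x = rng.exponential(size=5)             # a point in the positive orthant P^N

log_p_product = np.sum(-x)              # log of prod_i exp(-x_i), Eq. (3.9)
log_p_form37 = -t1(x) ** 1              # log of (3.7) with f(z) = t_1(x), p = 1, C = 1
print(np.isclose(log_p_product, log_p_form37))
```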

Example 7   The statistic

$\displaystyle t_2({\bf x}) = \sum_{i=1}^N x_i^2$ (3.10)

leads to the 2-norm on ${\cal R}^N$ and can be paired with the Gaussian

$\displaystyle p({\bf x}\vert H_0)=\prod_{i=1}^N \frac{1}{\sqrt{2\pi}} e^{-x_i^2/2},$ (3.11)

where $f({\bf z})=\sqrt{t_2({\bf x})/2}$, $p=2$, and $C=(2\pi)^{N/2}$.
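
Similarly for Example 7, the following sketch (Python/NumPy; hypothetical names) checks that the log of the Gaussian reference (3.11) agrees with the log of form (3.7) evaluated with $f({\bf z})=\sqrt{t_2({\bf x})/2}$, $p=2$, and $C=(2\pi)^{N/2}$.

```python
import numpy as np

def t2(x):
    """Energy statistic (3.10): t_2(x) = sum_i x_i^2."""
    return np.sum(x ** 2)

rng = np.random.default_rng(2)
N = 5
x = rng.normal(size=N)

log_p_gauss = np.sum(-0.5 * x ** 2 - 0.5 * np.log(2.0 * np.pi))   # log of (3.11)
# log of (3.7) with f(z) = sqrt(t_2(x)/2), p = 2, C = (2*pi)^(N/2)
log_p_form37 = -np.sqrt(t2(x) / 2.0) ** 2 - (N / 2.0) * np.log(2.0 * np.pi)
print(np.isclose(log_p_gauss, log_p_form37))
```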

Further examples of energy statistics and associated canonical reference hypotheses are provided in Table 3.1.

Interestingly, no matter which reference hypothesis meets (3.5), the resulting projected PDF is the same. That is, if $p({\bf x}\vert H_0)=h(T({\bf x}))$ and $p({\bf x}\vert H^\prime_0)=h^\prime(T({\bf x}))$, then

$\displaystyle G({\bf x}; H_0,T,g)= G({\bf x}; H^\prime_0,T,g).$

We will explain this in Section 3.2.3.
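
This invariance can also be seen numerically for the feature $t_1({\bf x})$ of Example 6. The sketch below (Python/SciPy; the helper name is ours) assumes the projected PDF takes the ratio form $G({\bf x};H_0,T,g)=\frac{p({\bf x}\vert H_0)}{p({\bf z}\vert H_0)}\,g({\bf z})$ with ${\bf z}=T({\bf x})$; this is stated here as an assumption, with the general argument deferred to Section 3.2.3. Under that assumption it suffices to check that the correction term $\log p({\bf x}\vert H_0)-\log p({\bf z}\vert H_0)$ is identical for two reference hypotheses that both depend on ${\bf x}$ only through $t_1({\bf x})$; here we compare i.i.d. exponential references with rates 1 and 2.

```python
import numpy as np
from scipy import stats

def log_correction(x, lam):
    """log p(x|H0) - log p(z|H0) for H0: x_i i.i.d. exponential with rate lam,
    where z = t_1(x) = sum_i x_i is Gamma(N, scale=1/lam) distributed under H0."""
    z = np.sum(x)
    log_px = np.sum(stats.expon.logpdf(x, scale=1.0 / lam))
    log_pz = stats.gamma.logpdf(z, a=len(x), scale=1.0 / lam)
    return log_px - log_pz

x = np.array([0.3, 1.7, 0.9, 2.4])
print(log_correction(x, lam=1.0), log_correction(x, lam=2.0))  # agree up to round-off
```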


Table 3.1: Reference PDFs and their energy statistics. The reference PDFs depend on the data only through the indicated energy statistics. Note that ${\cal P}^N$ is the positive orthant of ${\cal R}^N$, where all elements of ${\bf x}$ are positive, and ${\cal U}^N$ is the unit hypercube, where all elements of ${\bf x}$ are in $[0,1]$.
\begin{tabular}{llll}
Name & Data range & Ref. Hyp. $p({\bf x}\vert H_0)$ & Energy Statistic \\
Gaussian & ${\bf x}\in{\cal R}^N$ & $\prod_{i=1}^N \; \frac{e^{-x_{i}^2/2}}{\sqrt{2\pi}}$ & $t_2({\bf x})=\sum_{i=1}^N \; x_i^2$ \\
Laplacian & ${\bf x}\in{\cal R}^N$ & $\prod_{i=1}^N \; \frac{1}{\sqrt{2}} \; e^{-\sqrt{2} \; \vert x_{i}\vert}$ & $t({\bf x})=\sum_{i=1}^N \; \vert x_i\vert$ \\
Exponential & ${\bf x}\in{\cal P}^N$ & $\prod_{i=1}^N \; e^{-x_i}$ & $t_1({\bf x})=\sum_{i=1}^N \; x_i$ \\
$\chi^2(1)$ & ${\bf x}\in{\cal P}^N$ & $\prod_{i=1}^N \; \frac{e^{-x_{i}/2}}{\sqrt{2\pi x_i}}$ & $\left[\begin{array}{l} \sum_{i=1}^N \; \log x_i \\ \sum_{i=1}^N \; x_i \end{array}\right]$ \\
Uniform & ${\bf x}\in {\cal U}^N$ & $1$ & n/a
\end{tabular}
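
The dependence shown in the table can be checked numerically as well. The sketch below (Python/NumPy; the function names are ours) evaluates the Laplacian and $\chi^2(1)$ reference log densities of Table 3.1 directly from their energy statistics, illustrating in particular that the $\chi^2(1)$ row requires a two-component ES.

```python
import numpy as np

def log_laplace_ref(t_abs, N):
    """Laplacian row of Table 3.1: log density as a function of the ES t(x) = sum_i |x_i|."""
    return -0.5 * N * np.log(2.0) - np.sqrt(2.0) * t_abs

def log_chi2_ref(sum_log, sum_x, N):
    """chi^2(1) row of Table 3.1: log density as a function of the two-component ES
    (sum_i log x_i, sum_i x_i)."""
    return -0.5 * sum_x - 0.5 * sum_log - 0.5 * N * np.log(2.0 * np.pi)

x = np.array([0.4, 1.3, 2.2, 0.7])        # a point in P^N
N = len(x)

# Direct evaluation of the product forms in the table ...
direct_lap = np.sum(np.log(np.exp(-np.sqrt(2.0) * np.abs(x)) / np.sqrt(2.0)))
direct_chi2 = np.sum(np.log(np.exp(-x / 2.0) / np.sqrt(2.0 * np.pi * x)))
# ... matches evaluation through the energy statistics alone.
print(np.isclose(direct_lap, log_laplace_ref(np.sum(np.abs(x)), N)))
print(np.isclose(direct_chi2, log_chi2_ref(np.sum(np.log(x)), np.sum(x), N)))
```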