Under normal situations, it is highly unlikely that the
data $\mathbf{x}$ lies near the central peak of $p(\mathbf{x}|H_0)$.
It is far likelier that it is in the far tails.
This will cause the numerator and denominator
PDFs in (2.3)
to approach zero. The PDF values themselves may fall well below
the smallest number the machine can represent. But as long as the logs of
$p(\mathbf{x}|H_0)$ and $p(\mathbf{z}|H_0)$ can be represented accurately, this is not an issue, since all calculations
of this sort can (and should) be done in the log domain.
As the reader will discover through
experience, as $\mathbf{x}$ varies over a wide range of values,
there may be large variations in the individual terms, but by and large
$J(\mathbf{x})$ remains within a fairly reasonable range of values.
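The following minimal Python sketch illustrates why the log domain matters. It is not the J-function of (2.3) itself: it assumes two zero-mean Gaussian PDFs with slightly different variances as a stand-in for a numerator and denominator, and the helper name `log_gauss_pdf` is hypothetical. In the linear domain both values underflow to zero, so their ratio is undefined, while the log-ratio is an ordinary number.

```python
import math

def log_gauss_pdf(x, var=1.0):
    """Log of the zero-mean Gaussian PDF with variance `var`, evaluated at x."""
    return -0.5 * math.log(2.0 * math.pi * var) - 0.5 * x * x / var

x = 50.0  # a point far out in the tail (illustrative value)

# Work in the log domain first.
log_num = log_gauss_pdf(x, var=1.0)
log_den = log_gauss_pdf(x, var=1.1)

# Linear domain: both PDF values underflow to exactly 0.0 in double precision,
# so the ratio num / den cannot be formed.
num = math.exp(log_num)
den = math.exp(log_den)
print("linear-domain values:", num, den)

# Log domain: the ratio becomes a difference and stays well within range.
log_ratio = log_num - log_den
print("log-ratio:", log_ratio)
```

The individual log terms are large in magnitude, but their difference is moderate, which is the behaviour described above.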
The Fixed Reference Hypothesis and the PDF tail problem
We now consider the feature set pair consisting of
the sample mean and variance,
$$z_1 = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad z_2 = \frac{1}{N-1}\sum_{i=1}^{N} (x_i - z_1)^2.$$
In the above example, an analytic expression is available
for $p(\mathbf{z}|H_0)$. This is not the case in general.
The types of transformations and reference hypotheses
for which analytic expressions are available are limited.
We will see in the following sections how these problems can be alleviated.
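Let $H_0$ be the hypothesis that $\mathbf{x}$ is a set of $N$
independent identically distributed
Gaussian samples with mean 0 and variance 1.
It is well known that under $H_0$, the sample mean $z_1$ and the sample variance $z_2$
are statistically independent, so they can be treated
separately. Furthermore, under $H_0$,
$z_1$ is Gaussian with mean 0 and variance $1/N$, thus
$$\log p(z_1|H_0) = \frac{1}{2}\log\frac{N}{2\pi} - \frac{N}{2} z_1^2.$$
With $z_2$ normalized by $N-1$, $(N-1) z_2$
is a chi-square RV with $N-1$ degrees of freedom
derived from a zero-mean Normal distribution with variance 1
(See Section 16.1.2), thus
$$\log p(z_2|H_0) = \log(N-1) - \frac{N-1}{2}\log 2 - \log\Gamma\left(\frac{N-1}{2}\right) + \frac{N-3}{2}\log\big((N-1) z_2\big) - \frac{(N-1) z_2}{2}.$$
Finally, for the J-function we have
$$\log J(\mathbf{x}) = \log p(\mathbf{x}|H_0) - \log p(z_1|H_0) - \log p(z_2|H_0),$$
where
$$\log p(\mathbf{x}|H_0) = -\frac{N}{2}\log(2\pi) - \frac{1}{2}\sum_{i=1}^{N} x_i^2.$$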
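The log J-function for this feature pair can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a reference implementation: the function name `log_j_mean_var` is hypothetical, the sample variance is assumed to use the $N-1$ normalization, and the reference hypothesis is the zero-mean, unit-variance Gaussian described above. Every term is computed in the log domain.

```python
import math

def log_j_mean_var(x):
    """log J(x) = log p(x|H0) - log p(z1|H0) - log p(z2|H0) for z = (mean, variance)."""
    n = len(x)
    z1 = sum(x) / n                                  # sample mean
    z2 = sum((v - z1) ** 2 for v in x) / (n - 1)     # sample variance (assumed N-1 normalization)

    # Raw data under H0: n independent N(0, 1) samples.
    log_px = -0.5 * n * math.log(2.0 * math.pi) - 0.5 * sum(v * v for v in x)

    # Sample mean under H0: Gaussian with mean 0 and variance 1/n.
    log_pz1 = 0.5 * math.log(n / (2.0 * math.pi)) - 0.5 * n * z1 * z1

    # (n-1)*z2 under H0: chi-square with k = n-1 degrees of freedom.
    # The log(k) term is the Jacobian of the scaling from (n-1)*z2 back to z2.
    k = n - 1
    y = k * z2
    log_pz2 = (math.log(k) - 0.5 * k * math.log(2.0) - math.lgamma(0.5 * k)
               + (0.5 * k - 1.0) * math.log(y) - 0.5 * y)

    return log_px - log_pz1 - log_pz2

# Deterministic, made-up data far from the H0 peak (illustrative only).
x = [3.0 + 7.0 * math.sin(i) for i in range(50)]
print(log_j_mean_var(x))
```

Because the sum of squares equals $(N-1) z_2 + N z_1^2$, the quadratic terms of the three log-PDFs cancel exactly in the difference. What remains depends on the data only through $\log\big((N-1) z_2\big)$, which is one way to see why the J-function stays in a moderate range while the individual terms vary widely.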
We simulated this example using . Data was created with mean
uniformly distributed between -5 and 5, and variance uniformly
distributed from 0 to 100. In ten random trials,
the linear-domain PDF $p(\mathbf{x}|H_0)$ numerically underflowed (evaluated to zero)
nine of the ten times.
When the log-PDFs were evaluated instead, the log-PDF
values spanned a wide range, from -7000 to -629.
The log-J function ranged from -448 to -310.
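An experiment of this kind can be sketched as follows. The sample size `N = 100`, the fixed seed, and the helper name `log_p_x_h0` are assumptions made only for illustration and are not taken from the text, so the numbers this sketch produces should not be expected to match the figures quoted above.

```python
import math
import random

N = 100        # ASSUMPTION: hypothetical sample size, not taken from the text
TRIALS = 10
rng = random.Random(0)  # fixed seed, only so the sketch is repeatable

def log_p_x_h0(x):
    """Log-PDF of x under H0: independent Gaussian samples, mean 0, variance 1."""
    return -0.5 * len(x) * math.log(2.0 * math.pi) - 0.5 * sum(v * v for v in x)

log_pdfs = []
for _ in range(TRIALS):
    mean = rng.uniform(-5.0, 5.0)     # mean uniform on [-5, 5]
    var = rng.uniform(0.0, 100.0)     # variance uniform on [0, 100]
    x = [rng.gauss(mean, math.sqrt(var)) for _ in range(N)]
    log_pdfs.append(log_p_x_h0(x))

# Exponentiating returns exactly 0.0 once the log-PDF is too negative for a double.
underflows = sum(1 for lp in log_pdfs if math.exp(lp) == 0.0)
print(f"linear-domain PDF underflowed in {underflows} of {TRIALS} trials")
print(f"log-PDF range: {min(log_pdfs):.0f} to {max(log_pdfs):.0f}")
```

The log-PDF values stay finite in every trial, whereas the linear-domain value is lost whenever the log-PDF drops below roughly -745, the limit of double precision.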