Data Normalization

One application of the floating reference hypothesis is normalizing the data before evaluating the SPA for a fixed reference hypothesis. Suppose we would like to evaluate

$\displaystyle J({\bf x};H_0,T)= {p_x({\bf x}\vert H_0) \over p_z({\bf z}\vert H_0)},$ (2.16)

for arbitrary vectors ${\bf x}$. Let $v$ be an estimate of the scale of ${\bf x}$, such as a sample variance, sample mean, or sample standard deviation. The important point is that, as ${\bf x}$ is scaled, all the elements of ${\bf z}$ vary in proportion to $v$:

$\displaystyle {\bf z}\sim v \frac{{\bf z}}{\Vert{\bf z}\Vert}.$

Let $H_v$ be a reference hypothesis that depends on $v$. As long as the feature ${\bf z}$ contains $v$, or $v$ can be computed from ${\bf z}$, $H_v$ remains in the ROS of ${\bf z}$. Thus, equation (2.13) is theoretically independent of $H_v$,

$\displaystyle {p_x({\bf x}\vert H_0) \over p_z({\bf z}\vert H_0)} = {p_x({\bf x}\vert H_v) \over p_z({\bf z}\vert H_v)}.$

If the elements of ${\bf z}$ are linearly related to $v$, we may write

$\displaystyle p_z({\bf z}\vert H_v) = v^{-D} \; p_z(v^{-1} {\bf z}\vert H_0),$

where $D$ is the dimension of ${\bf z}$. Therefore,

$\displaystyle J({\bf x};H_0,T)= v^D \; {p_x({\bf x}\vert H_v) \over p_z({\bf z}/v\vert H_0)},$ (2.17)

which provides a convenient way to normalize ${\bf z}$ prior to calculating the SPA.
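
As a numerical check of (2.17), the following minimal sketch compares the direct form (2.16) with the normalized form on a toy problem. Everything specific here is an assumption made for illustration and is not taken from the text: $H_0$ is taken to be i.i.d. zero-mean, unit-variance Gaussian samples, $H_v$ the same with standard deviation $v$, the feature is the scalar ${\bf z} = \Vert{\bf x}\Vert$ (so $D=1$ and ${\bf z}$ follows a chi distribution under $H_0$), and $v = {\bf z}/\sqrt{N}$. The function names are hypothetical.

```python
import math
import random

def gauss_logpdf(x, sigma):
    """Log PDF of i.i.d. zero-mean Gaussian samples with standard deviation sigma."""
    n = len(x)
    energy = sum(xi * xi for xi in x)
    return (-0.5 * n * math.log(2.0 * math.pi)
            - n * math.log(sigma)
            - energy / (2.0 * sigma ** 2))

def chi_logpdf(z, k):
    """Log PDF of the chi distribution with k degrees of freedom (z > 0)."""
    return ((k - 1) * math.log(z) - 0.5 * z * z
            - (0.5 * k - 1.0) * math.log(2.0) - math.lgamma(0.5 * k))

def log_j_direct(x):
    """log J from (2.16): both PDFs evaluated under the fixed hypothesis H_0."""
    z = math.sqrt(sum(xi * xi for xi in x))  # assumed feature: z = ||x||
    return gauss_logpdf(x, 1.0) - chi_logpdf(z, len(x))

def log_j_normalized(x):
    """log J from (2.17): p_z is evaluated only at the normalized feature z / v."""
    n = len(x)
    z = math.sqrt(sum(xi * xi for xi in x))
    v = z / math.sqrt(n)  # assumed scale estimate, computable from z
    d = 1                 # dimension D of the feature z
    return d * math.log(v) + gauss_logpdf(x, v) - chi_logpdf(z / v, n)

# Data whose scale is far from the unit scale assumed by H_0.
rng = random.Random(0)
x = [50.0 * rng.gauss(0.0, 1.0) for _ in range(32)]

print(log_j_direct(x), log_j_normalized(x))
```

In this toy case the chi PDF is available in closed form, so both routes can be computed and should agree to within rounding error. In the normalized route, $p_z$ is evaluated at ${\bf z}/v = \sqrt{N}$ regardless of the scale of ${\bf x}$, so the evaluation point does not drift into the far tail of the $H_0$ distribution as the data scale changes.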