Data Normalization

One application of the floating reference hypothesis is the normalization of data before the SPA is evaluated for a fixed reference hypothesis. Suppose we would like to evaluate

$\displaystyle J({\bf x};H_0,T)= {p_x({\bf x}\vert H_0) \over p_z({\bf z}\vert H_0)},$ (2.16)

for arbitrary vectors $ {\bf x}$. Let $ v$ be an estimate of the scale of $ {\bf x}$. In practice, $ v$ is typically a sample variance, a sample mean, or a standard deviation estimate. What matters is that, as $ {\bf x}$ is scaled, all the elements of $ {\bf z}$ vary in proportion to $ v$. Let $ H_v$ be a reference hypothesis that depends on this scaling. As long as the feature $ {\bf z}$ contains $ v$, or $ v$ can be computed from $ {\bf z}$, $ H_v$ remains in the ROS of $ {\bf z}$. Thus, equation (2.13) is theoretically independent of $ H_v$,

$\displaystyle {p_x({\bf x}\vert H_0) \over p_z({\bf z}\vert H_0)} = {p_x({\bf x}\vert H_v) \over p_z({\bf z}\vert H_v)}.$
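
As an illustration (an assumed example, not taken from the text above), let $ {\bf x}$ contain $ N$ independent zero-mean Gaussian samples with variance $ v$ under $ H_v$, and let the feature be the scalar $ z=\sum_{i=1}^N x_i^2$, so that $ z/v$ is chi-square distributed with $ N$ degrees of freedom. Then

$\displaystyle p_x({\bf x}\vert H_v) = (2\pi v)^{-N/2} \; e^{-z/(2v)}, \qquad p_z(z\vert H_v) = {1 \over v} \; {(z/v)^{N/2-1} \; e^{-z/(2v)} \over 2^{N/2} \; \Gamma(N/2)},$

and the ratio reduces to

$\displaystyle {p_x({\bf x}\vert H_v) \over p_z(z\vert H_v)} = \pi^{-N/2} \; \Gamma(N/2) \; z^{-(N/2-1)},$

in which every power of $ v$ has cancelled. In this example the ratio is therefore the same for $ H_0$ (taken here as $ v=1$) and for any other $ H_v$.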

If the elements of $ {\bf z}$ are linearly related to $ v$, we may write

$\displaystyle p_z({\bf z}\vert H_v) = v^{-D} \; p_z(v^{-1} {\bf z}\vert H_0),$

where $ D$ is the dimension of $ {\bf z}$. Therefore,

$\displaystyle J({\bf x};H_0,T)= v^D \; {p_x({\bf x}\vert H_v) \over p_z({\bf z}/v\vert H_0)},$ (2.17)

which provides a convenient way to normalize $ {\bf z}$ prior to calculating the SPA.
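
The following is a minimal numerical sketch of (2.17), under assumptions that are not part of the text above: $ H_0$ is taken to be i.i.d. zero-mean, unit-variance Gaussian noise, $ H_v$ is the same model with variance $ v$, the feature is the scalar $ z=\sum_i x_i^2$ (so $ D=1$), and $ v=z/N$ is the sample variance. Under these assumptions $ p_z(z\vert H_0)$ is a chi-square density, which is evaluated in closed form here in place of the SPA. The function names, the sample size, and the scale factor are hypothetical choices made for illustration. Everything is computed in the log domain.

```python
import math
import random


def log_px(x, v):
    """Log-density of x under H_v: i.i.d. zero-mean Gaussian, variance v."""
    n = len(x)
    energy = sum(xi * xi for xi in x)
    return -0.5 * n * math.log(2.0 * math.pi * v) - energy / (2.0 * v)


def log_pz_h0(z, n):
    """Log-density of z = sum(x_i^2) under H_0: chi-square with n degrees of freedom."""
    k = 0.5 * n
    return (k - 1.0) * math.log(z) - 0.5 * z - k * math.log(2.0) - math.lgamma(k)


def log_j_direct(x):
    """log J as in (2.16): numerator and denominator both evaluated under H_0."""
    z = sum(xi * xi for xi in x)
    return log_px(x, 1.0) - log_pz_h0(z, len(x))


def log_j_normalized(x):
    """log J as in (2.17) with D = 1 and v = z / N (assumed scale estimate)."""
    n = len(x)
    z = sum(xi * xi for xi in x)
    v = z / n
    # D * log(v) + log p_x(x | H_v) - log p_z(z / v | H_0)
    return math.log(v) + log_px(x, v) - log_pz_h0(z / v, n)


random.seed(0)
# Input whose scale is far from the unit variance assumed by H_0.
x = [30.0 * random.gauss(0.0, 1.0) for _ in range(16)]
print(log_j_direct(x), log_j_normalized(x))
```

The two printed values should agree to within floating-point error. In the normalized form the denominator is evaluated at $ z/v=N$, the mean of the chi-square density, regardless of the scale of the input, whereas the direct form evaluates it far in the tail when the scale of $ {\bf x}$ differs greatly from that assumed by $ H_0$.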

Baggenstoss 2017-05-19