Reference Hypothesis

To proceed with PDF projection, we need a reference hypothesis $H_0$, which, as we have explained, provides a way of specifying the distribution of the unobservable dimensions of ${\bf x}$. The reference hypothesis can be any statistical hypothesis under which the distributions of both ${\bf x}$ and ${\bf z}$ are known; we denote them by the pair $p({\bf x}\vert H_0)$, $p({\bf z}\vert H_0)$. Usually, $p({\bf x}\vert H_0)$ is selected based on some general knowledge about the input data, such as its value range (e.g. is the data positive, is it constrained to the interval [0, 1], or is it unconstrained but with a known variance?). Under these general constraints, we select the distribution with the highest entropy (see Chapter 3). This has a double benefit: selecting maximum entropy reduces bias in the selection of $p({\bf x}\vert H_0)$, and it often results in a canonical distribution for which $p({\bf z}\vert H_0)$ can be derived.
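
As an illustration, the three value-range cases mentioned above lead to the following standard maximum-entropy densities. These are textbook results offered as a sketch, not choices made in this section. The independence of the $N$ samples $x_1, \ldots, x_N$, the zero mean in the first case, and the known mean $\mu$ in the second case are added assumptions (some moment constraint is needed there for a maximum-entropy density on the positive axis to exist).

```latex
% Unconstrained, zero mean, known variance \sigma^2: Gaussian
p({\bf x}\vert H_0) = (2\pi\sigma^2)^{-N/2}
    \exp\left( -\frac{1}{2\sigma^2} \sum_{i=1}^{N} x_i^2 \right)

% Positive data, known mean \mu: exponential
p({\bf x}\vert H_0) = \prod_{i=1}^{N} \frac{1}{\mu} e^{-x_i/\mu},
    \qquad x_i > 0

% Data constrained to [0, 1]: uniform
p({\bf x}\vert H_0) = 1, \qquad 0 \le x_i \le 1
```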

Note that $H_0$ is a purely mathematical concept; it does not need to represent any type of "noise-only" condition or realistic data.

Other means of selecting $p({\bf x}\vert H_0)$ can be used. In Section 2.2.2, we will discuss how $H_0$ and $T({\bf x})$ can be selected jointly to approximate the condition of sufficiency, and thereby approximate as closely as possible the distribution of ${\bf x}$ under some arbitrary hypothesis.

For now, let's just assume that $H_0$ is given. Normally, $p({\bf z}\vert H_0)$ must be derived analytically. Only in rare situations can $p({\bf z}\vert H_0)$ be estimated. We discuss this type of problem in Section 2.2.3.
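
To make "derived analytically" concrete, here is a minimal sketch of one case where the derivation is well known. The choices are illustrative assumptions, not taken from the text: under $H_0$ the $N$ samples of ${\bf x}$ are independent standard normal, and the feature is the total energy $z = T({\bf x}) = \sum_i x_i^2$. Under those assumptions $p({\bf z}\vert H_0)$ is the chi-squared density with $N$ degrees of freedom. The function names are hypothetical.

```python
import math


def log_pdf_x_h0(x):
    """log p(x|H0), assuming the samples are iid N(0, 1) under H0."""
    n = len(x)
    return -0.5 * n * math.log(2.0 * math.pi) - 0.5 * sum(v * v for v in x)


def feature(x):
    """Illustrative feature T(x): total energy of the sample (an assumption)."""
    return sum(v * v for v in x)


def log_pdf_z_h0(z, n):
    """log p(z|H0): chi-squared density with n degrees of freedom, for z > 0."""
    k = 0.5 * n
    return (k - 1.0) * math.log(z) - 0.5 * z - k * math.log(2.0) - math.lgamma(k)


# Evaluate both reference densities for one made-up input vector.
x = [0.3, -1.2, 0.8, 2.1]
z = feature(x)
print(log_pdf_x_h0(x), log_pdf_z_h0(z, len(x)))
```

Working with log densities keeps the evaluation numerically stable for large $N$, where the densities themselves can underflow.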