In PDF projection, we assume that the PDF of the features
is known. Unless there is a need
to identify special hypotheses, it is normally denoted by
$g(\mathbf{z})$, where $\mathbf{z}$ denotes the features.
In practice, the feature PDF $g(\mathbf{z})$
is estimated from available training data,
or may be given.
If the feature $\mathbf{z}$ is a fixed-size vector,
$g(\mathbf{z})$ can be
modeled as a Gaussian mixture (Section 13.2).
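
As a rough illustration of the fixed-size case, the sketch below evaluates the log of a Gaussian mixture PDF at one feature vector. It is a minimal example, not the method of Section 13.2: the function names, the restriction to diagonal covariances, and the parameter values are all assumptions made here for brevity, and the mixture parameters are taken as already estimated or given.

```python
import math

def log_gaussian_diag(z, mean, var):
    """Log-density at z of a Gaussian with diagonal covariance (assumed form)."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (zi - m) ** 2 / v)
        for zi, m, v in zip(z, mean, var)
    )

def gmm_log_pdf(z, weights, means, variances):
    """Log of the mixture PDF at z, combined with log-sum-exp for stability."""
    terms = [
        math.log(w) + log_gaussian_diag(z, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

# Hypothetical two-component mixture over 2-dimensional features.
weights = [0.6, 0.4]
means = [[0.0, 0.0], [3.0, 3.0]]
variances = [[1.0, 1.0], [0.5, 2.0]]
log_g = gmm_log_pdf([0.5, -0.2], weights, means, variances)
```

Working in the log domain keeps the result usable when the feature dimension is large and the individual component densities are very small.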
If $\mathbf{z}$ consists of a sequence of feature vectors,
the PDF $g(\mathbf{z})$
must be the joint distribution
of the entire sequence, typically calculated under
a Markov assumption using a hidden Markov model
and the forward procedure (Section 13.3).
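
The joint log-likelihood of a feature sequence under a hidden Markov model can be sketched with a scaled forward recursion, as below. This is a minimal illustration and not the implementation of Section 13.3: the function name and argument layout are assumptions, and the per-state emission densities are assumed to have been evaluated beforehand (for example, with a Gaussian mixture per state).

```python
import math

def hmm_log_likelihood(pi, A, B):
    """Scaled forward procedure for the joint log-likelihood of a sequence.

    pi: initial state probabilities, length N
    A:  A[i][j] = probability of moving from state i to state j
    B:  B[t][j] = emission density of the t-th feature vector in state j
    """
    n = len(pi)
    alpha = [pi[j] * B[0][j] for j in range(n)]
    log_like = 0.0
    for t in range(len(B)):
        if t > 0:
            # Propagate one step through the transition matrix, then weight
            # each state by how well it explains the current feature vector.
            alpha = [
                sum(alpha[i] * A[i][j] for i in range(n)) * B[t][j]
                for j in range(n)
            ]
        # Rescale so alpha sums to one; the log of the scale factors
        # accumulates to the log of the joint PDF.
        scale = sum(alpha)
        log_like += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_like
```

The rescaling at each step prevents numerical underflow on long sequences, which would otherwise occur because the unscaled forward variables shrink geometrically with the sequence length.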
The input data may also be segmented.
In this case, the input $\mathbf{x}$ can represent a single segment or sample, or else
the entire data record, depending on the application.
What matters is consistency: if the input $\mathbf{x}$
represents the entire data record, then the feature $\mathbf{z}$ must represent
the collection of all feature vectors extracted from $\mathbf{x}$,
and if $\mathbf{x}$ represents one segment, then $\mathbf{z}$ must be the
feature extracted from that segment or sample.
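
The two conventions can be made concrete with a small sketch. Everything in it is hypothetical: the segment length, the choice of mean energy as the feature, and the variable names are assumptions for illustration only. The point is that the input and the feature are always paired at the same level, either segment with segment feature, or full record with the full collection of segment features.

```python
def segment(record, length):
    """Split a data record into non-overlapping segments of equal length."""
    return [record[i:i + length] for i in range(0, len(record) - length + 1, length)]

def energy(seg):
    """Hypothetical feature: mean energy of one segment."""
    return sum(s * s for s in seg) / len(seg)

record = [0.1, -0.4, 0.3, 0.8, -0.2, 0.5, 0.0, -0.7]
segments = segment(record, 4)

# Segment-level convention: each input is one segment, paired with its own feature.
pairs = [(x, energy(x)) for x in segments]

# Record-level convention: the input is the whole record, and the feature is
# the collection of all features extracted from it.
x_record = record
z_record = [energy(x) for x in segments]
```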