There can be no doubt that the need is increasing
for better, more accurate, and
more meaniful statistical models for high-dimensional data.
From machine learning to data mining,
to big data analysis, data dimension is increasing.
Classical decision theory was unable to cope with the
increasing dimension because it required knowing or estimating
the likelihood functions for each class assumption.
Learning theory  argues that
classical decision theory needlessly solves the more general problem
of PDF estimation, so it is better to estimate the
class posterior probabilities directly.
This idea gave rise to the successful discriminative methods, known today as
neural networks (multi-layer perceptrons), support vector machines, and the like.
But discriminative methods may be reaching their limits.
Despite the success of discriminative methods,
generative methods have many advantages - they can generalize better to un-forseen changes
in data make-up, are modular (by class), can make better use of
unlabeled data, and can be easily interrogated - to see what they
have learned about a given class.
In fact, generative methods, which
took a back-seat to the popular discriminative
methods, are now seeing a re-birth,
for example, in a form of Bayesian belief networks
and deep belief networks (DBN) .
These new generative models rival or exceed the performance
of discriminative models.
But generative methods are particularly affected by high dimension.
Classical theory requires a common data space for decisions,
giving rise to the dimensionality curse: the compromise
between PDF estimation error at high dimension,
or insufficient information at low dimension.
This has forced practitioners of generative methods
to seek refuge in a lower-dimensional feature space
and discard the original high-dimensional data,
what can be seen as an admission of defeat!
So how can generative methods be extended to higher dimensions
without throwing out the original data?
The method of PDF projection extends classical
theory so that it does not require the use of a common feature space,
and may utilize the information in multiple feature sets without
suffering an increase in the feature dimension.
Despite the compelling argument of PDF projection,
it suffers from many conceptual and implementational pitfalls,
so has made few inroads into recent literature.
The purpose of this book is to offer researchers a sort of quick-start
quide and ready-to-use software to avoid these pitfalls and speed the way to