Purpose of this book

There can be no doubt that the need is increasing for better, more accurate, and more meaningful statistical models for high-dimensional data. From machine learning to data mining to big-data analysis, data dimension is increasing. Classical decision theory was unable to cope with the increasing dimension because it required knowing or estimating the likelihood function for each class hypothesis. Learning theory [1] argues that classical decision theory needlessly solves the more general problem of PDF estimation, and that it is better to estimate the class posterior probabilities directly. This idea gave rise to the successful discriminative methods known today as neural networks (multi-layer perceptrons), support vector machines, and the like. But discriminative methods may be reaching their limits.

Despite the success of discriminative methods, generative methods have many advantages: they can generalize better to unforeseen changes in the make-up of the data, are modular (by class), can make better use of unlabeled data, and can be easily interrogated to see what they have learned about a given class. In fact, generative methods, which took a back seat to the popular discriminative methods, are now seeing a rebirth, for example in the form of Bayesian belief networks and deep belief networks (DBNs) [2]. These new generative models rival or exceed the performance of discriminative models. But generative methods are particularly affected by high dimension. Classical theory requires a common data space for decisions, giving rise to the curse of dimensionality: the compromise between PDF estimation error at high dimension and insufficient information at low dimension. This has forced practitioners of generative methods to seek refuge in a lower-dimensional feature space and discard the original high-dimensional data, which can be seen as an admission of defeat! So how can generative methods be extended to higher dimensions without throwing out the original data?

The method of PDF projection extends classical theory so that it does not require a common feature space, and it can utilize the information in multiple feature sets without suffering an increase in feature dimension. Despite the compelling argument for PDF projection, the method suffers from many conceptual and implementational pitfalls, and so has made few inroads into the recent literature. The purpose of this book is to offer researchers a quick-start guide and ready-to-use software to avoid these pitfalls and speed the way to successful application.
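As a brief preview of the idea (the notation below is a sketch for this preface, not a substitute for the formal development in the chapters that follow): given a feature transformation $z = T(x)$ and a reference hypothesis $H_0$ under which both the raw-data PDF $p(x \mid H_0)$ and the induced feature PDF $p(z \mid H_0)$ are known, a PDF estimate $\hat{p}(z)$ made in the low-dimensional feature space can be projected back to a valid PDF on the original high-dimensional data space:

```latex
\hat{p}(x) \;=\; \frac{p(x \mid H_0)}{p\big(T(x) \mid H_0\big)} \; \hat{p}\big(T(x)\big)
```

In this way the estimation is carried out where it is tractable, in the feature space, while the resulting model remains a proper PDF on the original data, so that no information-bearing raw data need be discarded.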

Baggenstoss 2017-05-19