The problem we address is extremely broad and encompasses many related fields in statistics. We consider $ {\bf x}$ to be a raw (unprocessed) data sample of high dimension. By raw and unprocessed, we mean that no intentional dimension reduction or filtering has yet been applied in which useful information could have been lost. For example, $ {\bf x}$ could be a time-series (a sampled recording of acoustic or other sensor data), an image, or another type of high-dimensional measurement. Assume there are one or more competing statistical hypotheses concerning $ {\bf x}$, denoted by the hypothesis index $ H$, which can take discrete values (fixed hypotheses) such as $ H\in\{H_1, H_2, \ldots\}$, or continuous values (parameters). Our goal is to create improved generative models, written $ p({\bf x}\vert H)$. These generative models can be used for numerous purposes, including inference (creating generative classifiers), combination with discriminative models to form hybrid discriminative/generative classifiers, and sampling methods. By ``sampling methods'', we mean applications where random samples from $ p({\bf x}\vert H)$ are required. This could be for simulations, or for Monte Carlo methods, which are stochastic methods for approximating integrals.
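
As a brief illustration of two of the uses named above, the following sketch writes out a generative classifier and a Monte Carlo approximation in standard textbook form. The prior probabilities $ P(H_k)$, the test function $ f$, and the sample count $ N$ are assumptions introduced only for this illustration and do not appear in the text above.

```latex
% Generative classifier: choose the hypothesis with the largest posterior,
% which by Bayes' rule is proportional to likelihood times prior.
% P(H_k) is an assumed prior probability for hypothesis H_k.
\[
  \hat{H} \;=\; \arg\max_{k}\; p({\bf x}\vert H_k)\, P(H_k)
\]

% Monte Carlo approximation of an integral (expectation) under p(x|H):
% f is an arbitrary (assumed) integrable function, and x_1, ..., x_N are
% N independent random samples drawn from p(x|H).
\[
  \int f({\bf x})\, p({\bf x}\vert H)\, d{\bf x}
  \;\approx\; \frac{1}{N} \sum_{n=1}^{N} f({\bf x}_n),
  \qquad {\bf x}_n \sim p({\bf x}\vert H)
\]
```

Both expressions depend on the generative model in different ways: the classifier requires that $ p({\bf x}\vert H)$ can be evaluated at a given $ {\bf x}$, while the Monte Carlo estimate requires that samples can be drawn from it.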

Baggenstoss 2017-05-19