Conditional Estimation in General

Let the data vector ${\bf z}$ be composed of two parts ${\bf x}$ and ${\bf y}$:

$\displaystyle {\bf z}= \left[\begin{array}{c} {\bf x}\\ {\bf y}\end{array} \right].
$

We have available training samples of ${\bf z}$; in the future, however, only ${\bf y}$ will be available, from which we would like to compute estimates of ${\bf x}$. We will shortly see that the GM density facilitates the computation of the conditional mean, or minimum mean square error (MMSE), estimator of ${\bf x}$. The conditional mean estimator is the expected value of ${\bf x}$ conditioned on ${\bf y}$ taking a specific (measured) value, i.e.,

$\displaystyle \hat{{\bf x}}={\bf E}({\bf x}\vert{\bf y})
= \int_{\bf x}\; {\bf x}\; p({\bf x}\vert{\bf y}) \; d{\bf x}.
$
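
As a concrete numerical sketch (not from the text), the conditional mean can be approximated on a grid for a hypothetical two-component Gaussian mixture over a scalar pair $(x,y)$; all parameter values below are illustrative assumptions.

```python
import numpy as np

# Hypothetical 2-component Gaussian mixture over a scalar pair (x, y).
# All parameters are illustrative assumptions, not taken from the text.
means   = np.array([[-1.0, -1.0], [2.0, 1.5]])  # per-component [mu_x, mu_y]
sigma   = np.array([0.8, 0.6])                  # per-component isotropic std dev
weights = np.array([0.4, 0.6])                  # mixture weights (sum to 1)

def p_joint(x, y):
    """Evaluate the joint mixture density p(x, y)."""
    p = 0.0
    for (mx, my), s, w in zip(means, sigma, weights):
        p += w * np.exp(-((x - mx) ** 2 + (y - my) ** 2) / (2 * s ** 2)) \
             / (2 * np.pi * s ** 2)
    return p

# Conditional mean E(x | y = y0) by grid approximation of the integral.
# Since p(x | y0) is proportional to p(x, y0), the normalizer (and the
# grid spacing) cancels in the ratio of the two sums.
y0 = 1.0
x_grid = np.linspace(-6.0, 6.0, 2001)
joint = p_joint(x_grid, y0)
x_hat_mmse = np.sum(x_grid * joint) / np.sum(joint)
print(f"MMSE estimate E(x | y={y0}): {x_hat_mmse:.3f}")
```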

The maximum a posteriori (MAP) estimator is given by

$\displaystyle \hat{{\bf x}}=\arg\max_{\bf x}\; p({\bf x}\vert{\bf y}) .
$
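
Continuing the sketch above, the MAP estimate is the mode of the same unnormalized posterior, obtained here as a simple grid argmax:

```python
# Continuing the sketch above: the MAP estimate is the mode of p(x | y0).
# The normalizer again cancels, so the argmax of p(x, y0) suffices.
x_hat_map = x_grid[np.argmax(joint)]
print(f"MAP estimate arg max_x p(x | y={y0}): {x_hat_map:.3f}")
```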

Both the MAP and MMSE estimators are operations performed on the posterior density $p({\bf x}\vert{\bf y})$: the MMSE estimator is its mean, while the MAP estimator is its mode. Which estimator is most appropriate depends on the problem. Suffice it to say that the distribution $p({\bf x}\vert{\bf y})$ expresses all the knowledge we have about ${\bf x}$ after having measured ${\bf y}$.
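
As a worked special case (a standard result, stated here for reference rather than derived in this section): if ${\bf z}$ is a single jointly Gaussian vector with mean and covariance partitioned conformally with ${\bf x}$ and ${\bf y}$,

$\displaystyle \boldsymbol{\mu}= \left[\begin{array}{c} \boldsymbol{\mu}_x\\ \boldsymbol{\mu}_y\end{array} \right],
\qquad
{\bf\Sigma}= \left[\begin{array}{cc} {\bf\Sigma}_{xx} & {\bf\Sigma}_{xy}\\ {\bf\Sigma}_{yx} & {\bf\Sigma}_{yy}\end{array} \right],
$

then $p({\bf x}\vert{\bf y})$ is itself Gaussian, its mean and mode coincide, and both estimators reduce to the linear form

$\displaystyle \hat{{\bf x}}= \boldsymbol{\mu}_x + {\bf\Sigma}_{xy}{\bf\Sigma}_{yy}^{-1}\,({\bf y}-\boldsymbol{\mu}_y).
$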