The Saddle Point Approximation

As we stated in the last section, the types of transformations and reference hypotheses for which analytic expressions are available are limited. For a broader class of statistics, the exact moment generating function (MGF) is often known. The problem is that inversion of the MGF transformation to find the exact PDF may not be possible in closed form. For these cases, we can use the saddle point approximation (SPA) which provides accurate tail PDF estimates. Steven Kay was the first to recommend the saddle point Approximation as a solution to this problem in CSM. Later, it was futher developed by Nuttall [16]. It should be kept in mind that although the SPA is an approximation, the accuracy is not degraded in the PDF tails. On the contrary, it is extremely well adapted to the task at hand, even for input data very far from the central peak of the reference hypothesis PDF. This is because the SPA approximates the shape of the integrand, not the value (height) of it. There are limits, however and we find that the SPA, which is a recursive search for the saddle point itself, may suffer from convergence problems. This problem is solved using the data normalization, which is a special case of a floating reference hypothesis and is described in section 2.3.4. Further details about the SPA are provided in Appendix 16.6.

Let feature $ {\bf z}\in{\cal R}^D$. The moment-generating function for $ p({\bf z})$ is given by

$\displaystyle g_z($$ \lambda$$\displaystyle ) = {\cal E}\left\{\exp\left(\mbox{\boldmath $\lambda$}^\prime {\bf z}\right)\right\}$

for $ D$-dimensional Laplace transform variable $ \lambda$. Then, $ {\bf z}$ is obtained by the inverse Laplace transform:

$\displaystyle p_z({\bf z}) = \frac{1}{(j2\pi)^M}\int_{C} \exp\left(-\mbox{\bold...
...}\right) \; g_z(\mbox{\boldmath $\lambda$})\;{\rm d}\mbox{\boldmath $\lambda$},$

where $ j=\sqrt{-1}.$ The contour $ C$ is parallel to the imaginary axis in each of the $ M$ dimensions of $ \lambda$. The joint cumulant generating function (CGF) is

$\displaystyle c_z($$ \lambda$$\displaystyle ) = \log g_z($$ \lambda$$\displaystyle ).$

For a specified $ {\bf z}$, the saddle point is that real point $ \hat{\mbox{\boldmath $\lambda$}}({\bf z})$ where all $ D$ partial derivatives satisfy

$\displaystyle \left.\frac{\partial c_z(\mbox{\boldmath $\lambda$})}{\partial \l...
...a_i}\right\vert _{\hat{\mbox{\boldmath $\lambda$}}}=z_i,
\;\;\; 1\leq i \leq D.$

The saddle point may be found iteratively using the recursion

$ \lambda$$\displaystyle _{n+1}=$$ \lambda$$\displaystyle _n + {\bf C}_z^{-1}($$ \lambda$$\displaystyle _n)\; \left({\bf z}-{\bf c}_z^\lambda(\mbox{\boldmath $\lambda$}_n)\right),$

where $ {\bf c}_z^\lambda($$ \lambda$$ )$ is the gradient vector of $ c_z($$ \lambda$$ )$ w/r to $ \lambda$, and $ {\bf C}_z(\lambda)$ is the $ D\times D$ matrix of second partial derivatives

$\displaystyle {\bf C}_z(\lambda) \stackrel{\mbox{\tiny $\Delta$}}{=}\left[ \frac{\partial^2c_z(\lambda)}{\partial \lambda_l \partial \lambda_m}\right].$

Once the saddle point $ \hat{\mbox{\boldmath $\lambda$}}({\bf z})$ is found, the saddle point approximation is given by

$\displaystyle p_z({\bf z}) \simeq \frac{\exp\left\{ c_z(\hat{\mbox{\boldmath$\l...
...;\;\; \hat{\mbox{\boldmath$\lambda$}}=\hat{\mbox{\boldmath$\lambda$}}({\bf z}).$ (2.12)

Example 2   Let $ {\bf x}$ be a set of $ N$ samples of positive-valued data, such as intensity or power spectrum measurements. Let $ {\bf z}={\bf A}^\prime {\bf x}$, where matrix $ {\bf A}$ is $ N\times D$. This is a simple linear operation producing the $ D$-dimensional feature $ {\bf z}$. To compute the J-function for this feature transformation, we need to select a reference hypothesis $ p({\bf x}\vert H_0)$ for which we can compute the feature PDF $ p({\bf z}\vert H_0)$. We are tempted to use the Gaussian PDF since the Gaussian PDF of $ {\bf z}$ would be trivial to derive. Although (2.2) would result in a valid projected PDF, $ G({\bf x};H_0,T,g)$ would have support for negative values of $ {\bf x}$, which would violate our prior knowledge that $ {\bf x}$ was positive. Other reasons why this is a bad choice of $ H_0$ are discussed on chapter 3. A much better choice is the exponential PDF (1/2 times the standard $ \chi^2(2)$ random variable : chi-square distribution with 2 degrees of freedom, scaled by 0.5)

$\displaystyle p({\bf x}\vert H_0)=\prod_i e^{-x_i}.$

Unfortunately, the PDF $ p({\bf z}\vert H_0)$ is not known in closed form. This problem was adressed in 2000 by Steven Kay who suggested the use of the saddle point approximation (SPA) [16]. Although $ p({\bf z}\vert H_0)$ is not known in closed form, the moment generating function (MGF) is known. Is is just a matter of numerical inversion of the MGF. We address this problem in detail in Section 16.6. The function software/pdf_A_chisq_m.m implements the algorithm that computes $ \log p({\bf z}\vert H_0)$.

To test the approach, we generated data $ {\bf x}$ as $ N$ independent samples of from the $ \chi^2(2)$ distribution, for $ N=8$, then used the matrix

$\displaystyle {\bf A}=\left[ \begin{array}{ll} 1 & 0\\
1 & 1\\
1 & 2\\
1 & 3\\
\vdots & \vdots \\
1 & N-1

We compared the histogram of the values of $ {\bf z}$ with the SPA for each histogram bin. Figure 2.2 shows the result.

Figure 2.2: Comparing histogram with PDF approximated using SPA.

Baggenstoss 2017-05-19