Statistical Model

Assume there are $ L$ tonals in AR noise of order $ P$. Each tonal has $ 3$ parameters: frequency, amplitude, and phase, or equivalently frequency and a complex amplitude with two components (in-phase and quadrature). Together with the $ P+1$ AR parameters, this gives a grand total of $ 3L+P+1$ parameters. Let the data be written

$\displaystyle x_t = y_t + e_t,$

where $ t$ ranges from $ 1$ to $ N$, $ e_t$ is auto-regressive noise and

$\displaystyle y_t = w_t \left\{ \sum_{i=1}^L \left\{
\alpha_i c_{t,i} + \beta_i s_{t,i}\right\}\right\}$

is the deterministic (tonal) component, and

$\displaystyle c_{t,i} = \cos\left(2\pi f_i (t-N/2-1)\right),$

$\displaystyle s_{t,i} = \sin\left(2\pi f_i (t-N/2-1)\right),$

and $ w_t$ are the time-series window function coefficients (Hanning, etc.).
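The windowed tonal component can be sketched numerically as follows. All of the concrete values here (number of tonals, frequencies, amplitudes, and the choice of a Hann window) are illustrative assumptions, not taken from the text.

```python
import numpy as np

# Minimal sketch of the windowed tonal model y_t (illustrative values).
N = 64
L = 2                                   # number of tonals
t = np.arange(1, N + 1)                 # t = 1..N
f = np.array([0.10, 0.23])              # tonal frequencies (cycles/sample), assumed
alpha = np.array([1.0, 0.5])            # in-phase amplitudes, assumed
beta = np.array([0.3, -0.7])            # quadrature amplitudes, assumed
w = np.hanning(N)                       # window coefficients w_t (Hann)

# c_{t,i} and s_{t,i}, with the time origin centered at N/2 + 1
c = np.cos(2 * np.pi * np.outer(t - N / 2 - 1, f))   # shape (N, L)
s = np.sin(2 * np.pi * np.outer(t - N / 2 - 1, f))   # shape (N, L)

# y_t = w_t * sum_i ( alpha_i c_{t,i} + beta_i s_{t,i} )
y = w * (c @ alpha + s @ beta)
```

Note that the window forces the deterministic component to taper to zero at the edges of the record.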

In a more compact matrix notation, we let

$\displaystyle {\bf H} =
\left[ \begin{array}{cccccc}
c_{1,1} & \cdots & c_{1,L} & s_{1,1} & \cdots & s_{1,L} \\
c_{2,1} & \cdots & c_{2,L} & s_{2,1} & \cdots & s_{2,L} \\
\vdots & & \vdots & \vdots & & \vdots \\
c_{N,1} & \cdots & c_{N,L} & s_{N,1} & \cdots & s_{N,L} \\
\end{array} \right],
\qquad
{\bf b} =
\left[ \begin{array}{l}
\alpha_{1} \\
\vdots \\
\alpha_{L} \\
\beta_{1} \\
\beta_{2} \\
\vdots \\
\beta_{L} \\
\end{array} \right],$

$\displaystyle {\bf y}= {\bf W} {\bf H}{\bf b},$

where $ {\bf W}$ is the diagonal matrix of window function coefficients.
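The matrix form can be checked against the elementwise sum directly. Again, the dimensions, frequencies, and amplitudes below are illustrative assumptions:

```python
import numpy as np

# Matrix form y = W H b, checked against the elementwise tonal sum.
# All numeric values are illustrative assumptions.
N, L = 64, 2
t = np.arange(1, N + 1)
f = np.array([0.10, 0.23])
alpha = np.array([1.0, 0.5])
beta = np.array([0.3, -0.7])
w = np.hanning(N)

c = np.cos(2 * np.pi * np.outer(t - N / 2 - 1, f))   # (N, L)
s = np.sin(2 * np.pi * np.outer(t - N / 2 - 1, f))   # (N, L)

H = np.hstack([c, s])                 # N x 2L: cosine columns, then sine columns
b = np.concatenate([alpha, beta])     # b = [alpha_1..alpha_L, beta_1..beta_L]'
W = np.diag(w)                        # diagonal matrix of window coefficients

y_matrix = W @ H @ b
y_direct = w * (c @ alpha + s @ beta)
assert np.allclose(y_matrix, y_direct)
```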

The log-likelihood function is given by

$\displaystyle \log p({\bf x}) = -\frac{N}{2}\log(2\pi) -\frac{1}{2}\log\det({\bf R}) -\frac{1}{2} ({\bf x}-{\bf y})^\prime {\bf R}^{-1} ({\bf x}-{\bf y}),$ (8.11)

where $ {\bf R}$ is the auto-correlation matrix for an AR process (see Section 9.1.3).

We break down $ {\bf R}^{-1}$ into its well-known Cholesky factor form

$\displaystyle {\bf R}^{-1} = \frac{1}{\sigma^2} {\bf A}^\prime {\bf A},$

where $ {\bf A}$ is the prewhitening matrix, a banded matrix formed from the prediction error filter. Each row of $ {\bf A}$ is formed from the prediction error filter $ {\bf a}=[0, 0, \ldots, 0, a_P, \ldots, a_2, a_1, 1,
0, \ldots 0]$, where the $ 1$ occupies the main diagonal. Matrix $ {\bf A}$ has a determinant of 1, since it is lower-triangular with ones on the diagonal. For the non-circular (stationary process) case, the prediction error filter is truncated on rows less than row $ P+1$, so those coefficients must be estimated from the corresponding lower-order AR model.

We can avoid the complexities of these end effects, especially since we are assuming that a window function is applied to the data, by assuming a circular AR process (see Sections 9.1.2, 9.2.4). Then, matrix $ {\bf A}$ is circulant, with each row containing the full prediction error filter, wrapped around. Furthermore, the equation $ {\rm det}({\bf R})=(\sigma^2)^N$ holds exactly. We can therefore re-write (8.11) as:
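The circulant prewhitening matrix can be sketched as follows; the AR(1) prediction error filter used here is an illustrative assumption. For a minimum-phase filter, the determinant is numerically indistinguishable from 1 at this record length:

```python
import numpy as np

# Circulant prewhitening matrix A built from a wrapped prediction
# error filter. The AR(1) coefficient is an illustrative assumption.
N = 64
P = 1
a = np.array([1.0, -0.5])             # PEF [1, a_1] for an assumed AR(1) process

# First row: 1 on the diagonal, remaining filter taps wrapped around the end.
row = np.zeros(N)
row[0] = a[0]
row[-P:] = a[1:][::-1]                # place a_1 ... a_P, wrapped

# Each row is the previous row circularly shifted: A[i, j] = row[(j - i) mod N]
A = np.array([[row[(j - i) % N] for j in range(N)] for i in range(N)])

det_A = np.linalg.det(A)              # ~1 for a minimum-phase filter
```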

$\displaystyle \log p({\bf x}) = -\frac{N}{2}\log(2\pi\sigma^2) -\frac{1}{2\sigma^2} ({\bf x}-{\bf y})^\prime {\bf A}^\prime {\bf A} ({\bf x}-{\bf y}).$ (8.12)

The frequency-domain equivalent of (8.12) is

$\displaystyle \log p({\bf x}) = - \frac{1}{2} \sum_{k=0}^{N-1} \; \left\{ \log \left(2\pi \sigma^2 \right) + \frac{\vert(X_k-Y_k)A_k\vert^2}{N \sigma^2 } \right\},$ (8.13)

where $ X_k$ and $ Y_k$ are the Fourier coefficients of $ {\bf x}$ and $ {\bf y}$, respectively. In obtaining (8.13), we used the fact that

$\displaystyle \sum_{k=0}^{N-1} \log \vert A_k\vert^2=0.$

This is just the frequency-domain equivalent of saying that the determinant of the circulant matrix $ {\bf A}$, with all ones on the diagonal, is 1. Although written in terms of Fourier coefficients, the expression is a PDF of $ {\bf x}$. The equivalence of (8.13) and (8.12) can be readily seen.
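The equivalence of the time-domain and frequency-domain forms can be verified numerically by Parseval's relation. This sketch uses an assumed AR(1) prediction error filter, random data, and a zero tonal component for simplicity:

```python
import numpy as np

# Numerical check that the time-domain log-likelihood (8.12) equals the
# frequency-domain form (8.13). The filter and data are illustrative.
rng = np.random.default_rng(0)
N = 64
sigma2 = 0.8
a1 = -0.5                                # assumed AR(1) PEF coefficient

# Circulant prewhitening matrix A from the wrapped prediction error filter
row = np.zeros(N)
row[0], row[-1] = 1.0, a1
A = np.array([[row[(j - i) % N] for j in range(N)] for i in range(N)])

x = rng.standard_normal(N)
y = np.zeros(N)                          # zero tonal component for simplicity
r = x - y

# Time-domain log-likelihood, as in (8.12)
ll_time = -N / 2 * np.log(2 * np.pi * sigma2) - (r @ A.T @ A @ r) / (2 * sigma2)

# Frequency-domain log-likelihood, as in (8.13)
Xk = np.fft.fft(r)                       # Fourier coefficients of x - y
Ak = np.fft.fft(row)                     # eigenvalues of the circulant matrix A
ll_freq = -0.5 * np.sum(
    np.log(2 * np.pi * sigma2) + np.abs(Xk * Ak) ** 2 / (N * sigma2)
)

assert np.allclose(ll_time, ll_freq)
# The sum of log |A_k|^2 vanishes, consistent with det(A) = 1.
assert abs(np.sum(np.log(np.abs(Ak) ** 2))) < 1e-9
```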

Baggenstoss 2017-05-19