Statistical Model

Let's assume there are $L$ tonals in AR noise of order $P$. There are $3$ parameters for each tonal: frequency, amplitude, and phase, or equivalently frequency and a complex amplitude with two components (in-phase and quadrature). Together with the $P+1$ AR parameters ($P$ prediction filter coefficients plus the innovation variance $\sigma^2$), this gives a grand total of $3L+P+1$ parameters. Let the data be written

$\displaystyle x_t = y_t + e_t,$

where $t$ ranges from $1$ to $N$, $e_t$ is auto-regressive noise and

$\displaystyle y_t = w_t \sum_{i=1}^L \left\{
\alpha_i c_{t,i} + \beta_i s_{t,i}\right\}$

is the deterministic (tonal) component, and

$\displaystyle c_{t,i} = \cos\left(2\pi f_i (t-N/2-1)\right),$

$\displaystyle s_{t,i} = \sin\left(2\pi f_i (t-N/2-1)\right),$

and $w_t$ are the time-series window function coefficients (Hanning, etc.).

In a more compact matrix notation, we let

$\displaystyle {\bf H} =
\left[ \begin{array}{cccccc}
c_{1,1} & \cdots & c_{1,L} & s_{1,1} & \cdots & s_{1,L} \\
c_{2,1} & \cdots & c_{2,L} & s_{2,1} & \cdots & s_{2,L} \\
\vdots & & \vdots & \vdots & & \vdots \\
c_{N,1} & \cdots & c_{N,L} & s_{N,1} & \cdots & s_{N,L}
\end{array} \right], \qquad
{\bf b} =
\left[ \begin{array}{l}
\alpha_{1} \\
\vdots \\
\alpha_{L} \\
\beta_{1} \\
\beta_{2} \\
\vdots \\
\beta_{L}
\end{array} \right],$

$\displaystyle {\bf y}= {\bf W} {\bf H}{\bf b},$

where ${\bf W}$ is the diagonal matrix of window function coefficients.
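As a minimal numerical sketch, the construction ${\bf y}={\bf W}{\bf H}{\bf b}$ can be assembled directly. The frequencies, amplitudes, and sizes below are illustrative values chosen for the example, not taken from the text:

```python
import numpy as np

N, L = 64, 2
t = np.arange(1, N + 1)
f = np.array([0.10, 0.23])        # assumed tonal frequencies (cycles/sample)
alpha = np.array([1.0, 0.5])      # assumed in-phase amplitudes
beta = np.array([0.0, -0.3])      # assumed quadrature amplitudes

# Columns of H are c_{t,i} followed by s_{t,i}, centred at t = N/2 + 1.
arg = 2 * np.pi * np.outer(t - N / 2 - 1, f)   # shape (N, L)
H = np.hstack([np.cos(arg), np.sin(arg)])      # N x 2L
b = np.concatenate([alpha, beta])              # [alpha_1..alpha_L, beta_1..beta_L]

W = np.diag(np.hanning(N))                     # Hanning window as a diagonal matrix
y = W @ H @ b                                  # windowed tonal component y_t
```

Each tonal contributes one cosine and one sine column, so ${\bf H}$ is $N \times 2L$ and the linear parameters enter only through ${\bf b}$.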

The log-likelihood function is given by

$\displaystyle \log p({\bf x}) =
-\frac{N}{2}\log(2\pi)
-\frac{1}{2}\log\det({\bf R})
-\frac{1}{2} ({\bf x}-{\bf y})^\prime {\bf R}^{-1} ({\bf x}-{\bf y}),$ (9.11)

where ${\bf R}$ is the auto-correlation matrix for an AR process (see Section 10.1.3).

We factor ${\bf R}^{-1}$ into its well-known Cholesky form

$\displaystyle {\bf R}^{-1} = \frac{1}{\sigma^2} {\bf A}^\prime {\bf A},$

where ${\bf A}$ is the prewhitening matrix, a banded matrix formed from the prediction error filter. Each row of ${\bf A}$ contains the prediction error filter ${\bf a}=[0, 0, \ldots, 0, a_P, \ldots, a_2, a_1, 1, 0, \ldots, 0]$, positioned so that the $1$ occupies the main diagonal. Matrix ${\bf A}$ has a determinant of 1, as it is lower-triangular with ones on the diagonal. For the non-circular (stationary-process) case, the prediction error filter is truncated in the first $P$ rows, so those coefficients must be estimated from the corresponding lower-order AR models. We can avoid the complexities of these end effects, especially since we are assuming that a window function is applied to the data, by assuming a circular AR process (see Sections 10.1.2, 10.2.4). Then, matrix ${\bf A}$ is circulant, with each row containing the full prediction error filter, wrapped around. Furthermore, the equation ${\rm det}({\bf R})=(\sigma^2)^N$ holds exactly. We can therefore rewrite (9.11) as:
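The circulant construction and the determinant claims can be checked numerically. The filter coefficients, $N$, and $\sigma$ below are illustrative (a stable first-order prediction error filter); for a stable filter the agreement is excellent at moderate $N$:

```python
import numpy as np

N, P, sigma = 16, 1, 1.0
a = np.array([1.0, -0.5])     # monic prediction error filter; assumed stable (pole at 0.5)

# Circulant prewhitening matrix: A[i, j] = col[(i - j) mod N], so each row
# holds the full prediction error filter wrapped around, with the 1 on the
# main diagonal.
col = np.zeros(N)
col[:P + 1] = a
A = np.array([[col[(i - j) % N] for j in range(N)] for i in range(N)])

detA = np.linalg.det(A)       # = 1 - 0.5**16 for this filter, very close to 1

# R = sigma^2 (A'A)^{-1}, so det(R) = sigma^(2N) / det(A)^2 ~ (sigma^2)^N
R = sigma**2 * np.linalg.inv(A.T @ A)
print(detA, np.linalg.det(R))
```

The determinant of a circulant matrix is the product of the DFT values of its first column, which is how the determinant condition reappears as a sum over $\log\vert A_k\vert^2$ in the frequency domain below.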

$\displaystyle \log p({\bf x}) =
-\frac{N}{2}\log(2\pi\sigma^2)
-\frac{1}{2\sigma^2} ({\bf x}-{\bf y})^\prime {\bf A}^\prime
{\bf A} ({\bf x}-{\bf y}).$ (9.12)

The frequency-domain equivalent of (9.12) is

$\displaystyle \log p({\bf x}) = - \frac{1}{2} \sum_{k=0}^{N-1} \;
\left\{ \log \left(2\pi \sigma^2 \right) +
\frac{\vert(X_k-Y_k)A_k\vert^2}{N \sigma^2 } \right\},$ (9.13)

where $X_k$ and $Y_k$ are the Fourier coefficients of ${\bf x}$ and ${\bf y}$, respectively. In obtaining (9.13), we used the fact that

$\displaystyle \sum_{k=0}^{N-1} \log \vert A_k\vert^2=0.$

This is just the frequency-domain equivalent of saying that the determinant of the circulant matrix ${\bf A}$, with all ones on the diagonal, is 1. Although written in terms of Fourier coefficients, the expression is a PDF of ${\bf x}$. The equivalence of (9.13) and (9.12) follows from Parseval's theorem: since ${\bf A}$ is circulant, multiplying by ${\bf A}$ in the time domain corresponds to multiplying the Fourier coefficients by $A_k$.
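The equivalence can also be confirmed numerically: ${\bf A}({\bf x}-{\bf y})$ is a circular convolution, so its DFT is $A_k(X_k-Y_k)$, and Parseval's theorem equates the two quadratic forms. The residual below is random, standing in for ${\bf x}-{\bf y}$; all sizes and coefficients are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
N, P, sigma = 32, 2, 0.7
a = np.array([1.0, -0.6, 0.2])          # assumed stable monic prediction error filter

# Circulant prewhitening matrix A, first column holding the filter.
col = np.zeros(N)
col[:P + 1] = a
A = np.array([[col[(i - j) % N] for j in range(N)] for i in range(N)])

r = rng.standard_normal(N)              # stand-in residual x - y

# Time-domain form (9.12)
ll_time = -N / 2 * np.log(2 * np.pi * sigma**2) - (r @ A.T @ A @ r) / (2 * sigma**2)

# Frequency-domain form (9.13): A_k is the DFT of the filter column
Ak, Rk = np.fft.fft(col), np.fft.fft(r)
ll_freq = -0.5 * np.sum(np.log(2 * np.pi * sigma**2) + np.abs(Ak * Rk)**2 / (N * sigma**2))

print(np.isclose(ll_time, ll_freq))     # the two forms agree
```

The agreement is exact up to floating-point error, since the DFT diagonalizes any circulant matrix.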
