Maximum Likelihood Parameter Estimation and the CR bound

Consider a general PDF that depends on $P$ parameters $\mbox{\boldmath$\theta$} = [\theta_1, \theta_2 \ldots \theta_P]^\prime$:

$\displaystyle p({\bf x}) = p({\bf x}; \mbox{\boldmath$\theta$}).$

The Fisher's information between any two parameters $\theta_1$ and $\theta_2$ is defined by

$\displaystyle I_{\theta_1, \theta_2} = - {\rm E}\left\{ \frac{\partial^2 \log p({\bf x};\mbox{\boldmath$\theta$}) }{\partial \theta_1 \partial \theta_2} \right\}.$ (17.13)
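As a numerical illustration of (17.13), the sketch below estimates the Fisher's information matrix by Monte Carlo and compares it with a known closed form. The model is an assumption chosen for illustration: $N$ i.i.d. Gaussian samples with parameters [mean, variance]. The function names and the finite-difference Hessian are likewise hypothetical helpers, not part of the text above.

```python
import numpy as np

def log_pdf(x, theta):
    """Log PDF of i.i.d. Gaussian samples x, with theta = [mean, variance]."""
    mu, var = theta
    n = x.size
    return -0.5 * n * np.log(2.0 * np.pi * var) - 0.5 * np.sum((x - mu) ** 2) / var

def hessian(f, theta, eps=1e-4):
    """Central-difference Hessian of the scalar function f at theta."""
    theta = np.asarray(theta, dtype=float)
    p = theta.size
    H = np.zeros((p, p))
    for i in range(p):
        for j in range(p):
            ei = np.zeros(p)
            ej = np.zeros(p)
            ei[i] = eps
            ej[j] = eps
            H[i, j] = (f(theta + ei + ej) - f(theta + ei - ej)
                       - f(theta - ei + ej) + f(theta - ei - ej)) / (4.0 * eps ** 2)
    return H

def fisher_information_mc(theta, n_samples, n_trials=2000, seed=0):
    """Average the negative Hessian of the log PDF over simulated data sets."""
    rng = np.random.default_rng(seed)
    mu, var = theta
    acc = np.zeros((2, 2))
    for _ in range(n_trials):
        x = rng.normal(mu, np.sqrt(var), size=n_samples)
        acc -= hessian(lambda t: log_pdf(x, t), theta)
    return acc / n_trials

def gaussian_fisher_exact(theta, n_samples):
    """Closed-form Fisher information for the assumed Gaussian model."""
    _, var = theta
    return np.array([[n_samples / var, 0.0],
                     [0.0, n_samples / (2.0 * var ** 2)]])
```

Calling `fisher_information_mc([1.0, 2.0], n_samples=10)` should agree with `gaussian_fisher_exact([1.0, 2.0], 10)` up to Monte Carlo error, which shrinks as `n_trials` grows.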

Collecting these values for all parameter pairs into the matrix ${\bf I}(\mbox{\boldmath$\theta$})$, we have the Fisher's information matrix. The Cramér-Rao lower bound states that the covariance matrix ${\bf C}$ of any joint unbiased estimator of the parameters $\mbox{\boldmath$\theta$}$ satisfies

$\displaystyle {\bf C}-{\bf I}^{-1}(\mbox{\boldmath$\theta$}) \succeq 0,$

that is, the difference is positive semidefinite, which in particular implies ${\rm det}\left\{{\bf C}-{\bf I}^{-1}(\mbox{\boldmath$\theta$})\right\} \geq 0$. This effectively means that ${\bf I}^{-1}(\mbox{\boldmath$\theta$})$ is the lower bound for the covariance of any unbiased estimator.
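The bound can be checked on a case where both matrices are known in closed form. The sketch below assumes, purely for illustration, $N$ i.i.d. Gaussian samples with parameters [mean, variance], estimated by the sample mean and the unbiased sample variance. For that model the two estimators are independent, with variances $\sigma^2/N$ and $2\sigma^4/(N-1)$. The function name `crlb_gap` is hypothetical.

```python
import numpy as np

def crlb_gap(var, n):
    """Return C - I^{-1} for the assumed Gaussian [mean, variance] example.

    C is the covariance of (sample mean, unbiased sample variance);
    I is the Fisher information matrix of n i.i.d. Gaussian samples.
    """
    fisher = np.array([[n / var, 0.0],
                       [0.0, n / (2.0 * var ** 2)]])
    cov = np.array([[var / n, 0.0],
                    [0.0, 2.0 * var ** 2 / (n - 1)]])
    return cov - np.linalg.inv(fisher)

gap = crlb_gap(var=2.0, n=10)
eigvals = np.linalg.eigvalsh(gap)            # gap is symmetric
is_psd = bool(np.all(eigvals >= -1e-12))     # tolerance for round-off
```

In this example the sample mean attains the bound, while the unbiased variance estimator exceeds it by $2\sigma^4/(N(N-1))$, so the gap is positive semidefinite but not positive definite.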

The inverse of the Fisher's information matrix is a good estimate of the parameter estimation error covariance and is useful for iterative optimization. Given a parameter estimate $\mbox{\boldmath$\theta$}_n$, the new estimate is obtained as

$\displaystyle \mbox{\boldmath$\theta$}_{n+1} = \mbox{\boldmath$\theta$}_n + {\bf I}^{-1}(\mbox{\boldmath$\theta$}_n) \; \mbox{\boldmath$\delta$},$ (17.14)

where

$\displaystyle \mbox{\boldmath$\delta$} = \left[ D(\theta_1) \; D(\theta_2) \ldots \right]^\prime$

is the gradient vector formed from the first partial derivatives

$\displaystyle D(\theta)\stackrel{\mbox{\tiny $\Delta$}}{=}\frac{\partial}{\partial \theta} \log p({\bf x}; \theta).$
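A minimal sketch of the iteration (17.14) follows, again assuming i.i.d. Gaussian data with parameters [mean, variance] as an illustrative model. The functions `score`, `fisher` and `fisher_scoring` are hypothetical names. The gradient and the Fisher's information matrix are written in closed form for this model, and the update solves a linear system rather than forming the inverse explicitly.

```python
import numpy as np

def score(x, theta):
    """Gradient vector delta = [D(mean), D(variance)] of the log PDF."""
    mu, var = theta
    n = x.size
    d_mu = np.sum(x - mu) / var
    d_var = -0.5 * n / var + 0.5 * np.sum((x - mu) ** 2) / var ** 2
    return np.array([d_mu, d_var])

def fisher(theta, n):
    """Fisher information matrix of n i.i.d. Gaussian samples."""
    _, var = theta
    return np.array([[n / var, 0.0],
                     [0.0, n / (2.0 * var ** 2)]])

def fisher_scoring(x, theta0, n_iter=10):
    """Iterate theta <- theta + I^{-1}(theta) * delta, as in (17.14)."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(n_iter):
        theta = theta + np.linalg.solve(fisher(theta, x.size), score(x, theta))
    return theta

rng = np.random.default_rng(0)
x = rng.normal(3.0, 2.0, size=500)       # simulated data, true variance 4
theta_hat = fisher_scoring(x, theta0=[0.0, 1.0])
```

For this particular model the iteration settles on the maximum likelihood estimates, the sample mean and the biased sample variance `x.var()`, after two steps. Other models generally need more iterations and may need step-size control.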

It is possible to optimize only subsets of the parameters as well. A parameter pair $\theta_1, \theta_2$ is updated according to

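$\displaystyle \left[ \begin{array}{l} \theta_1 \\ \theta_2 \end{array}\right]_{n+1} = \left[ \begin{array}{l} \theta_1 \\ \theta_2 \end{array}\right]_{n} + \left[ \begin{array}{ll} I_{\theta_1,\theta_1} & I_{\theta_1,\theta_2} \\ I_{\theta_2,\theta_1} & I_{\theta_2,\theta_2} \end{array}\right]^{-1} \; \left[ \begin{array}{l} D(\theta_1) \\ D(\theta_2) \end{array}\right].$ (17.15)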
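The pairwise update can be sketched as follows. The example assumes a hypothetical linear Gaussian model ${\bf x} = {\bf A}\mbox{\boldmath$\theta$} + {\bf w}$ with known noise variance $\sigma^2$, for which the Fisher's information matrix is ${\bf A}^\prime {\bf A}/\sigma^2$ and the gradient is ${\bf A}^\prime({\bf x}-{\bf A}\mbox{\boldmath$\theta$})/\sigma^2$. The helper `pair_update`, the matrix `A` and the chosen pair schedule are illustrative assumptions. The update uses the $2\times 2$ block of the Fisher's information matrix belonging to the selected pair.

```python
import numpy as np

def pair_update(theta, fisher, delta, i, j):
    """Update only parameters i and j using the 2x2 Fisher sub-matrix."""
    idx = [i, j]
    sub = fisher[np.ix_(idx, idx)]            # 2x2 block for the pair
    theta = theta.copy()
    theta[idx] += np.linalg.solve(sub, delta[idx])
    return theta

# Assumed linear Gaussian model with three parameters.
rng = np.random.default_rng(1)
A = rng.normal(size=(20, 3))
sigma2 = 0.5
x = A @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=np.sqrt(sigma2), size=20)

fisher = A.T @ A / sigma2                     # constant for this model
theta = np.zeros(3)
for sweep in range(50):
    for i, j in [(0, 1), (1, 2), (0, 2)]:
        delta = A.T @ (x - A @ theta) / sigma2   # gradient at current theta
        theta = pair_update(theta, fisher, delta, i, j)

theta_ls = np.linalg.lstsq(A, x, rcond=None)[0]  # joint ML solution
```

Because the log PDF of this model is quadratic in the parameters, each pair update maximizes it exactly over that pair, and cycling through the pairs approaches the joint least-squares solution `theta_ls`.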