Maximum Likelihood Parameter Estimation and the CR bound

Consider a general PDF that depends on a parameter vector $\mbox{\boldmath$\theta$} = [\theta_1, \theta_2 \ldots \theta_P]^\prime$:

$\displaystyle p({\bf x}) = p({\bf x}; \mbox{\boldmath$\theta$}).$

The Fisher information between any two parameters $\theta_1$ and $\theta_2$ is defined by

$\displaystyle I_{\theta_1, \theta_2} = - {\rm E}\left\{ \frac{\partial^2 \log p({\bf x};\mbox{\boldmath$\theta$}) }{\partial \theta_1 \partial \theta_2} \right\}.$

(17.13)
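As an illustration of (17.13), the sketch below evaluates the definition for a model that is not taken from the text: $N$ independent Gaussian samples with parameters $\theta_1 = \mu$ (mean) and $\theta_2 = \sigma^2$ (variance). The function names `fisher_exact` and `fisher_monte_carlo` are hypothetical. The expectation is approximated by averaging the second derivatives of the log PDF over simulated data and is compared with the known closed form for this model.

```python
import numpy as np


def fisher_exact(mu, var, n):
    """Closed-form Fisher information for n i.i.d. N(mu, var) samples.

    Parameters are ordered [mu, var]; the result does not depend on mu.
    """
    return np.array([[n / var, 0.0],
                     [0.0, n / (2.0 * var**2)]])


def fisher_monte_carlo(mu, var, n, trials=20000, seed=0):
    """Approximate -E{d^2 log p / d theta_i d theta_j} by simulation."""
    rng = np.random.default_rng(seed)
    x = rng.normal(mu, np.sqrt(var), size=(trials, n))
    r = x - mu                   # residuals, one row per simulated data set
    s1 = r.sum(axis=1)           # sum of residuals
    s2 = (r**2).sum(axis=1)      # sum of squared residuals
    # Second partial derivatives of the Gaussian log PDF w.r.t. (mu, var).
    h_mm = -n / var              # constant, does not depend on the data
    h_mv = (-s1 / var**2).mean()
    h_vv = (n / (2.0 * var**2) - s2 / var**3).mean()
    return -np.array([[h_mm, h_mv],
                      [h_mv, h_vv]])


if __name__ == "__main__":
    print(fisher_exact(1.0, 2.0, 10))
    print(fisher_monte_carlo(1.0, 2.0, 10))
```

The simulated matrix should agree with the closed form up to Monte Carlo error, which shrinks as `trials` grows.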

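Collecting these values for all parameter pairs into the matrix ${\bf I}(\mbox{\boldmath$\theta$})$, we have the Fisher information matrix. The Cramer-Rao lower bound states that the covariance matrix ${\bf C}$ of any unbiased joint estimator of the parameters $\mbox{\boldmath$\theta$}$ is such that

$\displaystyle {\bf C}-{\bf I}^{-1}(\mbox{\boldmath$\theta$}) \succeq 0,$

that is, the difference is positive semidefinite. This effectively means that ${\bf I}^{-1}(\mbox{\boldmath$\theta$})$ is the lower bound for the covariance of any unbiased estimator. In particular, the variance of the estimate of each $\theta_i$ is at least the $i$-th diagonal element of ${\bf I}^{-1}(\mbox{\boldmath$\theta$})$.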
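The bound can be checked numerically by testing that ${\bf C}-{\bf I}^{-1}$ has no negative eigenvalues. The sketch below does this for an assumed example that is not part of the text: $N$ independent Gaussian samples, with the sample mean and the unbiased sample variance as the joint estimator of $[\mu, \sigma^2]$. The covariance of that estimator is known in closed form, so no simulation is needed. All function names are hypothetical.

```python
import numpy as np


def crb_gaussian(var, n):
    """Inverse Fisher information for [mean, variance], n i.i.d. Gaussian samples."""
    fisher = np.array([[n / var, 0.0],
                       [0.0, n / (2.0 * var**2)]])
    return np.linalg.inv(fisher)


def estimator_covariance(var, n):
    """Exact covariance of [sample mean, unbiased sample variance] for Gaussian data.

    The two estimates are independent, so the off-diagonal terms are zero.
    """
    return np.array([[var / n, 0.0],
                     [0.0, 2.0 * var**2 / (n - 1)]])


def satisfies_crb(cov, bound, tol=1e-12):
    """True if cov - bound has no eigenvalue below -tol."""
    eigenvalues = np.linalg.eigvalsh(cov - bound)
    return bool(eigenvalues.min() >= -tol)


if __name__ == "__main__":
    bound = crb_gaussian(var=2.0, n=10)
    cov = estimator_covariance(var=2.0, n=10)
    print(np.diag(cov) - np.diag(bound))   # excess variance of each estimate
    print(satisfies_crb(cov, bound))
```

In this example the sample mean attains the bound exactly, while the unbiased sample variance exceeds it by a factor of $N/(N-1)$.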

The inverse of the Fisher information matrix is a good estimate of the parameter estimation error covariance and is also useful for iterative optimization. Given the current parameter estimate $\mbox{\boldmath$\theta$}_n$, the new estimate is obtained as

$\displaystyle \mbox{\boldmath$\theta$}_{n+1} = \mbox{\boldmath$\theta$}_n + {\bf I}^{-1}(\mbox{\boldmath$\theta$}_n) \; \mbox{\boldmath$\delta$},$

(17.14)

where

$\displaystyle \mbox{\boldmath$\delta$} = \left[ D(\theta_1) \; D(\theta_2) \ldots D(\theta_P) \right]^\prime$

is the gradient vector formed from the first partial derivatives, evaluated at $\mbox{\boldmath$\theta$}_n$,

$\displaystyle D(\theta_i)\stackrel{\mbox{\tiny $\Delta$}}{=}\frac{\partial}{\partial \theta_i} \log p({\bf x}; \mbox{\boldmath$\theta$}).$
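The iteration (17.14) can be sketched in a few lines. The example below assumes a model that is not taken from the text: independent Gaussian samples with unknown mean and variance, for which both the gradient and the Fisher information matrix have closed forms. The function names are hypothetical, and the linear system is solved directly rather than forming the inverse explicitly.

```python
import numpy as np


def score(x, theta):
    """Gradient of the Gaussian log PDF with respect to [mean, variance]."""
    mu, var = theta
    r = x - mu
    return np.array([r.sum() / var,
                     -x.size / (2.0 * var) + (r**2).sum() / (2.0 * var**2)])


def fisher_information(x, theta):
    """Fisher information matrix for x.size i.i.d. Gaussian samples."""
    _, var = theta
    n = x.size
    return np.array([[n / var, 0.0],
                     [0.0, n / (2.0 * var**2)]])


def fisher_scoring(x, theta0, iterations=5):
    """Repeat theta <- theta + I(theta)^-1 * delta for a fixed number of steps."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(iterations):
        delta = score(x, theta)
        # Solve I(theta_n) * step = delta instead of inverting I.
        step = np.linalg.solve(fisher_information(x, theta), delta)
        theta = theta + step
    return theta


if __name__ == "__main__":
    data = np.array([2.1, 1.9, 3.2, 2.7, 2.4, 1.6])
    print(fisher_scoring(data, theta0=[0.0, 1.0]))
```

For this particular model the iteration reaches the maximum likelihood estimate (sample mean and population variance) after two steps. Other models generally need more iterations and a reasonable starting point.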

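It is also possible to optimize only a subset of the parameters. A parameter pair $\theta_1, \theta_2$ is updated according to

$\displaystyle \left[ \begin{array}{l} \theta_1 \\ \theta_2 \end{array}\right]_{n+1} = \left[ \begin{array}{l} \theta_1 \\ \theta_2 \end{array}\right]_{n} + \left[ \begin{array}{ll} I_{\theta_1,\theta_1} & I_{\theta_1,\theta_2} \\ I_{\theta_2,\theta_1} & I_{\theta_2,\theta_2} \end{array}\right]^{-1} \; \left[ \begin{array}{l} D(\theta_1) \\ D(\theta_2) \end{array}\right].$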

(17.15)
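A pairwise update of this kind might look as follows. The sketch assumes that the update uses the inverse of the $2 \times 2$ Fisher information sub-matrix for the chosen pair, and it uses an illustrative model that is not from the text: a linear model ${\bf x} = {\bf H}\mbox{\boldmath$\theta$} + {\bf w}$ with white Gaussian noise of known variance. The names `update_subset`, `make_problem` and `fit_by_pairs` are hypothetical.

```python
import numpy as np


def score(H, x, theta, noise_var):
    """Gradient of the log PDF for x = H @ theta + white Gaussian noise."""
    return H.T @ (x - H @ theta) / noise_var


def fisher_information(H, noise_var):
    """Fisher information matrix of the linear Gaussian model."""
    return H.T @ H / noise_var


def update_subset(H, x, theta, idx, noise_var):
    """Update only the parameters in idx, leaving the others unchanged."""
    idx = list(idx)
    delta = score(H, x, theta, noise_var)[idx]
    sub_fisher = fisher_information(H, noise_var)[np.ix_(idx, idx)]
    new_theta = theta.copy()
    new_theta[idx] += np.linalg.solve(sub_fisher, delta)
    return new_theta


def make_problem(seed=0, n=50):
    """Synthetic data for the sketch (made-up parameter values)."""
    rng = np.random.default_rng(seed)
    H = rng.normal(size=(n, 3))
    true_theta = np.array([1.0, -2.0, 0.5])
    x = H @ true_theta + 0.1 * rng.normal(size=n)
    return H, x


def fit_by_pairs(H, x, sweeps=50, noise_var=0.01):
    """Cycle through parameter pairs, updating one pair at a time."""
    theta = np.zeros(H.shape[1])
    for _ in range(sweeps):
        for pair in [(0, 1), (1, 2), (0, 2)]:
            theta = update_subset(H, x, theta, pair, noise_var)
    return theta


if __name__ == "__main__":
    H, x = make_problem()
    print(fit_by_pairs(H, x))
    print(np.linalg.lstsq(H, x, rcond=None)[0])   # joint solution for comparison
```

For this model, cycling through the pairs converges to the same estimate as the joint least-squares solution. Each individual update changes only the two selected parameters.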