
Density estimation

The general case with adaptive means for Gaussian prior factors and hyperparameter energy $E_\theta$ yields an error functional

\begin{displaymath}
E_{\theta,\phi} =
-(\ln P(\phi),\,N)
+\frac{1}{2} \Big(\phi-t(\theta),\,{\bf K}\,(\phi-t(\theta))\Big)
+ (P(\phi),\, \Lambda_X )+ E_\theta
.
\end{displaymath} (432)

Defining
\begin{displaymath}
{\bf t}^\prime (l;x,y)
= \frac{\partial t(x,y;\theta)}{\partial \theta_l}
,
\end{displaymath} (433)

the stationarity equations of (432) obtained from the functional derivatives with respect to $\phi $ and hyperparameters $\theta$ become
\begin{displaymath}
{\bf K}(\phi-t)
= {\bf P}^\prime(\phi) {\bf P}^{-1}(\phi) N
- {\bf P}^\prime (\phi) \Lambda_X
,
\end{displaymath} (434)
\begin{displaymath}
{\bf t}^\prime {\bf K} (\phi-t)
= E_{\theta}^\prime
.
\end{displaymath} (435)
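
As a rough numerical illustration, the two stationarity equations can be checked against finite differences of a discretized version of (432). The sketch below rests on several assumptions that are not part of the text: a one-dimensional grid in $y$ with a single $x$, the parameterization $P(\phi)=e^{\phi}$ (so that ${\bf P}^\prime {\bf P}^{-1} N = N$), a scalar $\Lambda_X$ held fixed at the total count, a two-parameter parabolic template, a Gaussian $E_\theta$, and a discrete Laplacian plus mass term for ${\bf K}$. All function and variable names are hypothetical.

```python
import numpy as np

def template(y, theta):
    """Hypothetical template t(theta) for phi = ln P: offset theta[1] plus a parabola centred at theta[0]."""
    return theta[1] - 0.5 * (y - theta[0]) ** 2

def template_grad(y, theta):
    """t'(l; y) = dt/dtheta_l as in Eq. (433); rows are indexed by l."""
    return np.array([y - theta[0], np.ones_like(y)])

def energy(phi, theta, y, N, K, lam, theta0, s):
    """Discretized Eq. (432), assuming P(phi) = exp(phi) and a scalar Lambda_X = lam."""
    d = phi - template(y, theta)
    E_theta = 0.5 * np.sum((theta - theta0) ** 2) / s**2  # assumed Gaussian hyperprior energy
    return -(phi @ N) + 0.5 * d @ K @ d + lam * np.exp(phi).sum() + E_theta

def grad_phi(phi, theta, y, N, K, lam):
    """Residual of Eq. (434): K(phi - t) - P' P^{-1} N + P' Lambda_X, with P' = diag(P)."""
    return K @ (phi - template(y, theta)) - N + lam * np.exp(phi)

def grad_theta(phi, theta, y, K, theta0, s):
    """Residual of Eq. (435): -t' K (phi - t) + E_theta'."""
    Kd = K @ (phi - template(y, theta))
    return -template_grad(y, theta) @ Kd + (theta - theta0) / s**2

def numeric_grad(f, v, eps=1e-6):
    """Central finite differences of a scalar function f at the point v."""
    g = np.zeros_like(v)
    for i in range(v.size):
        e = np.zeros_like(v)
        e[i] = eps
        g[i] = (f(v + e) - f(v - e)) / (2 * eps)
    return g

rng = np.random.default_rng(0)
n = 30
y = np.linspace(-3.0, 3.0, n)
N = rng.poisson(2.0, size=n).astype(float)   # synthetic counts per grid cell
lam = N.sum()                                # Lambda_X fixed for simplicity
# Symmetric K: discrete negative Laplacian plus a small mass term (an assumption)
K = 0.5 * (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) + 0.1 * np.eye(n)
theta0, s = np.zeros(2), 1.0
theta = np.array([0.3, -2.0])
phi = template(y, theta) + 0.1 * rng.normal(size=n)

num_phi = numeric_grad(lambda p: energy(p, theta, y, N, K, lam, theta0, s), phi)
num_theta = numeric_grad(lambda th: energy(phi, th, y, N, K, lam, theta0, s), theta)
err_phi = np.max(np.abs(num_phi - grad_phi(phi, theta, y, N, K, lam)))
err_theta = np.max(np.abs(num_theta - grad_theta(phi, theta, y, K, theta0, s)))
```

Small values of `err_phi` and `err_theta` indicate that the analytic residuals agree with the numerical gradient of the discretized functional. In a full treatment $\Lambda_X$ would be determined by the normalization condition rather than held fixed.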

Inserting Eq. (434) in Eq. (435) gives
\begin{displaymath}
{\bf t}^\prime{\bf P}^\prime (\phi) {\bf P}^{-1}(\phi) N
= {\bf t}^\prime{\bf P}^\prime(\phi) \Lambda_X
+E_{\theta}^\prime
.
\end{displaymath} (436)
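
The substitution can be spelled out as a short worked step: multiplying Eq. (434) from the left by ${\bf t}^\prime$ and using Eq. (435) for the resulting left-hand side gives
\begin{displaymath}
E_{\theta}^\prime
= {\bf t}^\prime {\bf K} (\phi-t)
= {\bf t}^\prime \Big( {\bf P}^\prime(\phi) {\bf P}^{-1}(\phi) N - {\bf P}^\prime(\phi) \Lambda_X \Big)
,
\end{displaymath}
and moving the $\Lambda_X$ term to the side of $E_\theta^\prime$ gives Eq. (436).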

Eq. (436) becomes equivalent to the parametric stationarity equation (356) with vanishing prior term in the deterministic limit of vanishing prior covariances ${\bf K}^{-1}$, i.e., under the assumption $\phi=t(\theta)$, and for vanishing $E_\theta^\prime$. Furthermore, a non-vanishing prior term in (356) can be identified with the term $E_\theta$. This shows that parametric methods can be considered as deterministic limits of (prior mean) hyperparameter approaches. In particular, a parametric solution can serve as a reference template $t$ within a specific prior factor. Similarly, such a parametric solution is a natural initial guess for a nonparametric $\phi$ when solving a stationarity equation by iteration.
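
The use of a parametric solution as initial guess can be sketched numerically. The example below is a minimal illustration under assumptions not made in the text: a one-dimensional grid with a single $x$, $P(\phi)=e^{\phi}$, a scalar $\Lambda_X$ fixed at the total count, a parabolic template with location and offset hyperparameters, a weak Gaussian $E_\theta$, and plain gradient descent with step halving as the iteration. The count-weighted mean serves only as a simple stand-in for solving the parametric stationarity equation. All names are hypothetical.

```python
import numpy as np

def t(y, theta):
    # Hypothetical template for phi = ln P: offset theta[1], parabola centred at theta[0]
    return theta[1] - 0.5 * (y - theta[0]) ** 2

def energy_and_grads(phi, theta, y, N, K, lam, theta0, s):
    # Discretized error functional and its gradients, assuming P(phi) = exp(phi)
    d = phi - t(y, theta)
    Kd = K @ d
    P = np.exp(phi)
    E = -(phi @ N) + 0.5 * d @ Kd + lam * P.sum() + 0.5 * np.sum((theta - theta0) ** 2) / s**2
    g_phi = Kd - N + lam * P                             # residual of the phi equation
    t_prime = np.array([y - theta[0], np.ones_like(y)])  # dt/dtheta_l
    g_theta = -t_prime @ Kd + (theta - theta0) / s**2    # residual of the theta equation
    return E, g_phi, g_theta

n = 30
y = np.linspace(-3.0, 3.0, n)
rng = np.random.default_rng(1)
p_true = np.exp(-0.5 * ((y - 0.8) / 0.7) ** 2)
p_true /= p_true.sum()
N = rng.multinomial(200, p_true).astype(float)   # synthetic counts per grid cell
lam = N.sum()                                    # Lambda_X fixed for simplicity
K = 0.5 * (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) + 0.1 * np.eye(n)
theta0, s = np.zeros(2), 10.0                    # weak hyperprior (assumption)

# Stand-in for the parametric solution: count-weighted mean, offset normalizing exp(t)
m_hat = (N @ y) / N.sum()
c_hat = -np.log(np.exp(-0.5 * (y - m_hat) ** 2).sum())
theta = np.array([m_hat, c_hat])
phi = t(y, theta)                                # initial guess phi = t(theta)

E, g_phi, g_theta = energy_and_grads(phi, theta, y, N, K, lam, theta0, s)
E_start = E
step = 0.1
for _ in range(500):
    while True:
        # Halve the step until the energy does not increase
        phi_new = phi - step * g_phi
        theta_new = theta - step * g_theta
        E_new, g_phi_new, g_theta_new = energy_and_grads(
            phi_new, theta_new, y, N, K, lam, theta0, s)
        if E_new <= E:
            break
        step *= 0.5
    phi, theta, E, g_phi, g_theta = phi_new, theta_new, E_new, g_phi_new, g_theta_new
```

Comparing `E_start` with the final `E` indicates how far the nonparametric $\phi$ moves away from the parametric template once the data term and the prior are balanced. A faster scheme could replace the simple descent loop without changing the role of the initial guess.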

When working with a parameterized $\phi(\xi)$, extra prior terms that are Gaussian in some function $\psi(\xi)$ can be included, as discussed in Section 4.2. Then, analogously to the templates $t$ for $\phi$, parameter templates $t_\psi$ can also be made adaptive with hyperparameters $\theta_\psi$. Furthermore, prior terms $E_\theta$ and $E_{\theta_\psi}$ for the hyperparameters $\theta$, $\theta_\psi$ can be added. Including such additional error terms yields

\begin{displaymath}
\begin{array}{rcl}
E_{\theta,\theta_\psi,\phi(\xi),\psi(\xi)}
&=& \displaystyle -(\ln P(\,\phi(\xi)\,),\,N) + (P(\,\phi(\xi)\,),\, \Lambda_X ) \\
& & \displaystyle +\frac{1}{2} \Big(\phi(\xi)-t(\theta) ,\,{\bf K}\,(\phi(\xi)-t(\theta))\Big) \\
& & \displaystyle +\frac{1}{2} \Big(\psi(\xi)-t_\psi(\theta_\psi),\,{\bf K}_\psi\,(\psi(\xi)-t_\psi(\theta_\psi))\Big) \\
& & \displaystyle +E_{\theta} + E_{\theta_\psi}
,
\end{array}
\end{displaymath} (437)

and Eqs. (434) and (435) change to
\begin{displaymath}
\Phi^\prime {\bf K}(\phi-t)
+\Psi^\prime {\bf K}_\psi (\psi-t_\psi)
= {\bf P}_\xi^\prime {\bf P}^{-1}N
- {\bf P}_\xi^\prime \Lambda_X
,
\end{displaymath} (438)
\begin{displaymath}
{\bf t}^\prime {\bf K} (\phi-t)
= E_{\theta}^\prime
,
\end{displaymath} (439)
\begin{displaymath}
{\bf t}_\psi^\prime {\bf K}_\psi (\psi-t_\psi)
= E_{\theta_\psi}^\prime
,
\end{displaymath} (440)

where ${\bf t}_\psi^\prime$ and $E_{\theta_\psi}^\prime$ denote derivatives with respect to the hyperparameters $\theta_\psi$, and $E_{\theta}^\prime$ denotes the derivative with respect to $\theta$. By parameterizing $E_{\theta}$ and $E_{\theta_\psi}$ in turn, the process of introducing hyperparameters can be iterated.


Joerg_Lemm 2001-01-21