

The Hessians ${\bf H}_L$, ${\bf H}_g$

The Hessian ${\bf H}_L$ of $-E_L$ is defined as the matrix or operator of second derivatives

\begin{displaymath}
{\bf H}_L (L) (x,y; x^\prime, y^\prime)
=
\frac{\delta^2 (-E_L)}
{\delta L (x,y)\delta L(x^\prime ,y^\prime )}\Bigg\vert _{L}.
\end{displaymath} (154)

For the functional (109) and fixed $\Lambda_X$, the Hessian is found by differentiating the gradient (127) once more with respect to $L$. This gives
\begin{displaymath}
{\bf H}_L(L) (x,y; x^\prime, y^\prime )
=
-{{\bf K}}(x,y;x^\prime,y^\prime )
- \delta (x-x^\prime )\,\delta (y-y^\prime )\, \Lambda_X (x)\, e^{L(x,y)}
\end{displaymath} (155)

or
\begin{displaymath}
{\bf H}_L
=
-{{\bf K}}
- {\bf\Lambda}_X {\bf e^L}.
\end{displaymath} (156)
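The structure of (156) can be illustrated in a finite-dimensional discretization. The following sketch (all concrete choices -- a single $x$, $n$ points in $y$, a periodic second-difference operator for ${\bf K}$, a uniform $e^L$, and a positive $\Lambda_X$ -- are assumptions made for this example, not taken from the text) shows how the diagonal term shifts a zero mode of $-{\bf K}$ to a strictly negative eigenvalue:

```python
import numpy as np

# Finite-dimensional illustration of Eq. (156), H_L = -K - Lambda_X e^L,
# for a single x with y discretized on n points.  The concrete choices
# (n, the periodic second-difference K, the uniform e^L, Lambda_X = 0.5)
# are assumptions for this sketch only.
n = 8
I = np.eye(n)
K = 2.0 * I - np.roll(I, 1, axis=0) - np.roll(I, -1, axis=0)  # periodic BCs

evals_K = np.linalg.eigvalsh(K)
assert abs(evals_K[0]) < 1e-12   # K has a zero mode (the constant vector)

L = np.log(np.full(n, 1.0 / n))  # normalized: sum over y of e^L equals 1
lambda_x = 0.5                   # assumed positive in this example
H_L = -K - lambda_x * np.diag(np.exp(L))

# The diagonal addition -Lambda_X e^L shifts the zero mode of -K away
# from zero, making H_L strictly negative definite here:
print(np.linalg.eigvalsh(H_L).max())  # approximately -lambda_x/n = -0.0625
```

With $\Lambda_X(x)<0$ the same addition would instead push eigenvalues upward, which is why definiteness is not automatic.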

The addition of the diagonal matrix ${\bf\Lambda}_X {\bf e^L}$ = ${\bf e^L} {\bf\Lambda}_X$ can result in a negative definite ${\bf H}_L$ even if ${{\bf K}}$ has zero modes, as is the case, for example, for a differential operator ${{\bf K}}$ with periodic boundary conditions. Note, however, that ${\bf\Lambda}_X {\bf e^L}$, while diagonal and therefore symmetric, is not necessarily positive definite, because $\Lambda_X (x)$ can be negative for some $x$. Depending on the sign of $\Lambda_X (x)$, the normalization condition $Z_X(x)=1$ for that $x$ can be replaced by the inequality $Z_X(x)\le 1$ or $Z_X(x)\ge 1$. Including the $L$-dependence of $\Lambda_X$ and with
\begin{displaymath}
\frac{\delta e^{L(x^\prime,y^\prime)} }{\delta g(x,y)}
= \delta (x-x^\prime )\, \delta (y-y^\prime )\, e^{L(x,y)}
-
\delta (x-x^\prime)\, e^{L(x,y)}\, e^{L(x^\prime,y^\prime )}
,
\end{displaymath} (157)

i.e.,
\begin{displaymath}
\frac{\delta e^{L} }{\delta g}
= \left( {\bf I} - {\bf e^L}\, {\bf I}_X \right) {\bf e^L}
= {\bf e^L} - {\bf e^L}\, {\bf I}_X\, {\bf e^L},
\end{displaymath} (158)

or
\begin{displaymath}
\frac{\delta P }{\delta g}
= {\bf P} - {\bf P}\, {\bf I}_X {\bf P},
\end{displaymath} (159)

we find, written in terms of $L$,

\begin{displaymath}
{\bf H}_g (L)(x,y;x^\prime,y^\prime )
= \frac{\delta^2 (-E_g)}
{\delta g (x,y)\delta g(x^\prime ,y^\prime )}\Bigg\vert _{L}
\end{displaymath}


\begin{displaymath}
= \!\! \int \!\! dx^{\prime\prime} dy^{\prime\prime}
\left(
\int \! dx^{\prime\prime\prime} dy^{\prime\prime\prime}\,
\frac{\delta L(x^{\prime\prime\prime},y^{\prime\prime\prime})}{\delta g (x,y)}\,
\frac{\delta^2 (-E_L)}
{\delta L(x^{\prime\prime\prime},y^{\prime\prime\prime})\,
\delta L(x^{\prime\prime},y^{\prime\prime})}\,
\frac{\delta L(x^{\prime\prime},y^{\prime\prime})}{\delta g (x^\prime,y^\prime)}
+
\frac{\delta (-E_L)}{\delta L(x^{\prime\prime},y^{\prime\prime})}\,
\frac{\delta^2 L(x^{\prime\prime},y^{\prime\prime})}
{\delta g (x,y)\delta g (x^\prime,y^\prime) }
\right)\!\Bigg\vert _{L}
\end{displaymath}


  $\textstyle =$ $\displaystyle -{{\bf K}}(x,y;x^\prime,y^\prime )
- e^{L(x^\prime, y^\prime)}\, e^{L(x,y)}
\int \!dy^{\prime\prime} dy^{\prime\prime\prime}\,
{{\bf K}}( x^\prime,y^{\prime\prime} ; x,y^{\prime\prime\prime} )$  
    $\displaystyle + e^{L(x^\prime, y^\prime)}
\int \!dy^{\prime\prime}\, {{\bf K}}(x,y;x^\prime,y^{\prime\prime})
+ e^{L(x,y)}
\int \!dy^{\prime\prime}\, {{\bf K}}(x^\prime,y^\prime;x,y^{\prime\prime})$  
    $\displaystyle -\delta (x-x^\prime) \delta (y-y^\prime) e^{L(x,y)}
\left( N_X(x ) -\int\!dy^{\prime \prime} ({{\bf K}} L)(x,y^{\prime\prime}) \right)$  
    $\displaystyle +
\delta(x- x^\prime)
e^{L(x,y)} e^{L(x^\prime,y^\prime)}
\left( N_X(x)-\int \!dy^{\prime \prime} ({{\bf K}} L)(x,y^{\prime\prime})\right).$ (160)
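The component result (160) and its matrix form can be sanity-checked numerically. The sketch below (a hypothetical discretization: a single $x$, $n$ points in $y$, random symmetric ${\bf K}$ and data vector $N$; none of these specifics come from the text) compares the closed-form Hessian with central finite differences of $-E_g$, where $L(g) = g - \ln \sum_y e^{g}$ and $-E_g = (N,L) - \frac{1}{2}(L, {\bf K} L)$:

```python
import numpy as np

# Numerical check of Eqs. (160)/(161): single x, n points in y,
# L(g) = g - ln(sum_y e^g), -E_g = (N, L) - (1/2)(L, K L).
# All concrete choices (n, K, N, g) are illustrative assumptions.
rng = np.random.default_rng(0)
n = 5
A = rng.normal(size=(n, n))
K = A @ A.T                      # symmetric kernel, K = K^T
N = rng.uniform(1.0, 2.0, n)     # data vector N(x,y) at the single x
g = rng.normal(size=n)

def neg_E(g):
    L = g - np.log(np.exp(g).sum())
    return N @ L - 0.5 * L @ (K @ L)

# Closed-form Hessian: -(I - P I_X)[K(I - I_X P) + Lambda_X P], where for
# a single x the operator I_X is the all-ones matrix and P = diag(e^L):
L = g - np.log(np.exp(g).sum())
p = np.exp(L)                                   # sums to one
I, one = np.eye(n), np.ones(n)
lam = N.sum() - (K @ L).sum()                   # Lambda_X = N_X - sum_y (K L)
H_g = -(I - np.outer(p, one)) @ (K @ (I - np.outer(one, p)) + lam * np.diag(p))

# Independent reference: Hessian of -E_g by central finite differences.
h = 1e-4
H_num = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        d = np.zeros(n); d[i] += h
        e = np.zeros(n); e[j] += h
        H_num[i, j] = (neg_E(g + d + e) - neg_E(g + d - e)
                       - neg_E(g - d + e) + neg_E(g - d - e)) / (4 * h * h)

print(np.abs(H_g - H_num).max())  # small: formula matches the numerics
```

The finite-difference Hessian is also symmetric, matching the symmetry argument given below for the operator expression.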

The last term, diagonal in $X$, has dyadic structure in $Y$ and therefore, for fixed $x$, at most one non-zero eigenvalue. In matrix notation the Hessian becomes
$\displaystyle {\bf H}_g$ $\textstyle =$ $\displaystyle - \left( {\bf I} - {\bf e^L} {\bf I}_X \right)
{{\bf K}} \left( {\bf I} - {\bf I}_X {\bf e^L} \right)
- \left( {\bf I} - {\bf e^L} {\bf I}_X \right)
{\bf\Lambda}_X {\bf e^L}$  
  $\textstyle =$ $\displaystyle - \left( {\bf I} - {\bf P} {\bf I}_X \right)
\left[
{{\bf K}} \left( {\bf I} - {\bf I}_X {\bf P} \right)
+ {\bf\Lambda}_X {\bf P}
\right]
,$ (161)

the second line written in terms of the probability matrix. The expression is symmetric under $x\leftrightarrow x^\prime$, $y\leftrightarrow y^\prime$, as it must be for a Hessian, and as can be verified using the symmetry of ${{\bf K}} = {{\bf K}}^T$ and the fact that ${\bf\Lambda}_X$ and ${\bf I}_X$ commute, i.e., $[{\bf\Lambda}_X , {\bf I}_X] = 0$.

Because the functional $E_g$ is invariant under the shift transformation $g(x,y) \rightarrow g(x,y) + c(x)$, the Hessian has a space of zero modes with the dimension of $X$. Indeed, any $y$-independent function (which can have finite $L^1$-norm only in finite $Y$-spaces) is a left eigenvector of $\left( {\bf I} - {\bf e^L} {\bf I}_X \right)$ with eigenvalue zero. Thus, where necessary, the pseudoinverse of ${\bf H}_g$ has to be used instead of the inverse. Alternatively, additional constraints that remove the zero modes can be imposed on $g$, for example boundary conditions.
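The zero-mode statement can also be checked in a small discretization. In this sketch (sizes, ${\bf K}$, $g$, and the value of $\Lambda_X$ are all illustrative assumptions for a single $x$), the constant vector plays the role of a $y$-independent function, annihilates $\left({\bf I} - {\bf e^L}{\bf I}_X\right)$ from the left, and the pseudoinverse stands in for the singular inverse:

```python
import numpy as np

# Zero modes of the Hessian (161) for a single x: the constant
# ("y-independent") vector is a left null vector of (I - e^L I_X),
# so H_g is singular and only a pseudoinverse exists.
# Sizes and matrices below are illustrative assumptions.
rng = np.random.default_rng(1)
n = 6
A = rng.normal(size=(n, n))
K = A @ A.T                         # symmetric kernel
g = rng.normal(size=n)
L = g - np.log(np.exp(g).sum())
p = np.exp(L)                       # sum over y of e^L equals 1
I, one = np.eye(n), np.ones(n)
lam = 0.7                           # some value of Lambda_X(x)
H_g = -(I - np.outer(p, one)) @ (K @ (I - np.outer(one, p)) + lam * np.diag(p))

# one^T (I - e^L I_X) = 0 because sum_y e^L = 1, hence one^T H_g = 0:
print(np.abs(one @ (I - np.outer(p, one))).max())   # ~ 0
print(np.abs(one @ H_g).max())                      # ~ 0: a zero mode per x

# Where an inverse is needed, the Moore-Penrose pseudoinverse can be used:
H_pinv = np.linalg.pinv(H_g)
print(np.allclose(H_g @ H_pinv @ H_g, H_g))         # True
```

Since this discretized $H_g$ is symmetric, the constant vector is a right null vector as well, which is why a plain `solve` would fail here while `pinv` restricts to the non-singular subspace.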


Joerg_Lemm 2001-01-21