
Combination of concepts

Classical regularization functionals consist of a sum of quadratic concepts. In a probabilistic interpretation this corresponds to a combination by AND. Typically, for example, $h$ has to approximate a training data term AND a prior term. The sum of quadratic concepts, however, is again a quadratic concept; analogously, a product of Gaussians is Gaussian. Straightforward calculation shows that a sum of squared distances $d_i^2$ with concept operators $K_i$ can be written
$E(h)=\sum_i^N E_i(h) =\sum_i^N d_{i}^2(h)/2 = d^2(h)/2+V$,
with squared distance $d^2(h) = \langle h-\overline{t}\,\vert\, K \,\vert\, h-\overline{t}\rangle$, template average $\overline{t} = K^{-1} \tilde t$, $\tilde t = \sum_{i}^N K_i t_i$, $K = \sum_{i}^N K_i$, and $h$-independent minimal component energy
$V = \left(\sum_{i}^N \langle t_{i}\,\vert\, K_i \,\vert\, t_{i}\rangle - \langle \overline{t}\,\vert\, K \,\vert\, \overline{t}\rangle \right)/2$,
which has the structure of a variance up to a factor $2N$. The linear stationarity equation for a functional $E= d^2(h)/2 +V$ reads $0 = K(\overline{t} - h) = \tilde t-Kh$. For positive definite, i.e., invertible $K$, this has the solution $h=\overline{t}=K^{-1}\tilde t$, which can be obtained in a space of dimension less than or equal to $n$ [7,3]. Let us now present some types of non-convex error functionals.
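First, though, the convex combination above can be checked numerically. The following is a minimal NumPy sketch (Python is our choice here; the random operators, templates, and dimensions are purely illustrative, not from the text), verifying that the sum of component energies equals $d^2(h)/2 + V$ with $\overline{t} = K^{-1}\tilde t$:

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
n, N = 5, 3  # dimension of h and number of concepts (illustrative)

# Positive definite concept operators K_i and templates t_i
A = rng.standard_normal((N, n, n))
Ks = np.einsum('ijk,ilk->ijl', A, A) + np.eye(n)  # K_i = A_i A_i^T + 1
ts = rng.standard_normal((N, n))

K = Ks.sum(axis=0)                        # K = sum_i K_i
t_tilde = np.einsum('ijk,ik->j', Ks, ts)  # t~ = sum_i K_i t_i
t_bar = np.linalg.solve(K, t_tilde)       # t_bar = K^{-1} t~

def E(h):  # sum of component energies E_i(h) = <h-t_i|K_i|h-t_i>/2
    return sum(0.5 * (h - t) @ Ki @ (h - t) for Ki, t in zip(Ks, ts))

V = E(t_bar)                        # h-independent minimal component energy
h = rng.standard_normal(n)
d2 = (h - t_bar) @ K @ (h - t_bar)  # combined squared distance d^2(h)
assert np.isclose(E(h), d2 / 2 + V)  # E(h) = d^2(h)/2 + V
\end{verbatim}

Since $V = E(\overline{t})$, the minimal energy falls out of the same computation.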


Example: Consider an image reconstruction task where we expect the image of a face. We may thus choose concepts with partial template functions for eyes, nose and mouth, and require the reconstructed image to approximate the given pixel data AND the eye, nose and mouth templates. Typically, however, the constituents of a face appear in many different variations: eyes may be open OR closed, blue OR brown, but also translated, scaled, or otherwise deformed. Such OR-like combinations of alternative concepts are examples of non-convex prior knowledge.

In a probabilistic interpretation where alternative concepts represent disjoint events indexed by $i$, this yields the mixture model

\begin{displaymath}
E_M(h)
=-\ln \sum_i^{N} p(D,i\vert h) - c
=-\ln \left( \sum_i^{N} p(i)\, p(D\vert h,i) \right) + {\rm const.},
\end{displaymath} (2)

with component energies $E_i(h) = d^2_{i}(h)/2 = \langle h-t_{i}\,\vert\, K_{i}\,\vert\, h-t_{i}\rangle /2$, an arbitrary constant $c$, and $p(D,i\vert h) = p(i)\, e^{-\beta (E_i(h)- F_i)}$. If $i$-dependent, the normalization integrals $\beta F_i = - \ln \left( \int \!dt_{i}\, e^{-\beta E_{i}} \right)$ have to be calculated so that they do not interfere with the mixture probabilities $p(i)$. In that case the model has the structure of a disordered (``spin-glass-like'') system. The model has the stationarity condition $0 = t_M(h) - K_M(h)\, h$, with $K_M = \sum_i p(D,i\vert h)\, K_i$ and $t_M = \sum_i p(D,i\vert h)\, K_i t_i$. The parameter $\beta$ is known as the inverse temperature and interpolates between a convex AND at high temperature and a non-convex OR at low temperature.
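Since the weights $p(D,i\vert h)$ depend on $h$, the stationarity condition is nonlinear. One natural way to solve it (our choice for illustration, not prescribed above) is the fixed-point iteration $h \leftarrow K_M(h)^{-1} t_M(h)$. A minimal sketch, assuming shared operators $K_i$ (so that the $F_i$ are $i$-independent and drop out), equal $p(i)$, and illustrative templates:

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(1)
n, N, beta = 5, 3, 4.0
Ks = np.array([np.eye(n)] * N)          # shared K_i: F_i drop out
ts = 3.0 * rng.standard_normal((N, n))  # well separated templates
p = np.full(N, 1.0 / N)                 # mixture probabilities p(i)

def energies(h):  # component energies E_i(h) = <h-t_i|K_i|h-t_i>/2
    return np.array([0.5 * (h - t) @ Ki @ (h - t)
                     for Ki, t in zip(Ks, ts)])

def step(h):
    w = p * np.exp(-beta * energies(h))        # p(D,i|h) up to a constant
    K_M = np.einsum('i,ijk->jk', w, Ks)        # K_M = sum_i p(D,i|h) K_i
    t_M = np.einsum('i,ijk,ik->j', w, Ks, ts)  # t_M = sum_i p(D,i|h) K_i t_i
    return np.linalg.solve(K_M, t_M)           # h <- K_M(h)^{-1} t_M(h)

h = ts.mean(axis=0)
for _ in range(100):
    h = step(h)
print(h - ts)  # at this beta, h settles near one template (OR-like)
\end{verbatim}

Lowering $\beta$ makes the weights nearly uniform and the iteration returns the convex AND solution $\overline{t}$; raising it sharpens the selection of a single component.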

Products are another possibility to implement OR-like structures, leading to technically convenient polynomial models

\begin{displaymath}
E_{LG}(h) = \frac{1}{2}
\prod_{i=1}^{N}
d_{i}^2 (h).
\end{displaymath} (3)

The stationarity equation is $K_{LG}(h)\, h = t_{LG}(h)$, where $K_{LG}(h) = \sum_i M_{i}(h)\, K_{i}$ and $t_{LG}(h) = \sum_i M_{i}(h)\, K_{i} t_{i}$, with $M_{i}(h) = \prod_{k\ne i} d_{k}^2(h)$. The model resembles the Landau-Ginzburg treatment of phase transitions in statistical mechanics. Numerical studies of solvable mixture and polynomial models have been performed and will be reported elsewhere.
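The factors $M_i(h)$ make this stationarity equation nonlinear as well, and it can be iterated self-consistently in the same manner. A sketch under the same illustrative finite-dimensional assumptions as before (the fixed-point update $h \leftarrow K_{LG}(h)^{-1} t_{LG}(h)$ is again our choice, not prescribed by the text):

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(2)
n, N = 5, 3
Ks = np.array([np.eye(n)] * N)          # illustrative operators K_i
ts = 3.0 * rng.standard_normal((N, n))  # illustrative templates t_i

def d2s(h):  # squared distances d_i^2(h) = <h-t_i|K_i|h-t_i>
    return np.array([(h - t) @ Ki @ (h - t) for Ki, t in zip(Ks, ts)])

def step(h):
    d2 = d2s(h)
    # M_i(h) = prod_{k != i} d_k^2(h), as a leave-one-out product
    M = np.array([np.prod(np.delete(d2, i)) for i in range(N)])
    K_LG = np.einsum('i,ijk->jk', M, Ks)        # K_LG = sum_i M_i K_i
    t_LG = np.einsum('i,ijk,ik->j', M, Ks, ts)  # t_LG = sum_i M_i K_i t_i
    return np.linalg.solve(K_LG, t_LG)          # h <- K_LG(h)^{-1} t_LG(h)

h = rng.standard_normal(n)
for _ in range(50):
    h = step(h)
print(d2s(h).min())  # E_LG vanishes when any single d_i^2 does
\end{verbatim}

Since $E_{LG}$ vanishes whenever any single factor $d_i^2$ does, the iteration drives $h$ toward one of the templates $t_i$: an OR over the concepts.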

