A prior mean or template function $t$
represents a prototype, reference function, or baseline
for $\phi$.
It may be a typical expected pattern in time series prediction
or a reference image in image reconstruction.
Consider, for example, the task of completing an image
given some pixel values (training data)
[137].
Expecting the image to be that of a face, the
template function $t$
may be chosen
to be some prototypical image of a face.
We have seen in Section 3.5
that a single template $t$
could be eliminated
for Gaussian (specific) priors
by solving for $\phi - t$
instead of $\phi$.
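The elimination of a single template can be checked numerically. The following is a minimal sketch, not the construction of Section 3.5 itself: it assumes a function discretized on a grid, a quadratic data term at a few observed grid points, and a Gaussian prior energy of the form (1/2)(phi - t)^T K (phi - t). The smoothness matrix `K`, the grid size, and all function names are illustrative assumptions.

```python
import numpy as np

def smoothness_matrix(n, lam=10.0, eps=1e-3):
    """Hypothetical inverse covariance: squared first differences plus a small ridge."""
    D = np.diff(np.eye(n), axis=0)          # (n-1, n) first-difference operator
    return lam * D.T @ D + eps * np.eye(n)

def solve_with_template(K, idx, y, t):
    """Minimize 0.5*sum((phi[idx]-y)^2) + 0.5*(phi-t)^T K (phi-t) directly in phi."""
    n = K.shape[0]
    A = np.zeros((n, n))
    A[idx, idx] = 1.0                       # projector onto observed grid points
    rhs = K @ t
    rhs[idx] += y
    return np.linalg.solve(A + K, rhs)

def solve_shifted(K, idx, y, t):
    """Solve the template-free problem for psi = phi - t with shifted data y - t[idx]."""
    n = K.shape[0]
    A = np.zeros((n, n))
    A[idx, idx] = 1.0
    rhs = np.zeros(n)
    rhs[idx] = y - t[idx]
    psi = np.linalg.solve(A + K, rhs)
    return t + psi                          # shift back to recover phi

n = 50
K = smoothness_matrix(n)
idx = np.array([5, 20, 35])                 # observed grid points (must be distinct)
y = np.array([1.0, 0.5, -0.3])              # observed values
t = np.sin(np.linspace(0.0, np.pi, n))      # illustrative template

phi_direct = solve_with_template(K, idx, y, t)
phi_shifted = solve_shifted(K, idx, y, t)
```

Under these assumptions both routes give the same minimizer, so a fixed template only shifts the data and the solution.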
Restricting to only a single template, however, may be a very bad choice.
Indeed, faces, for example, appear in images in many variations:
at different scales, translated, rotated, under various illuminations,
and with other kinds of deformations.
We may now describe such variations
by a family of templates $t(\theta)$,
the parameter $\theta$
describing scaling, translations,
rotations, and more general deformations.
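As a toy illustration of such a template family, the sketch below builds one-dimensional templates from a single prototype by translation and scaling. The Gaussian bump used as prototype, the parameter grid, and the names `prototype`, `template`, and `family` are assumptions made for illustration only; the notation t(theta) for a template with deformation parameter theta is likewise assumed here.

```python
import numpy as np

def prototype(x):
    """Hypothetical prototype pattern: a unit-width bump centered at zero."""
    return np.exp(-0.5 * x**2)

def template(x, theta):
    """Template t(theta): the prototype translated by `shift` and scaled by `scale`."""
    shift, scale = theta
    return prototype((x - shift) / scale)

x = np.linspace(-5.0, 5.0, 101)

# A small discrete set of deformation parameters theta = (shift, scale).
thetas = [(s, w) for s in (-2.0, 0.0, 2.0) for w in (0.5, 1.0)]

# One row per template in the family.
family = np.stack([template(x, th) for th in thetas])
```

A continuous parameter would replace the discrete list `thetas`; the discrete grid is used here only to keep the example short.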
Thus,
we expect a function to be similar to
only one of the templates $t(\theta)$
and want to implement a (soft, probabilistic) OR,
approximating
$t(\theta_1)$ OR $t(\theta_2)$ OR $\cdots$
(see also [133,134,135,136]).
A (soft, probabilistic) AND of approximation conditions, on the other hand, is implemented by adding error terms. For example, classical error functionals where data and prior terms are added correspond to an approximation of training data AND a priori data.
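The difference between the two combinations can be made concrete with energies (negative log-probabilities). The sketch below assumes that each approximation condition contributes an energy E_i, that AND corresponds to a product of probabilities (a sum of energies), and that OR corresponds to an equally weighted mixture of probabilities (a negative log-sum-exp of energies). Equal mixture weights and equal normalization constants for all components are assumptions, as are the function names and the inverse temperature `beta`.

```python
import numpy as np

def gaussian_energy(phi, t, K):
    """Quadratic distance 0.5*(phi-t)^T K (phi-t) of phi to a template t."""
    d = phi - t
    return 0.5 * float(d @ K @ d)

def and_energy(energies):
    """Soft AND: probabilities multiply, so energies add."""
    return float(np.sum(energies))

def or_energy(energies, beta=1.0):
    """Soft OR: probabilities add, giving -(1/beta) * log(sum_i exp(-beta * E_i)).

    Computed with the minimum subtracted for numerical stability.
    """
    e = beta * np.asarray(energies, dtype=float)
    m = e.min()
    return float((m - np.log(np.sum(np.exp(-(e - m))))) / beta)

# phi matches the first template exactly and is far from the second.
K = np.eye(3)
phi = np.array([1.0, 0.0, 0.0])
t1 = np.array([1.0, 0.0, 0.0])
t2 = np.array([0.0, 5.0, 5.0])
energies = [gaussian_energy(phi, t1, K), gaussian_energy(phi, t2, K)]

e_and = and_energy(energies)   # penalizes the mismatch with t2
e_or = or_energy(energies)     # stays low because one template fits
```

For positive `beta` the OR energy never exceeds the smallest component energy, so matching a single template suffices, whereas the AND energy grows with every violated condition.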
Similar considerations apply for model selection.
We could, for example, expect $\phi$ to be well approximated by a
neural network or a decision tree.
In that case
$t(\theta)$ spans, for example, a space of neural networks
or decision trees.
Finally, let us emphasize again that
the great advantage and practical feasibility of adaptive templates
for regression problems
come from the fact that no additional normalization terms have to be
added to the error functional.