Next: Inverse quantum mechanics Up: Gaussian prior factors Previous: Support vector machines and   Contents

## Classification

In classification (or pattern recognition) tasks the independent visible variable takes discrete values (group, cluster, or pattern labels) [16,61,24,47]. We write $y = k$ and $p(y=k\,|\,x,h) = P_k(x)$, i.e., $P = (P_1, P_2, \cdots)$. Having received classification data $D = \{(x_i,k_i)\,|\,1\le i\le n\}$, the density estimation error functional for a prior on a function $\phi$ (with components $\phi_k$ and $P = P(\phi)$) reads

$$E_\phi = -\big(\ln P(\phi),\, N\big) + \frac{1}{2}\big(\phi - t,\; {\bf K}\,(\phi - t)\big) + \big(P(\phi),\, \Lambda_X\big).$$ (335)

In classification the scalar product corresponds to an integral over $x$ and a summation over $k$, e.g.,

$$\big(\phi,\, {\bf K}\,\phi\big) = \sum_{k,k'} \int \! dx \, dx' \; \phi_k(x)\, {\bf K}_{k,k'}(x,x')\, \phi_{k'}(x'),$$ (336)

and $N_k(x) = \sum_i \delta_{k,k_i}\, \delta(x - x_i)$.
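As a concrete illustration (my own sketch, not part of the original text), the scalar product above can be approximated on a discretized $x$-grid. The kernel below is a hypothetical squared-exponential smoothness prior, diagonal in the class index, chosen only for the example:

```python
import numpy as np

# Discretize x on a grid; phi has one component phi_k(x) per class label k.
n_classes, n_grid, dx = 3, 50, 0.1
rng = np.random.default_rng(0)
phi = rng.normal(size=(n_classes, n_grid))  # phi_k(x), sampled for illustration

# Hypothetical kernel K_{k,k'}(x,x'): diagonal in the class index,
# squared-exponential in x (a stand-in for a smoothness prior).
x = np.arange(n_grid) * dx
K_x = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)

# (phi, K phi) = sum_k  integral dx dx'  phi_k(x) K(x,x') phi_k(x')
scalar_product = sum(phi[k] @ K_x @ phi[k] * dx * dx for k in range(n_classes))
print(scalar_product >= 0)  # a positive semi-definite kernel gives a nonnegative value
```

Since the squared-exponential kernel matrix is positive semi-definite, the quadratic form is nonnegative, as a prior energy term should be.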

For zero-one loss $l(x,k,a) = -\delta_{k,a(x)}$ -- a typical loss function for classification problems -- the optimal decision (or Bayes classifier) is given by the mode of the predictive density (see Section 2.2.2), i.e.,

$$a(x) = {\rm argmax}_{k}\; p(k\,|\,x,D).$$ (337)

In saddle point approximation $p(k|x,D) \approx p(k|x,\phi^*)$, where the minimizing $\phi^*$ can be found by solving the stationarity equation (228).
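The Bayes classifier of Eq. (337) is simply an argmax over the (approximate) predictive class probabilities; a minimal sketch, with invented numbers standing in for $p(k|x,D)$:

```python
import numpy as np

# Hypothetical predictive probabilities p(k|x,D) for 4 test points and 3 classes
# (each row sums to one; the values are invented for illustration).
p_pred = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],
    [0.4, 0.4, 0.2],
])

# Under zero-one loss the optimal decision is the mode of the predictive density.
bayes_decision = p_pred.argmax(axis=1)
print(bayes_decision)  # → [0 2 1 0]
```

Note that `argmax` breaks ties by taking the first maximal index, as in the last row.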

For the choice $\phi_k = P_k$, both non-negativity and normalization must be ensured explicitly. For $\phi_k = L_k$ with $P_k = e^{L_k}$, non-negativity is automatically fulfilled, but the Lagrange multiplier must be included to ensure normalization.

Normalization is guaranteed by using unnormalized probabilities $\phi_k = z_k$ with $P_k = z_k / \sum_{k'} z_{k'}$ (for which non-negativity still has to be checked), or shifted log-likelihoods $\phi_k = g_k$ with $g_k = L_k + \ln \sum_{k'} e^{g_{k'}}$, i.e., $P_k = e^{g_k} / \sum_{k'} e^{g_{k'}}$. In that case the nonlocal normalization terms are part of the likelihood and no Lagrange multiplier has to be used [236]. The resulting equation can be solved in the space defined by the $x$-data (see Eq. (153)). The restriction of $g_k$ to linear functions yields log-linear models [154]. Recently a mean field theory for Gaussian process classification has been developed [177,179].
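A small numerical sketch (my own, not from the text) of why this parametrization removes the need for a Lagrange multiplier: probabilities built as normalized exponentials of unconstrained functions $g_k$ are non-negative and sum to one by construction.

```python
import numpy as np

def probs_from_g(g):
    """Map unconstrained values g_k to P_k = exp(g_k) / sum_k' exp(g_k'),
    which is non-negative and normalized by construction."""
    e = np.exp(g - g.max())   # shift by max(g) for numerical stability
    return e / e.sum()

g = np.array([2.0, -1.0, 0.5])   # arbitrary unconstrained values
P = probs_from_g(g)
print(bool(np.all(P >= 0)), bool(np.isclose(P.sum(), 1.0)))  # → True True
```

No constraint has to be enforced during optimization over $g$; the normalization is built into the likelihood itself, as the text states.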

Table 3 lists some special cases of density estimation. The last line of the table, referring to inverse quantum mechanics, will be discussed in the next section.

Table 3: Special cases of density estimation

| likelihood | problem type |
|---|---|
| of general form | density estimation |
| discrete | classification |
| Gaussian with fixed variance | regression |
| mixture of Gaussians | clustering |
| quantum mechanical likelihood | inverse quantum mechanics |

Joerg_Lemm 2001-01-21