Gaussian mixture regression (cluster regression)

Generalizing Gaussian regression, the likelihoods may be modeled by a mixture of $m$ Gaussians

\begin{displaymath}
p(y\vert x,{h})
=
\frac{\sum_k^m p(k)\, e^{-\frac{\beta}{2} (y-h_k(x))^2}}
{\int \!dy\,\sum_k^m p(k)\, e^{-\frac{\beta}{2} (y-h_k(x))^2}}
,
\end{displaymath} (331)

where the normalization factor is found as $\sum_k p(k) \left(\frac{\beta}{2\pi}\right)^{\frac{m}{2}}$. Hence, $h$ is here specified by the mixing coefficients $p(k)$ and a vector of regression functions $h_k(x)$, which give the $x$-dependent location of the $k$th cluster centroid of the mixture model. A simple prior for $h_k(x)$ is a smoothness prior diagonal in the cluster components. Since any density $p(y\vert x,h)$ can be approximated arbitrarily well by a mixture with large enough $m$, such cluster regression models allow one to interpolate between Gaussian regression and more flexible density estimation.
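For concreteness, a minimal numerical sketch of how the mixture likelihood (331) can be evaluated is given below (in Python, assuming scalar $y$). The integration grid, the example component functions $h_k$, and all variable names are assumptions of this illustration, not part of the model specification.

\begin{verbatim}
import numpy as np

# Sketch of Eq. (331) for scalar y: a mixture of m Gaussians whose centroids
# h_k(x) depend on x.  beta, p_k and the example h_k are assumed values.
beta = 4.0
p_k = np.array([0.5, 0.3, 0.2])               # mixing coefficients p(k)
h_k = [lambda x: np.sin(x),                   # hypothetical h_1(x)
       lambda x: np.sin(x) + 1.0,             # hypothetical h_2(x)
       lambda x: 0.5 * x]                     # hypothetical h_3(x)

def numerator(y, x):
    # sum_k p(k) exp(-beta/2 (y - h_k(x))^2)
    return sum(pk * np.exp(-0.5 * beta * (y - hk(x)) ** 2)
               for pk, hk in zip(p_k, h_k))

def p_y_given_x(y, x):
    # Normalize by the y-integral of the numerator, here done numerically.
    y_grid = np.linspace(-10.0, 10.0, 2001)
    norm = np.sum(numerator(y_grid, x)) * (y_grid[1] - y_grid[0])
    return numerator(y, x) / norm

print(p_y_given_x(0.8, x=1.0))
\end{verbatim}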

For independent data the posterior density becomes

\begin{displaymath}
p(h\vert D,D_0)
=
\frac{p(h\vert D_0)}{p(y_D\vert x_D,D_0)}
\prod_{i=1}^{n}
\frac{\sum_k^m p(k)\, e^{-\frac{\beta}{2} (y_i-h_k(x_i))^2}}
{\sum_k^m p(k)\, \left(\frac{\beta}{2\pi}\right)^{\frac{m}{2}}}
.
\end{displaymath} (332)
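To make the maximization discussed next concrete, the following sketch writes down the negative logarithm of (332), up to $h$-independent constants (the normalization factors and $p(y_D\vert x_D,D_0)$), with the components $h_k$ discretized on a grid and a squared-first-derivative smoothness term acting on each component separately, i.e. diagonal in the cluster components as mentioned above. The grid, the data, and the prior weight lam are assumptions of this illustration.

\begin{verbatim}
import numpy as np

# Negative log of Eq. (332), dropping h-independent constants.  Each row of H
# holds one component h_k on the grid x_grid; the prior is diagonal in the
# cluster components (an independent smoothness term per h_k).
x_grid = np.linspace(0.0, 1.0, 51)
dx = x_grid[1] - x_grid[0]
beta, lam = 4.0, 0.1                      # assumed inverse variance and prior weight
p_k = np.array([0.5, 0.5])
x_D = np.array([0.1, 0.4, 0.9])           # assumed training inputs
y_D = np.array([0.0, 1.2, 0.3])           # assumed training outputs

def neg_log_posterior(H):
    idx = np.abs(x_grid[:, None] - x_D).argmin(axis=0)  # nearest grid point to each x_i
    lik = np.array([p_k @ np.exp(-0.5 * beta * (y - H[:, i]) ** 2)
                    for y, i in zip(y_D, idx)])
    data_term = -np.sum(np.log(lik))                     # -sum_i log sum_k p(k) e^{...}
    prior_term = 0.5 * lam * np.sum((np.diff(H, axis=1) / dx) ** 2) * dx
    return data_term + prior_term

H0 = np.vstack([np.zeros_like(x_grid), np.ones_like(x_grid)])
print(neg_log_posterior(H0))
\end{verbatim}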

For fixed $x$, uniform $p(k)$, and uniform $p(h\vert D_0)$, maximizing that posterior is equivalent to the clustering approach of Rose, Gurewitz, and Fox for squared distance costs [203].
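That connection can be illustrated by a small sketch of the resulting soft clustering iteration: with $x$ fixed, each $h_k$ collapses to a single centroid $\mu_k$, and alternating between responsibilities proportional to $e^{-\frac{\beta}{2}(y_i-\mu_k)^2}$ and responsibility-weighted centroid means is an EM-style ascent on the corresponding log-posterior, of the kind used in such deterministic-annealing clustering schemes. The data, the number of clusters, and the $\beta$ schedule below are assumptions of this illustration.

\begin{verbatim}
import numpy as np

# Soft clustering sketch for the fixed-x case: each h_k reduces to a single
# centroid mu_k, p(k) is uniform, and the prior is flat.  Responsibilities use
# the same Gaussian weights e^{-beta/2 (y_i - mu_k)^2} as the likelihood (331).
y = np.array([0.1, 0.3, 1.9, 2.1, 4.0, 4.2])   # assumed one-dimensional data
mu = np.array([0.0, 2.0, 5.0])                 # assumed initial centroids (m = 3)

for beta in [0.1, 1.0, 10.0]:                  # increasing beta ("cooling")
    for _ in range(50):
        # responsibilities r_{ik} ~ e^{-beta/2 (y_i - mu_k)^2}, normalized over k
        w = np.exp(-0.5 * beta * (y[:, None] - mu[None, :]) ** 2)
        r = w / w.sum(axis=1, keepdims=True)
        # centroid update: responsibility-weighted means of the data
        mu = (r * y[:, None]).sum(axis=0) / r.sum(axis=0)

print(mu)
\end{verbatim}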

