
Conditional Expectation Maximization

The EM algorithm can be extended by replacing Jensen's inequality with a different bound. Consider the variational upper bound on the logarithm, $x-1 \geq \log(x)$ (which becomes a lower bound on the negative logarithm, $-\log(x) \geq 1-x$). This bound satisfies a number of desiderata: (1) it makes contact at the current operating point, (2) it is tangential to the logarithm, (3) it is a tight bound, (4) it is simple and (5) it is the variational dual of the logarithm. Substituting this linear bound into the incremental conditional log-likelihood maintains a true lower bounding function Q (Equation 6).
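The first two desiderata can be checked numerically. The following is a minimal sketch, assuming the operating point corresponds to $x = 1$ (the value a likelihood ratio takes when $\Theta^t = \Theta^{t-1}$); the function names and sample points are hypothetical and chosen only for illustration.

```python
import math

def log_upper_bound(x):
    """Linear upper bound on log(x): x - 1 >= log(x) for x > 0."""
    return x - 1.0

def bound_gap(x):
    """Gap between the linear bound and the logarithm (non-negative for x > 0)."""
    return log_upper_bound(x) - math.log(x)

# Hypothetical sample points, chosen only for illustration.
xs = [0.1, 0.5, 1.0, 2.0, 10.0]
gaps = [bound_gap(x) for x in xs]

# Slope of log(x) at x = 1 by central finite difference; the bound has slope 1.
eps = 1e-6
log_slope_at_one = (math.log(1.0 + eps) - math.log(1.0 - eps)) / (2.0 * eps)

for x, g in zip(xs, gaps):
    print(f"x = {x:5.1f}   gap = {g:.6f}")
print(f"slope of log at x = 1: {log_slope_at_one:.6f}")
```

The gap vanishes only at $x = 1$, and the slopes agree there, which is what contact and tangency at the operating point mean for this bound.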



 
$\displaystyle \Delta l^c \geq Q(\Theta^t,\Theta^{t-1}) = \sum_{i=1}^N \sum_{m=1}^M \cdots - \frac{\sum_{n=1}^M p(n, {\bf x}_i \vert \Theta^t)}{\sum_{n=1}^M p(n, {\bf x}_i \vert \Theta^{t-1})} + 1$     (6)
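The step that turns the logarithm of the marginal ratio into the linear term ending Equation 6 can be written out explicitly. The shorthand $u_i$ is introduced here only for illustration and is not part of the original notation.

```latex
u_i = \frac{\sum_{n=1}^M p(n, {\bf x}_i \vert \Theta^{t})}
           {\sum_{n=1}^M p(n, {\bf x}_i \vert \Theta^{t-1})},
\qquad
\log u_i \leq u_i - 1
\;\Longrightarrow\;
-\log u_i \geq -u_i + 1 .
```

When $\Theta^t = \Theta^{t-1}$ the ratio is $u_i = 1$ and both sides equal zero, so the bound touches the true term at the current operating point.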


The Mixture of Experts formalism [4] offers a graceful representation of a conditional density using experts (conditional sub-models) and gates (marginal sub-models). The Q function adopts this form in Equation 7.
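As an illustration of the expert and gate decomposition, the sketch below evaluates a conditional density $p(y \vert x)$ for scalar $x$ and $y$. It assumes Gaussian gates $p(m, x)$ and linear-Gaussian experts $p(y \vert m, x)$; the parameter names and values are hypothetical and are not taken from the source.

```python
import math

def gaussian_pdf(v, mean, var):
    """Density of a univariate Gaussian N(v; mean, var)."""
    return math.exp(-0.5 * (v - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def gate(m, x, params):
    """Marginal sub-model p(m, x) = prior_m * N(x; x_mean_m, x_var_m)."""
    p = params[m]
    return p["prior"] * gaussian_pdf(x, p["x_mean"], p["x_var"])

def expert(m, x, y, params):
    """Conditional sub-model p(y | m, x), assumed linear-Gaussian here."""
    p = params[m]
    return gaussian_pdf(y, p["slope"] * x + p["offset"], p["y_var"])

def conditional_density(x, y, params):
    """p(y | x) = sum_m p(y | m, x) p(m, x) / sum_n p(n, x)."""
    gates = [gate(m, x, params) for m in range(len(params))]
    total = sum(gates)
    return sum(g / total * expert(m, x, y, params) for m, g in enumerate(gates))

# Hypothetical two-component model, for illustration only.
params = [
    {"prior": 0.4, "x_mean": -1.0, "x_var": 1.0, "slope": 0.5, "offset": 0.0, "y_var": 0.5},
    {"prior": 0.6, "x_mean": 2.0, "x_var": 0.8, "slope": -1.0, "offset": 3.0, "y_var": 0.7},
]

print(conditional_density(0.5, 1.0, params))
```

Normalizing the gates by their sum is what makes the result a density over $y$ for every fixed $x$.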



 
$\displaystyle \begin{array}{c} \sum_{i=1}^N \sum_{m=1}^M \left\{ h_{im} ( \cdots \right. \quad r_{i} = \left( \sum_{n=1}^M p(n, {\bf x}_i \vert \Theta^{t-1}) \right)^{-1} \end{array}$     (7)


Computing this Q function forms the CE-step of the Conditional Expectation Maximization algorithm, and it results in a simplified M-step. Note the absence of the logarithm of a sum and the decoupling of the models. This form allows a more straightforward computation of derivatives with respect to $\Theta^{t}$ and a more tractable M-step. For continuous missing data, a similar derivation holds.
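A CE-step of this kind might be sketched as follows for scalar data. The quantity $r_i$ follows its definition in Equation 7. The responsibilities $h_{im}$ are assumed here to be the usual EM posteriors $p(m, {\bf x}_i, {\bf y}_i \vert \Theta^{t-1}) / \sum_n p(n, {\bf x}_i, {\bf y}_i \vert \Theta^{t-1})$, which this section does not define. The Gaussian gates, linear-Gaussian experts, parameter names and data values are all hypothetical.

```python
import math

def gaussian_pdf(v, mean, var):
    """Density of a univariate Gaussian N(v; mean, var)."""
    return math.exp(-0.5 * (v - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def ce_step(xs, ys, params):
    """Return (h, r) evaluated at the previous parameters.

    h[i][m]: assumed responsibility of component m for the pair (x_i, y_i).
    r[i]:    reciprocal of the marginal sum_n p(n, x_i), as in Equation 7.
    """
    h, r = [], []
    for x, y in zip(xs, ys):
        # Gates p(m, x_i) under the previous parameters.
        gates = [p["prior"] * gaussian_pdf(x, p["x_mean"], p["x_var"]) for p in params]
        # Joints p(m, x_i, y_i) = p(y_i | m, x_i) * p(m, x_i).
        joints = [
            g * gaussian_pdf(y, p["slope"] * x + p["offset"], p["y_var"])
            for g, p in zip(gates, params)
        ]
        joint_total = sum(joints)
        h.append([j / joint_total for j in joints])
        r.append(1.0 / sum(gates))
    return h, r

# Hypothetical previous parameters and data, for illustration only.
prev_params = [
    {"prior": 0.4, "x_mean": -1.0, "x_var": 1.0, "slope": 0.5, "offset": 0.0, "y_var": 0.5},
    {"prior": 0.6, "x_mean": 2.0, "x_var": 0.8, "slope": -1.0, "offset": 3.0, "y_var": 0.7},
]
xs = [-1.0, 0.5, 2.0]
ys = [0.0, 1.0, 3.5]

h, r = ce_step(xs, ys, prev_params)
```

Both quantities depend only on $\Theta^{t-1}$, so they stay fixed while the M-step varies $\Theta^t$.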

At this point, without loss of generality, we focus on the case of a conditioned Gaussian mixture model and derive the corresponding M-step calculations. This serves as an implementation example for comparison purposes.


Tony Jebara
2000-03-20