The EM algorithm can be extended by replacing Jensen's inequality with a different bound. Consider the upper variational bound of a logarithm (which becomes a lower bound on the negative log). The proposed bound on the logarithm satisfies a number of desiderata: (1) it makes contact at the current operating point, (2) it is tangential to the logarithm, (3) it is a tight bound, (4) it is simple, and (5) it is the variational dual of the logarithm. Substituting this linear bound into the incremental conditional log-likelihood maintains a true lower bounding function Q (Equation 6).
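For concreteness, a standard linear bound with exactly these properties (written here in generic notation rather than the precise form of Equation 6) is the variational upper bound on the logarithm tangent at the current operating point $z_0$:
\[
  \log z \;\le\; \log z_0 + \frac{z - z_0}{z_0},
  \qquad \text{with equality and matching slope at } z = z_0 ,
\]
so that $-\log z \ge -\log z_0 - (z - z_0)/z_0$ is the corresponding lower bound on the negative log.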
The Mixture of Experts formalism [4] offers a graceful representation of a conditional density using experts (conditional sub-models) and gates (marginal sub-models). The Q function adopts this form in Equation 7.
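In generic notation (the symbols here are illustrative and not necessarily those of Equation 7), a mixture of experts expresses the conditional density as
\[
  p(y \mid x, \Theta) \;=\; \sum_{m} \underbrace{P(m \mid x, \Theta)}_{\text{gate}}\; \underbrace{p(y \mid x, m, \Theta)}_{\text{expert}},
\]
where each gate weights its expert's conditional prediction according to the input $x$.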
Computing this Q function forms the CE-step of the Conditional Expectation Maximization algorithm and results in a simplified M-step. Note the absence of the logarithm of a sum and the decoupling of the models. This form allows a more straightforward computation of derivatives with respect to the parameters and hence a more tractable M-step. For continuous missing data, a similar derivation holds.
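The decoupling can be seen schematically (again in generic notation, not the paper's exact Q): a term of the form $\log \sum_m a_m(\Theta_m)$ couples all component parameters through the logarithm, whereas its linear bound at the current operating point $s = \sum_m a_m(\Theta_m^{\text{old}})$,
\[
  \log \sum_{m} a_m(\Theta_m) \;\le\; \log s \;+\; \frac{1}{s}\Big(\sum_{m} a_m(\Theta_m) - s\Big)
  \;=\; \text{const} \;+\; \frac{1}{s}\sum_{m} a_m(\Theta_m),
\]
is additive over the components, so each $\Theta_m$ can be optimized independently.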
At this point, without loss of generality, we attend specifically to the case of a conditioned Gaussian mixture model and derive the corresponding M-step calculations. This serves as an implementation example for comparison purposes.
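As a point of reference (the notation below is generic and not necessarily that of the source), conditioning a joint Gaussian mixture $p(x,y) = \sum_m \alpha_m\, \mathcal{N}([x;y];\, \mu_m, \Sigma_m)$ on the input yields a mixture-of-experts form with Gaussian gates and linear-regression experts:
\[
  p(y \mid x) \;=\; \sum_{m}
  \frac{\alpha_m\, \mathcal{N}(x;\, \mu_m^{x}, \Sigma_m^{xx})}
       {\sum_{n} \alpha_n\, \mathcal{N}(x;\, \mu_n^{x}, \Sigma_n^{xx})}\;
  \mathcal{N}\!\big(y;\ \mu_m^{y} + \Sigma_m^{yx} (\Sigma_m^{xx})^{-1}(x - \mu_m^{x}),\
  \Sigma_m^{yy} - \Sigma_m^{yx} (\Sigma_m^{xx})^{-1} \Sigma_m^{xy}\big),
\]
and it is the parameters of such a conditional form that the M-step updates.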