The EM algorithm can be extended by replacing Jensen's inequality with a different bound. Consider the upper variational bound of a logarithm (which becomes a lower bound on the negative log). The proposed bound on the logarithm satisfies a number of desiderata: (1) it makes contact with the logarithm at the current operating point, (2) it is tangential to the logarithm, (3) it is a tight bound, (4) it is simple, and (5) it is the variational dual of the logarithm. Substituting this linear bound into the incremental conditional log-likelihood maintains a true lower bounding function Q (Equation 6).
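For concreteness, one standard form of such a linear variational bound is the tangent line to the logarithm at the current operating point; the notation below (z for the argument, z_0 for the operating point) is introduced here for illustration and need not match the parameterization used in Equation 6:
\[
\log z \;\le\; \log z_0 + \frac{z - z_0}{z_0}
\quad\Longleftrightarrow\quad
-\log z \;\ge\; -\log z_0 - \frac{z}{z_0} + 1 ,
\]
with equality (contact and tangency) at z = z_0, consistent with the desiderata above.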
The Mixture of Experts formalism [4] offers a graceful representation of a conditional density using experts (conditional sub-models) and gates (marginal sub-models). The Q function adopts this form in Equation 7.
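As a reminder of this representation (written in generic notation; Equation 7 may use different symbols), a mixture of M experts expresses the conditional density as gates weighting experts:
\[
p(y \mid x) \;=\; \sum_{m=1}^{M} \underbrace{P(m \mid x)}_{\text{gate}} \; \underbrace{p(y \mid x, m)}_{\text{expert}} .
\]
The gates form a marginal sub-model over the input (summing to one over m), while each expert is a conditional sub-model of the output.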
Computing this Q function forms the CE-step in the Conditional Expectation Maximization algorithm, and it results in a simplified M-step. Note the absence of the logarithm of a sum and the resulting decoupling of the models. This form allows a more straightforward computation of derivatives with respect to the model parameters and hence a more tractable M-step. A similar derivation holds for continuous missing data.
At this point, without loss of generality, we attend specifically to the case of a conditioned Gaussian mixture model and derive the corresponding M-step calculations. This serves as an implementation example for comparison purposes.
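For readers who want a concrete reference point, the sketch below fits such a conditioned Gaussian mixture in one dimension (Gaussian gates over x, linear-Gaussian experts for y given x) with ordinary responsibility-weighted updates. It is a standard joint-likelihood EM baseline of the kind CEM is compared against, not the bound-based CE-step/M-step derived in this section; the function name and parameters are illustrative.

```python
import numpy as np

def fit_conditioned_gaussian_mixture(x, y, n_components=2, n_iters=50, seed=0):
    """Baseline EM fit of a 1-D conditioned Gaussian mixture:
    Gaussian gates over x, linear-Gaussian experts y = a*x + b + noise."""
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = x.size
    # Gate parameters: mixing weights, means and variances over x.
    pi = np.full(n_components, 1.0 / n_components)
    mu = rng.choice(x, n_components, replace=False)
    vx = np.full(n_components, x.var() + 1e-6)
    # Expert parameters: slope a, intercept b, output noise variance vy.
    a = np.zeros(n_components)
    b = np.full(n_components, y.mean())
    vy = np.full(n_components, y.var() + 1e-6)
    X = np.column_stack([x, np.ones(n)])          # design matrix for experts

    for _ in range(n_iters):
        # E-step: responsibilities h[i, m] proportional to gate_m(x_i) * expert_m(y_i | x_i).
        gate = pi * np.exp(-0.5 * (x[:, None] - mu) ** 2 / vx) / np.sqrt(2 * np.pi * vx)
        resid = y[:, None] - (a * x[:, None] + b)
        expert = np.exp(-0.5 * resid ** 2 / vy) / np.sqrt(2 * np.pi * vy)
        h = gate * expert + 1e-300
        h /= h.sum(axis=1, keepdims=True)

        # M-step: responsibility-weighted maximum-likelihood updates
        # (joint likelihood; NOT the bound-based CE M-step of the paper).
        Nm = h.sum(axis=0)
        pi = Nm / n
        mu = (h * x[:, None]).sum(axis=0) / Nm
        vx = (h * (x[:, None] - mu) ** 2).sum(axis=0) / Nm + 1e-6
        for m in range(n_components):
            w = h[:, m]
            WX = X * w[:, None]
            coef = np.linalg.solve(WX.T @ X, WX.T @ y)   # weighted least squares
            a[m], b[m] = coef
            vy[m] = (w * (y - X @ coef) ** 2).sum() / Nm[m] + 1e-6
    return pi, mu, vx, a, b, vy
```

Calling fit_conditioned_gaussian_mixture(x, y) on data drawn from two regression regimes recovers two experts with distinct slopes; maximizing the conditional likelihood p(y|x) directly, as CEM does, can yield different gate and expert parameters than this joint-likelihood baseline.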