Submitted to: Neural Computation
We present methods for coupling hidden Markov models (HMMs) to model systems of multiple interacting processes. The resulting models have multiple state variables that are temporally coupled via matrices of conditional probabilities. We introduce a deterministic O(T(CN)^2) approximation for maximum a posteriori (MAP) state estimation which enables fast classification and parameter estimation via expectation maximization. An ``N-heads'' dynamic programming algorithm samples from the highest-probability paths through a compact state trellis, minimizing an upper bound on the cross entropy with the full (combinatoric) dynamic programming problem. The complexity is O(T(CN)^2) for C chains of N states apiece observing T data points, compared with O(TN^(2C)) for naive (Cartesian product), exact (state clustering), and stochastic (Monte Carlo) methods applied to the same inference problem. In several experiments examining training time, model likelihoods, classification accuracy, and robustness to initial conditions, coupled HMMs compared favorably with conventional HMMs and with energy-based approaches to coupled inference chains. We demonstrate and compare these algorithms on synthetic and real data, including interpretation of video.
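As a rough illustration of the complexity gap stated above, the sketch below tallies transition-edge evaluations per time step for the two approaches. The function names and counts are illustrative assumptions for exposition only; this is not an implementation of the N-heads algorithm itself.

```python
# Illustrative operation counts for inference over C coupled chains of N
# states observing T data points. These tallies count candidate transition
# evaluations per time step; they sketch the asymptotic gap only.

def naive_ops(T, C, N):
    # Exact inference on the Cartesian-product HMM: the joint state space
    # has N**C states, and each time step scores all (N**C)**2 transitions,
    # giving the O(TN^(2C)) cost quoted for the naive methods.
    return T * (N ** C) ** 2

def nheads_ops(T, C, N):
    # N-heads approximation: a compact trellis over the C*N per-chain
    # states, with (C*N)**2 candidate transitions per time step,
    # giving the O(T(CN)^2) cost.
    return T * (C * N) ** 2

# Example: 3 chains of 5 states over 100 time steps.
print(naive_ops(100, 3, 5))   # 100 * 125**2 = 1562500
print(nheads_ops(100, 3, 5))  # 100 * 15**2  = 22500
```

Even at this modest size, the compact trellis is roughly 70 times cheaper per sequence, and the gap grows exponentially in the number of chains C.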