We describe a computer vision system for observing facial motion that couples an optimal estimation optical flow method with geometric and physical (muscle) models of the facial structure. Our method produces a reliable parametric representation of the face's independent muscle action groups, as well as an accurate estimate of facial motion.
Previous efforts at analysis of facial expression have been based on the Facial Action Coding System (FACS), a representation developed to allow human psychologists to code expression from static pictures. To avoid use of this heuristic coding scheme, we have used our computer vision system to probabilistically characterize facial motion and muscle activation in an experimental population, thus deriving a new, more accurate representation of human facial expressions that we call FACS+.
We use this new representation for recognition in two different ways. The first method uses the physics-based model directly, recognizing expressions by comparing estimated muscle activations. The second method uses the physics-based model to generate spatio-temporal motion-energy templates of the whole face for each expression. These simple, biologically plausible motion-energy ``templates'' are then used for recognition. Both methods show substantially greater accuracy at expression recognition than has been previously achieved.
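
As a rough illustration of the second, template-based method, the following sketch accumulates optical flow magnitude into a motion-energy image and assigns the label of the nearest stored template. It is a minimal sketch under stated assumptions, not the system described above: the function names `motion_energy` and `classify_by_template`, the collapse of time into a single image, and the use of Euclidean distance as the similarity measure are all hypothetical choices made for illustration.

```python
import numpy as np

def motion_energy(flow):
    """Accumulate per-pixel flow magnitude over time.

    flow: array of shape (T, H, W, 2) holding (dx, dy) per pixel per frame.
    Returns an (H, W) motion-energy image. (Assumption: time is summed out;
    a spatio-temporal template would keep the time axis.)
    """
    return np.linalg.norm(flow, axis=-1).sum(axis=0)

def classify_by_template(energy, templates):
    """Return the label whose template is closest to `energy`.

    templates: dict mapping expression label -> (H, W) motion-energy template.
    Euclidean distance is an assumed similarity measure.
    """
    return min(templates, key=lambda label: np.linalg.norm(energy - templates[label]))
```

Given a dictionary of per-expression templates, an observed flow sequence would be classified with `classify_by_template(motion_energy(flow), templates)`.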