Tony S. Jebara and Alex Pentland
Media Laboratory, Massachusetts Institute of Technology
Cambridge, MA 02139
November 28th, 1996
A real-time system is described for automatically detecting, modeling and tracking faces in 3D. A closed-loop approach is proposed which uses structure from motion to generate a 3D model of a face and then feeds the estimated structure back to constrain feature tracking in the next frame. The system initializes by using skin classification, symmetry operations, 3D warping and eigenfaces to find a face. Feature trajectories are then computed by SSD (sum of squared differences) or correlation-based tracking. The trajectories are simultaneously processed by an extended Kalman filter to stably recover 3D structure, camera geometry and facial pose. Adaptively weighted estimation is used in this filter by modeling the noise characteristics of the 2D image patch tracking technique. In addition, the structural estimate is constrained by using parametrized models of facial structure (eigen-heads). The Kalman filter's estimate of the 3D state and motion of the face predicts the trajectory of the features, which in turn constrains the search space for the next frame in the video sequence. The closed loop of feature tracking and Kalman filtering operates at 30 Hz.
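The constrained feature search mentioned above can be illustrated with a minimal sketch. It is not the system's implementation: the function names `ssd` and `track_feature`, the `radius` parameter, and the toy 8x8 frame are all assumptions made for illustration, and the predicted position is simply supplied by hand rather than produced by an extended Kalman filter. The sketch only shows how a prediction limits the window in which an SSD match is sought.

```python
def ssd(image, template, top, left):
    """Sum of squared differences between the template and the image patch at (top, left)."""
    total = 0.0
    for r, row in enumerate(template):
        for c, value in enumerate(row):
            diff = image[top + r][left + c] - value
            total += diff * diff
    return total


def track_feature(image, template, predicted, radius):
    """Return the lowest-SSD position within `radius` pixels of `predicted`.

    `predicted` is a hypothetical (row, col) guess for the patch's top-left
    corner; in a closed-loop tracker it would come from the filter's prediction.
    """
    height, width = len(image), len(image[0])
    t_h, t_w = len(template), len(template[0])
    row_lo = max(0, predicted[0] - radius)
    row_hi = min(height - t_h, predicted[0] + radius)
    col_lo = max(0, predicted[1] - radius)
    col_hi = min(width - t_w, predicted[1] + radius)
    best_pos, best_score = None, float("inf")
    for top in range(row_lo, row_hi + 1):
        for left in range(col_lo, col_hi + 1):
            score = ssd(image, template, top, left)
            if score < best_score:
                best_pos, best_score = (top, left), score
    return best_pos, best_score


# Toy frame (assumed data): a dark 8x8 image with one textured 2x2 patch at (4, 5).
template = [[9, 7], [5, 3]]
frame = [[0] * 8 for _ in range(8)]
for r in range(2):
    for c in range(2):
        frame[4 + r][5 + c] = template[r][c]

# A prediction one pixel off still finds the patch within a small window.
position, score = track_feature(frame, template, predicted=(3, 4), radius=2)
```

In this sketch a good prediction lets a window of 25 candidate positions replace a search over the whole frame, which is the cost saving the closed loop relies on; a poor prediction with the same radius would miss the patch entirely.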