In recent years there has been much interest in tracking the human body using 3-D models with kinematic and dynamic constraints. Perhaps the first efforts at body tracking were by Badler and O'Rourke 1980, followed by Hogg 1988 [11,10]. These early efforts used edge information to drive a kinematic model of the human body. More recently, several authors have applied variations on this basic method to the body tracking problem [18,2]. Because of the highly non-linear nature of the problem, all of these systems require fairly precise initialization, and cannot handle the full range of common body motion.
Following this early work using kinematic models, some researchers began using dynamic constraints to track the human body. Pentland and Horowitz 1991 employed non-rigid finite element models driven by optical flow [12], and Metaxas and Terzopoulos 1993 employed deformable superquadrics [7,9] driven by 3-D point and 2-D edge measurements. Again, these systems required precise initialization and could handle only a limited range of body motion.
More recently, Gavrila and Davis [5] and Rehg and Kanade [17] have demonstrated that an analysis-synthesis approach, using kinematic models driven by edge data, has the potential to deal with limited occlusions, and thus to handle a greater range of body motions.
The work described in this paper attempts to combine the dynamic modeling work with the advantages of an analysis-synthesis approach, by using an extended Kalman filter formulation that couples a fully dynamic skeletal model with observations of raw pixel values, as modeled by probabilistic `blob' models.
This system also attempts to explicitly incorporate learned patterns of control into the body model. The approach we take is based on the behavior modeling framework introduced in Pentland and Liu 1995 [14]; it is also related to the behavior modeling work of Blake 1996 [6] and Bregler 1997 [3]. However, our controller operates on a 3-D non-linear model of human motion that is closer to true body dynamics than 2-D linear models are.