Related Work

In recent years there has been much interest in tracking the human body using 3-D models with kinematic and dynamic constraints. Perhaps the first efforts in body tracking were by Badler and O'Rourke in 1980 [15], followed by Hogg in 1988 [14] and other variations on their basic method [20,3]. With the exception of Badler, these early efforts employed 2-D kinematic models of the human body driven by edge information.

Gavrila and Davis [8] and Rehg and Kanade [19] used 3-D kinematic models driven by edge data from multiple cameras in an analysis-synthesis framework. Both of these systems used the kinematic models to deal with limited occlusions, and thus could begin to handle a greater range of body motions.

In parallel with that work, some researchers began using dynamic models to track the human body. In 1991, Pentland and Horowitz employed non-rigid finite element models driven by optical flow [16], and in 1993 Metaxas and Terzopoulos presented a system employing deformable superquadrics [10,13] driven by 3-D point and 2-D edge measurements. More recently, Bregler [5] has combined dynamic techniques with 2-D, region-based features with good results.

These systems all suffer from problems with self-occlusion. In general, they either rely on expensive search techniques to find consistent solutions or simply drop occluded body parts. This situation is exacerbated by the common use of the analysis-synthesis framework, which forces early processing stages to make decisions without the benefit of context that is readily available in other parts of the system. Isard and Blake [9] call this the problem of ambiguous data. They suggest the Condensation method, which avoids this search by using a probabilistic framework to carry a multitude of possible hypotheses. Unfortunately, it requires the propagation of hundreds to thousands of hypotheses to track even a single hand, and thus remains very far from real-time for this problem.
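To make the cost argument concrete, the following is a minimal sketch of one cycle of a Condensation-style sampler over a scalar state. It is an illustration only, not the implementation of Isard and Blake [9]: the random-walk motion model, the Gaussian likelihood, the parameter values, and the names condensation_step, estimate, and track are all assumptions introduced here.

```python
import math
import random


def condensation_step(particles, weights, observation, rng,
                      process_std=0.5, obs_std=1.0):
    """One select/predict/measure cycle over a 1-D state (illustrative only)."""
    n = len(particles)
    # Select: resample hypotheses in proportion to their current weights.
    selected = rng.choices(particles, weights=weights, k=n)
    # Predict: diffuse each hypothesis with an assumed random-walk model.
    predicted = [x + rng.gauss(0.0, process_std) for x in selected]
    # Measure: reweight each hypothesis by an assumed Gaussian likelihood.
    likelihoods = [math.exp(-0.5 * ((observation - x) / obs_std) ** 2)
                   for x in predicted]
    total = sum(likelihoods)
    if total == 0.0:
        # Every hypothesis is far from the observation; fall back to uniform.
        new_weights = [1.0 / n] * n
    else:
        new_weights = [w / total for w in likelihoods]
    return predicted, new_weights


def estimate(particles, weights):
    """Weighted mean of the hypothesis set."""
    return sum(x * w for x, w in zip(particles, weights))


def track(observations, n_particles=1000, seed=0):
    """Run the sampler over a sequence of scalar observations."""
    rng = random.Random(seed)
    # Hypothetical prior: hypotheses spread uniformly over [-10, 10].
    particles = [rng.uniform(-10.0, 10.0) for _ in range(n_particles)]
    weights = [1.0 / n_particles] * n_particles
    for z in observations:
        particles, weights = condensation_step(particles, weights, z, rng)
    return estimate(particles, weights)
```

Each cycle evaluates the likelihood once per hypothesis, so the per-frame cost grows linearly with the size of the hypothesis set. In this sketch the likelihood is a single exponential; with an image-based measurement model each evaluation is far more expensive, which is why carrying hundreds to thousands of hypotheses is costly.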

Recent work by Metaxas [11] comes close to breaking the barrier between ambiguous features and the context needed to resolve them by using dynamic predictions to drive a viewpoint selection algorithm, but the feature computation itself is a blind feed-forward process.

This work combines 3-D dynamic and kinematic models with region-based features to solve the body tracking problem. It is unique in two important ways: it is a fully recursive formulation, and it runs in real time. As we will explain below, this recursive framework makes the system robust to occlusions and numerous other visual ambiguities.

1999-02-13