[Figure: A synthetic character taking direction from a human user who is being tracked in 3-D with stereo vision.]
A literal mapping is one that treats the tracking features as exactly what they are: evidence about the physical configuration of the user in the real world. In this context the tracking information is immediately useful for interpreting simple pointing gestures. With substantially more work, a system can use the same information to estimate a more complete picture of the user's body configuration.
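As an illustration of such a literal interpretation, consider the simplest case mentioned above: a pointing gesture. The sketch below (a hypothetical example, not part of any system described here) assumes the pointing ray runs from the tracked head through the tracked hand, and computes where that ray meets a target plane at z = 0:

```python
# Hypothetical sketch: inferring a pointing target from tracked 3-D
# head and hand positions. Assumes the pointing ray runs from the
# head through the hand, and the target surface is the plane z = 0.

def pointing_target(head, hand, plane_z=0.0):
    """Intersect the head->hand ray with the plane z = plane_z."""
    hx, hy, hz = head
    px, py, pz = hand
    dx, dy, dz = px - hx, py - hy, pz - hz   # ray direction
    if abs(dz) < 1e-9:
        return None                          # ray parallel to the plane
    t = (plane_z - hz) / dz
    if t < 0:
        return None                          # plane is behind the user
    return (hx + t * dx, hy + t * dy, plane_z)

# Head at (0, 0, 2) m, hand at (0.3, 0, 1.7) m: pointing forward and down.
target = pointing_target((0.0, 0.0, 2.0), (0.3, 0.0, 1.7))
```

The same two tracked features that suffice for this geometric interpretation are exactly what a stereo front end like the one described next can provide.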
Azarbayejani and Pentland [3] are currently building a system which combines Pfinder-based monocular tracking systems into a wide-baseline stereo system called the STereo Interactive Virtual Environment (STIVE). STIVE resolves 3-D position and orientation from the 2-D position and orientation estimates produced by the individual trackers.
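The core geometric step in any such wide-baseline arrangement is triangulation: each calibrated camera back-projects its 2-D feature position into a ray, and the 3-D estimate is the point closest to both rays. The following sketch (an illustrative textbook method, not the STIVE implementation) uses the midpoint of the shortest segment between the two rays:

```python
# Illustrative sketch (not the STIVE implementation): recovering a 3-D
# point from two calibrated views. Each camera contributes a ray origin
# and direction; the estimate is the midpoint of the shortest segment
# between the two back-projected rays.

def dot(a, b): return sum(x * y for x, y in zip(a, b))
def sub(a, b): return tuple(x - y for x, y in zip(a, b))
def add(a, b): return tuple(x + y for x, y in zip(a, b))
def scale(a, s): return tuple(x * s for x in a)

def triangulate(o1, d1, o2, d2):
    """Midpoint of closest approach of rays o1 + t*d1 and o2 + s*d2."""
    w = sub(o1, o2)
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; no depth from this baseline")
    t = (b * e - c * d) / denom
    s = (a * e - b * d) / denom
    p1 = add(o1, scale(d1, t))
    p2 = add(o2, scale(d2, s))
    return tuple((x + y) / 2 for x, y in zip(p1, p2))

# Two cameras one metre apart, both observing a point at (0.5, 0, 2).
point = triangulate((0.0, 0.0, 0.0), (0.5, 0.0, 2.0),
                    (1.0, 0.0, 0.0), (-0.5, 0.0, 2.0))
```

The midpoint form is robust to the small ray misalignments that calibration and tracking noise inevitably introduce, which is why the two rays are not required to intersect exactly.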
Wren and Pentland have recently combined STIVE with a literal mapping between user configuration and corresponding parts of an animated character to create an animation-by-example system. The system allows the user to animate the upper body movements of a virtual puppet by executing the corresponding motions (see Figure 5.2). The features from the vision system drive a dynamic human-body model inside the puppet.
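A toy analogue of such a literal mapping (hypothetical, and much simpler than the dynamic body model used in the actual system) would re-target a puppet's arm to each tracked hand position using standard two-link inverse kinematics:

```python
# Toy analogue (not the Wren-Pentland system) of a literal mapping:
# each tracked hand position directly becomes the target of a planar
# two-link puppet arm, solved with textbook two-link IK.
import math

def two_link_ik(target_x, target_y, l1=0.3, l2=0.25):
    """Shoulder and elbow angles (radians) placing the wrist at the target."""
    r2 = target_x**2 + target_y**2
    # Clamp to the reachable range so tracking noise cannot break acos.
    cos_elbow = (r2 - l1 * l1 - l2 * l2) / (2 * l1 * l2)
    cos_elbow = max(-1.0, min(1.0, cos_elbow))
    elbow = math.acos(cos_elbow)
    shoulder = math.atan2(target_y, target_x) - math.atan2(
        l2 * math.sin(elbow), l1 + l2 * math.cos(elbow))
    return shoulder, elbow

# Each new tracked hand position re-targets the arm directly.
s, e = two_link_ik(0.4, 0.2)
```

In the real system the mapping feeds a dynamic model rather than a kinematic solver, so the puppet's motion remains smooth and physically plausible even when the tracking features are noisy.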