A new view-based approach to the representation and recognition of action is presented. The work is motivated by the observation that a human observer can easily and instantly recognize action in extremely low-resolution imagery with no strong features or information about the three-dimensional structure of the scene. Our underlying representations for action are view-based descriptions of the coarse image motion. Using these descriptions, we propose an appearance-based recognition strategy embedded within a hypothesize-and-test paradigm.
A binary motion region (BMR) image is initially computed to act as an index into the action library. The BMR grossly describes the spatial distribution of motion energy for a given view of a given action. Any stored BMRs that plausibly match the unknown input BMR are then tested for a coarse, categorical agreement with a known motion model of the action.
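As a rough illustration of the BMR idea, the sketch below marks every pixel at which motion occurred anywhere in a sequence. The text above does not say how motion is detected, so the sketch assumes simple thresholded frame differencing; the function name `binary_motion_region` and the `threshold` parameter are hypothetical and chosen only for this example.

```python
def binary_motion_region(frames, threshold=10):
    """Return a binary image that is 1 wherever any consecutive-frame
    difference exceeds `threshold` (assumed motion detector)."""
    height, width = len(frames[0]), len(frames[0][0])
    bmr = [[0] * width for _ in range(height)]
    for prev, curr in zip(frames, frames[1:]):
        for y in range(height):
            for x in range(width):
                if abs(curr[y][x] - prev[y][x]) > threshold:
                    bmr[y][x] = 1  # motion seen here at some point
    return bmr


# A bright pixel moving left to right along the top row of a 2x3 image.
frames = [
    [[255, 0, 0], [0, 0, 0]],
    [[0, 255, 0], [0, 0, 0]],
    [[0, 0, 255], [0, 0, 0]],
]
bmr = binary_motion_region(frames)
```

The result records where motion happened but not when, which is why it can serve only as a coarse index before a more detailed verification step.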
We have developed two motion-based methods for the verification of the hypothesized actions. The first approach collapses the temporal variations of region-based motion parameters into a single, low-order coefficient vector. A statistical acceptance region generated around the coefficients is then used to classify an input against the training instances. In the second approach, a motion history image (MHI) is the basis of the representation. The MHI is a static image where pixel intensity is a function of the recency of motion in a sequence. Recognition is accomplished in a feature-based statistical framework.
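To make the MHI description concrete, here is a minimal sketch of one possible recency encoding. The text states only that intensity is a function of how recently motion occurred; the sketch assumes a linear decay rule in which a moving pixel is set to a duration value `tau` and every other pixel is decremented by one per frame, floored at zero. The names `update_mhi`, `motion_history_image`, and `tau` are illustrative, and the binary motion masks are taken as given input.

```python
def update_mhi(mhi, motion_mask, tau):
    """One time step: moving pixels become tau, others decay by 1 (min 0)."""
    return [
        [tau if moving else max(0, value - 1)
         for value, moving in zip(mhi_row, mask_row)]
        for mhi_row, mask_row in zip(mhi, motion_mask)
    ]


def motion_history_image(motion_masks, tau):
    """Fold a sequence of binary motion masks into a single static image."""
    height, width = len(motion_masks[0]), len(motion_masks[0][0])
    mhi = [[0] * width for _ in range(height)]
    for mask in motion_masks:
        mhi = update_mhi(mhi, mask, tau)
    return mhi


# Motion sweeping left to right across a 1x3 image over three frames.
masks = [
    [[1, 0, 0]],
    [[0, 1, 0]],
    [[0, 0, 1]],
]
mhi = motion_history_image(masks, tau=3)
```

In this example the final image is brightest where motion was most recent, so the direction of the sweep is readable from a single static image.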
Results employing multiple cameras show reasonable recognition with an MHI verification method that automatically performs temporal segmentation, is invariant to linear changes in speed, and runs in real time on a standard platform.