TR#402: The Representation and Recognition of Action Using Temporal Templates

James W. Davis and Aaron F. Bobick

Appears in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR'97)

A new view-based approach to the representation and recognition of action is presented. The basis of the representation is a temporal template -- a static vector-image where the vector value at each point is a function of the motion properties at the corresponding spatial location in an image sequence. Using 18 aerobics exercises as a test domain, we explore the representational power of a simple, two-component version of the templates: the first component is a binary value indicating the presence of motion, and the second is a function of the recency of motion in the sequence. We then develop a recognition method that matches these temporal templates against stored instances of views of known actions. The method automatically performs temporal segmentation, is invariant to linear changes in speed, and runs in real time on a standard platform. We recently incorporated this technique into the KidsRoom: an interactive, narrative play-space for children.
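
For concreteness, the sketch below shows one plausible way to maintain the two template components frame by frame. It is not the authors' implementation: the frame-differencing threshold `xi` and the temporal window `tau` are illustrative parameters, and NumPy is assumed. In the paper, the binary component corresponds to a motion-energy image and the recency component to a motion-history image.

```python
import numpy as np

def update_templates(prev_frame, curr_frame, mhi, tau=30, xi=15):
    """Update the two-component temporal template with one new frame pair.

    prev_frame, curr_frame : 2-D uint8 grayscale images
    mhi : 2-D float array (recency component), carried across calls
    tau : number of frames over which motion is remembered (assumed value)
    xi  : threshold on absolute frame difference (assumed value)
    """
    # Binary motion mask: pixels whose intensity changed between frames.
    moving = np.abs(curr_frame.astype(np.int16) -
                    prev_frame.astype(np.int16)) > xi

    # Recency component: moving pixels are set to tau; all others decay
    # by one per frame until they reach zero.
    mhi = np.where(moving, float(tau), np.maximum(mhi - 1.0, 0.0))

    # Binary component: any pixel that moved within the last tau frames.
    mei = mhi > 0
    return mei, mhi
```

Recognition would then compare shape descriptors of a candidate template (the paper uses moment-based features) against those of stored views of known actions; the decay over `tau` frames is what allows the match to be made at multiple temporal windows, giving the speed invariance noted above.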