This thesis presents new theory and technology for the representation and recognition of complex, context-sensitive human actions in interactive spaces. To represent action and interaction, a symbolic framework has been developed based on Roger Schank's conceptualizations, augmented by a mechanism to represent the temporal structure of the sub-actions based on Allen's interval algebra networks (IA-networks). To overcome the exponential nature of temporal constraint propagation in such networks, we have developed the PNF propagation algorithm, based on the projection of IA-networks into simplified, 3-valued (past, now, future) constraint networks called PNF-networks.
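As an illustration of the idea of a 3-valued constraint network, the following minimal sketch projects the Allen relation "before" onto pairs of past/now/future states and shrinks each node's set of possible states with a simple arc-consistency pass. The node names, the `restrict` function, and the use of plain arc consistency are assumptions made for this example; it is not the PNF propagation algorithm as defined in the thesis.

```python
# Hypothetical sketch: states of an interval relative to the current instant.
P, N, F = "P", "N", "F"          # past, now, future
PNF = {P, N, F}                  # no information about a node

# Projection of the Allen relation "A before B" onto (state of A, state of B).
# If A is happening now or is still in the future, B must be in the future;
# once A is past, B may be in any state.
BEFORE = {(P, P), (P, N), (P, F), (N, F), (F, F)}


def restrict(domains, constraints):
    """Remove states that have no support under any constraint.

    domains:     dict mapping node name -> set of possible states
    constraints: dict mapping (node_a, node_b) -> set of allowed state pairs
    Repeats until no domain changes (a fixed point is reached).
    """
    domains = {name: set(states) for name, states in domains.items()}
    changed = True
    while changed:
        changed = False
        for (a, b), allowed in constraints.items():
            new_a = {x for x in domains[a]
                     if any((x, y) in allowed for y in domains[b])}
            new_b = {y for y in domains[b]
                     if any((x, y) in allowed for x in domains[a])}
            if new_a != domains[a] or new_b != domains[b]:
                domains[a], domains[b] = new_a, new_b
                changed = True
    return domains


# Hypothetical action with three sub-actions in sequence.
constraints = {("grab", "chop"): BEFORE, ("chop", "serve"): BEFORE}

# A sensor reports that "chop" is happening now; nothing is known about the rest.
state = restrict({"grab": PNF, "chop": {N}, "serve": PNF}, constraints)
print(state)
```

In this example, observing that the hypothetical sub-action `chop` is happening now is enough to infer that `grab` is past and `serve` is still in the future, without reasoning over the full interval algebra.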
The PNF propagation algorithm has been applied to an action recognition vision system that handles actions composed of multiple, parallel threads of sub-actions, in situations that cannot be handled efficiently by commonly used temporal representation schemes such as finite-state machines and hidden Markov models (HMMs). The PNF propagation algorithm is also the basis of interval scripts, a scripting paradigm for interactive systems that represents interaction as a set of temporal constraints between the individual components of the interaction. Unlike previously proposed non-procedural scripting methods, we use a strong temporal representation (allowing, for example, mutually exclusive actions) and perform control by propagating the temporal constraints in real time.
These concepts have been tested in the context of four projects involving story-driven interactive spaces. The action representation framework has been used in the Intelligent Studio project to enhance the control of automatic cameras in a TV studio. Interval scripts have been extensively employed in the development of "SingSong", a short interactive performance that introduced the idea of live interaction with computer graphics characters; in "It/I", a full-length computer theater play; and in "It", an interactive art installation based on the play "It/I" that realizes our concept of immersive stages, that is, interactive spaces that can be used by both performers and the public.