We present several approaches to the machine perception of motion and discuss the role and levels of knowledge in each. In particular, we describe different techniques of motion understanding as focusing on one of movement, activity, or action. Movements are the most atomic primitives, requiring no contextual or sequence knowledge to be recognized; movement is often addressed using either view-invariant or view-specific geometric techniques. Activity refers to sequences of movements or states, where the only real knowledge required is the statistics of the sequence; much of the recent work in gesture understanding falls within this category of motion perception. Finally, actions are larger-scale events that typically include interaction with the environment and causal relationships; action understanding straddles the gray division between perception and cognition, and between computer vision and artificial intelligence. One distinction between these levels is the degree to which time must be explicitly represented and manipulated, ranging from simple linear scaling of speed to constraint-based reasoning on temporal intervals. We illustrate these levels with examples drawn mostly from our work in understanding motion in video imagery, and we argue that the utility of such a division is that it makes explicit the representational competencies and manipulations necessary for perception.