We introduce an online adaptive algorithm for learning gesture models. Learning gesture models online makes the recognition process more robust and removes the need to train on a large training ensemble. Hidden Markov models represent the spatial and temporal structure of each gesture. The usual output probability distributions, which typically represent appearance, are trained at runtime by exploiting the temporal structure (the Markov model), which is either trained offline or explicitly hand-coded. In the early stages of runtime adaptation, contextual information derived from the application biases the expectation of which Markov state the system is in at any given time. We describe Watch and Learn, a computer vision system that learns simple gestures online for interactive control.
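To make the idea concrete, the following is a minimal sketch, not the paper's implementation, of how the output distributions of an HMM might be adapted online while the transition structure stays fixed. It assumes diagonal-covariance Gaussian outputs, a forward-filtering belief update, and a stochastic EM-style update of the output parameters weighted by the state posterior; the names `OnlineAdaptiveHMM` and `context_prior` are illustrative, and the context prior stands in for the application-derived bias described above.

```python
import numpy as np

def gaussian_pdf(x, mean, var):
    """Diagonal-covariance Gaussian likelihood of observation x."""
    d = x - mean
    norm = np.sqrt((2.0 * np.pi) ** len(x) * np.prod(var))
    return np.exp(-0.5 * np.sum(d * d / var)) / norm

class OnlineAdaptiveHMM:
    """Hypothetical sketch: fixed Markov (transition) structure,
    output distributions adapted at runtime."""

    def __init__(self, transition, dim, lr=0.05):
        self.A = np.asarray(transition)   # offline-trained or hand-coded
        n = self.A.shape[0]
        self.means = np.zeros((n, dim))   # appearance models, learned online
        self.vars = np.ones((n, dim))
        self.belief = np.ones(n) / n      # current state posterior
        self.lr = lr

    def step(self, obs, context_prior=None):
        # Predict: propagate the belief through the transition matrix.
        pred = self.belief @ self.A
        # Early in adaptation, bias the state expectation with an
        # application-derived prior (assumed supplied by the caller).
        if context_prior is not None:
            pred = pred * context_prior
        # Correct: weight by output likelihoods of the new observation.
        lik = np.array([gaussian_pdf(obs, m, v)
                        for m, v in zip(self.means, self.vars)])
        post = pred * lik
        post = post / (post.sum() + 1e-12)
        self.belief = post
        # Online update of each state's output distribution, weighted
        # by how strongly the observation is attributed to that state.
        for s, p in enumerate(post):
            w = self.lr * p
            self.means[s] += w * (obs - self.means[s])
            self.vars[s] += w * ((obs - self.means[s]) ** 2 - self.vars[s])
        return post
```

Because the output distributions start uninformative, the early state posterior is driven almost entirely by the transition structure and the context prior, which is what lets the appearance models bootstrap themselves from the application's expectations.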