This correspondence discusses interactive graphics systems driven by visual input. It describes the underlying computer vision techniques and presents a theoretical formulation that addresses accuracy, computational efficiency, and compensation for display latency. Experimental results quantitatively compare the accuracy of the visual technique with that of traditional sensing. An extension of the basic technique to include structure recovery is also discussed.