Distributed Analysis and Representation of Visual Motion

Eero Peter Simoncelli

Published as:
Doctoral Thesis
MIT Department of Electrical Engineering and Computer Science
Cambridge, MA
January, 1993

This thesis describes some new approaches to the representation and analysis of visual motion, as perceived by a biological or machine visual system. We begin by discussing the computation of image motion fields, the projection of motion in the three-dimensional world onto the two-dimensional image plane. This computation is notoriously difficult, and a wide variety of approaches have been developed for use in image processing, machine vision, and biological modeling. We show that a large number of the basic techniques are quite similar in nature, differing primarily in conceptual motivation, and that they each fail to handle a set of situations that occur commonly in natural scenery.
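
To make one of these techniques concrete, the differential (gradient-based) formulation assumes that image intensity $I(x,y,t)$ is conserved along motion trajectories; a first-order expansion of this assumption yields the standard gradient constraint
\[
  \nabla I \cdot \vec{v} \;+\; I_t \;=\; 0,
\]
where $\nabla I = (I_x, I_y)$ contains the spatial derivatives of intensity, $I_t$ is the temporal derivative, and $\vec{v} = (v_x, v_y)$ is the image-plane velocity. A single constraint of this form determines only the component of $\vec{v}$ along the gradient direction (the well-known aperture problem), one illustration of why additional assumptions must be imposed and why those assumptions can fail in natural scenery.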

The central theme of the thesis is that the failure of these algorithms is due primarily to the use of vector fields as a {\em representation} for visual motion. We argue that the translational vector field representation is inherently impoverished and error-prone. Furthermore, there is evidence that a direct optical flow representation scheme is not used by biological systems for motion analysis. Instead, we advocate {\em distributed} representations of motion, in which the encoding of image plane velocity is implicit.

As a simple example of this idea, and in consideration of the errors in the flow vectors, we re-cast the traditional optical flow problem as a probabilistic one, modeling the measurement and constraint errors as random variables. The resulting framework produces {\em probability distributions} of optical flow, allowing proper handling of the uncertainties inherent in the optical flow computation and facilitating combination with information from other sources. We demonstrate the advantages of this probabilistic approach on a set of examples. In order to overcome the temporal aliasing commonly found in time-sampled imagery (e.g., video), we develop a probabilistic ``coarse-to-fine'' algorithm that functions much like a Kalman filter over scale. We implement an efficient version of this algorithm and show its success in computing Gaussian distributions of optical flow for both synthetic and real image sequences.
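
As an illustrative sketch (not the exact noise model developed in the thesis), the simplest Gaussian version of this idea combines the gradient constraints within a local patch, each corrupted by additive Gaussian noise, with a zero-mean Gaussian prior on velocity; the posterior over flow is then itself Gaussian, with mean and covariance available in closed form. The function name and parameters below are hypothetical.
\begin{verbatim}
import numpy as np

def gaussian_flow(Ix, Iy, It, sigma_n=1.0, sigma_p=10.0):
    """Posterior mean and covariance of image velocity in a local patch.

    Ix, Iy, It : arrays of spatial and temporal derivatives over the patch.
    sigma_n    : std. dev. of additive noise on each gradient constraint.
    sigma_p    : std. dev. of the zero-mean Gaussian prior on velocity.
    (Names and the simple additive-noise model are illustrative assumptions.)
    """
    G = np.stack([Ix.ravel(), Iy.ravel()], axis=1)    # constraints: G v = -It + noise
    b = -It.ravel()
    precision = G.T @ G / sigma_n**2 + np.eye(2) / sigma_p**2
    cov = np.linalg.inv(precision)                    # posterior covariance of v
    mean = cov @ (G.T @ b) / sigma_n**2               # posterior mean of v
    return mean, cov
\end{verbatim}
A compact covariance indicates a well-constrained patch, an elongated covariance signals the aperture problem, and a large covariance flags blank or unreliable regions---precisely the uncertainty information that a coarse-to-fine, Kalman-style propagation over scale can exploit.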

We then extend the notion of distributed representation to a generalized framework that is capable of representing multiple motions at a point. We develop an example representation through a series of modifications of the differential approach to optical flow estimation. We show that this example is capable of representing multiple motions at a single image location and we demonstrate its use near occlusion boundaries and on simple synthetic examples containing transparent objects.
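
One simple way to picture such a distributed code (a hypothetical sketch, not the representation actually constructed in the thesis) is to evaluate each pixel's gradient-constraint error over a sampled grid of candidate velocities and pool the responses over a patch; the velocity estimate is then implicit in the response pattern, which can be multimodal, with one mode per motion present, as near an occlusion boundary or within a transparent overlay.
\begin{verbatim}
import numpy as np

def velocity_responses(Ix, Iy, It, v_grid, sigma=1.0):
    """Distributed encoding of motion over a grid of candidate velocities.

    v_grid : (N, 2) array of candidate image velocities (vx, vy).
    Each pixel responds along the line of velocities consistent with its
    gradient constraint; pooling over the patch yields a response pattern
    that can have one mode per motion present (occlusion, transparency).
    (Grid sampling and the Gaussian error profile are illustrative choices.)
    """
    G = np.stack([Ix.ravel(), Iy.ravel()], axis=1)     # (P, 2) spatial gradients
    c = It.ravel()                                      # (P,) temporal derivatives
    resid = G @ v_grid.T + c[:, None]                   # constraint residuals, (P, N)
    per_pixel = np.exp(-resid**2 / (2.0 * sigma**2))    # ridge response per pixel
    return per_pixel.sum(axis=0)                        # pooled distributed response
\end{verbatim}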

Finally, we show that these distributed representations are effective as models for biological motion representation. We present qualitative comparisons of stages of the algorithm with neurons found in mammalian visual systems, suggesting experiments to test the validity of the model. We demonstrate that such a model can account quantitatively for a set of psychophysical data on the perception of moving sinusoidal plaid patterns.