TR#311: Modal Matching: A Method for Describing, Comparing and Manipulating Digital Signals

Stan Sclaroff

Media Lab Ph.D. Dissertation, submitted January 1995.

Thesis Advisor: Alex Pentland

Thesis Committee: Tomaso Poggio, Whitman Richards, and Andrew Witkin

This thesis introduces modal matching, a physically motivated method for establishing correspondences and computing canonical shape descriptions. The method describes objects in terms of generalized symmetries, as defined by each object's eigenmodes. The resulting modal description is used for object recognition and categorization, where the similarity of two shapes is expressed as the amount of modal deformation energy needed to align them. Modal matching is also used in a physically motivated linear-combinations-of-models paradigm, where the computer synthesizes a shape as a weighted combination of modally deformed prototype shapes. In general, modes provide a global-to-local ordering of shape deformation and thus make it possible to select the types of deformation used in object alignment and comparison.

In contrast to previous techniques, which required correspondence to be computed with an initial or prototype shape, modal matching uses a new type of finite element formulation that allows an object's eigenmodes to be computed directly from available shape information. This improved formulation provides greater generality and accuracy, and is applicable to data of any dimensionality. Correspondence results with 2-D contour and point feature data are shown. Recognition experiments for image databases are described, in which a user selects example images and the computer then efficiently sorts the set of images by shape similarity.
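As a rough illustration of computing modes directly from point data and matching in mode space, the sketch below may help. It is not the thesis's finite element formulation: it swaps in the eigenvectors of a plain Gaussian proximity matrix for the finite element stiffness and mass matrices. The names `mode_features` and `match`, the parameters `sigma` and `k`, and the sample points are all assumptions made for this example.

```python
import numpy as np

def mode_features(points, sigma=2.0, k=3):
    """Per-point coordinates in the space of the k leading eigenvectors.

    Simplified stand-in: eigenvectors of a Gaussian proximity matrix,
    not the finite element eigenmodes described in the abstract.
    """
    pts = np.asarray(points, dtype=float)
    # Pairwise squared distances; unchanged by rotation and translation.
    d2 = np.sum((pts[:, None, :] - pts[None, :, :]) ** 2, axis=-1)
    H = np.exp(-d2 / (2.0 * sigma ** 2))       # symmetric proximity matrix
    _, vecs = np.linalg.eigh(H)                # eigenvalues in ascending order
    vecs = vecs[:, ::-1][:, :k]                # keep the k largest-eigenvalue vectors
    # Eigenvectors are defined only up to sign; make the
    # largest-magnitude entry of each one positive.
    idx = np.argmax(np.abs(vecs), axis=0)
    signs = np.sign(vecs[idx, np.arange(vecs.shape[1])])
    return vecs * signs

def match(points_a, points_b, sigma=2.0, k=3):
    """For each point of A, index of the nearest point of B in mode space."""
    fa = mode_features(points_a, sigma, k)
    fb = mode_features(points_b, sigma, k)
    dist = np.linalg.norm(fa[:, None, :] - fb[None, :, :], axis=-1)
    return np.argmin(dist, axis=1)

# Hypothetical data: an irregular 2-D point set and a rotated,
# translated, re-ordered copy of it.
shape_a = np.array([[0.0, 0.0], [4.0, 0.0], [4.5, 1.0],
                    [1.5, 1.2], [1.0, 3.0], [-0.3, 2.5]])
theta = 0.4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
perm = np.array([3, 0, 5, 1, 4, 2])
shape_b = shape_a[perm] @ R.T + np.array([2.0, -1.0])

correspondence = match(shape_a, shape_b)
print(correspondence)
```

Because the proximity matrix depends only on inter-point distances, the mode-space coordinates in this sketch do not change under rigid motion. Handling real deformation, symmetric shapes, or differing point counts needs more care than this example takes.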

While the primary focus of this thesis is matching shapes in 2-D images, the underlying shape representation is quite general and can be applied to compare signals in other modalities or in higher dimensions, for instance sounds or scientific measurement data.