TR#349: Interactive learning using a "society of models"

Thomas P. Minka and Rosalind W. Picard

Special issue of Pattern Recognition on Image Databases, 30(4), 1997 (won Best Paper Award)
IEEE Conference on Computer Vision and Pattern Recognition (CVPR'96), pp. 447-452 (reduced version)

Digital library access is driven by features, but features are often context-dependent and noisy, and their relevance for a query is not always obvious. This paper describes an approach for utilizing many data-dependent, user-dependent, and task-dependent features in a semi-automated tool. Instead of requiring universal similarity measures or manual selection of relevant features, the approach provides a learning algorithm for selecting and combining groupings of the data, where groupings can be induced by highly specialized and context-dependent features. The selection process is guided by a rich example-based interaction with the user. The inherent combinatorics of using multiple features is reduced by a multistage grouping generation, weighting, and collection process. The stages closest to the user are trained fastest and slowly propagate their adaptations back to earlier stages. The weighting stage adapts the collection stage's search space across uses, so that, in later interactions, good groupings are found given few examples from the user. The paper also describes an interactive-time implementation of this architecture for semi-automatic within-image segmentation and across-image labeling, driven by concurrently active color models, texture models, or manually provided groupings.
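
As a rough illustration of the collection and weighting stages, the following is a minimal sketch and not the paper's actual algorithm. It assumes that groupings are plain sets of item ids, that collection is a greedy covering of the user's positive examples by groupings containing no negative example, and that weighting is a simple additive boost. The names `collect`, `reweight`, and `rate` are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set

# Assumption: a grouping is a set of item ids (e.g. image patches).
Grouping = FrozenSet[int]


def collect(groupings: Dict[str, Grouping],
            weights: Dict[str, float],
            positives: Set[int],
            negatives: Set[int]) -> List[str]:
    """Greedily choose groupings that cover the positive examples
    and contain none of the negative examples."""
    # Only groupings consistent with the negative examples are candidates.
    candidates = {name: g for name, g in groupings.items()
                  if not (g & negatives)}
    uncovered = set(positives)
    chosen: List[str] = []
    while uncovered:
        # Score = weight * number of still-uncovered positives explained.
        best = max(
            candidates,
            key=lambda n: weights.get(n, 1.0) * len(candidates[n] & uncovered),
            default=None,
        )
        if best is None or not (candidates[best] & uncovered):
            break  # no remaining grouping explains any uncovered positive
        chosen.append(best)
        uncovered -= candidates[best]
        del candidates[best]
    return chosen


def reweight(weights: Dict[str, float],
             chosen: List[str],
             rate: float = 0.1) -> Dict[str, float]:
    """Hypothetical weighting step: boost the chosen groupings so that
    later sessions prefer them when few examples are available."""
    updated = dict(weights)
    for name in chosen:
        updated[name] = updated.get(name, 1.0) + rate
    return updated


# Toy data: three groupings, as if produced by different feature models.
groupings = {
    "color_a": frozenset({1, 2, 3}),
    "texture_b": frozenset({3, 4, 9}),
    "color_c": frozenset({4, 5}),
}
weights = {name: 1.0 for name in groupings}
chosen = collect(groupings, weights, positives={1, 2, 4}, negatives={9})
labeled = set().union(*(groupings[n] for n in chosen))
new_weights = reweight(weights, chosen)
```

In this toy run, `texture_b` is excluded because it contains the negative example 9, and the union of the chosen groupings becomes the label proposed to the user. The multiplicative score is one possible design choice: it lets weights learned in earlier sessions break ties when the user has supplied only a few examples.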
