TR#360: A Society of Models for Video and Image Libraries

Rosalind W. Picard

Submitted to IBM Systems Journal

The average person with a computer will soon have access to the world's collections of digital video and images. However, unlike text, which can be alphabetized, or numbers, which can be ordered, images and video have no general language to aid in their organization. Although tools that can ``see'' and ``understand'' the content of imagery are still in their infancy, they are now at the point where they can provide substantial assistance to users in navigating through visual media. This paper describes new tools based on ``vision texture'' for modeling images and video. The focus of this research is the use of a society of low-level models for performing relatively high-level tasks, such as retrieval and annotation of image and video libraries. This paper surveys our recent and present research in this fast-growing area.
