TR#477: Tools for Browsing a TV Situation Comedy Based on Content Specific Attributes

Joshua S. Wachman and Rosalind W. Picard

Journal of MultiMedia Tools and Applications

This paper presents general-purpose video analysis and annotation tools, which combine high-level and low-level information, and which learn through user interaction and feedback. The use of these tools is illustrated through the construction of two video browsers, which allow a user to fast forward (or rewind) to frames, shots, or scenes containing one or more particular characters or other labeled content. The two browsers developed in this work are: (1) a basic video browser, which exploits relations between high-level scripting information and closed captions, and (2) an advanced video browser, which augments the basic browser with annotations gained from applying machine learning. The learner helps the system adapt to different people's labelings by accepting positive and negative examples of labeled content from a user, and relating these to low-level color and texture features extracted from the digitized video. This learning happens interactively, and is used to infer labels on data the user has not yet seen. The labeled data may then be browsed or retrieved from the database in real time. An evaluation of the learning performance shows that a combination of low-level color signal features outperforms several other combinations of signal features in learning character labels in an episode of the TV situation comedy, SEINFELD. We discuss several issues that arise in the combination of low-level and high-level information, and illustrate solutions to these issues within the context of browsing television sitcoms.
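
As a rough illustration of the idea of learning labels from positive and negative examples tied to low-level color features, the sketch below labels an unseen image region by comparing its coarse color histogram to user-supplied examples. It is not the method used in the paper: the histogram-intersection similarity, the nearest-example decision rule, and all names (`color_histogram`, `histogram_intersection`, `infer_label`) are assumptions chosen only to keep the example small, and the pixel data is synthetic.

```python
from collections import Counter


def color_histogram(pixels, bins=4):
    """Coarse, normalized RGB histogram of a region.

    `pixels` is a non-empty list of (r, g, b) tuples with values in 0..255.
    Hypothetical helper for illustration only.
    """
    if not pixels:
        raise ValueError("region must contain at least one pixel")
    step = 256 // bins
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {cell: n / total for cell, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means the two histograms are identical."""
    return sum(min(v, h2.get(cell, 0.0)) for cell, v in h1.items())


def infer_label(query, positives, negatives):
    """Return True if `query` resembles its best positive example more
    than its best negative example (assumed decision rule).

    `positives` must contain at least one histogram.
    """
    best_pos = max(histogram_intersection(query, p) for p in positives)
    best_neg = max(
        (histogram_intersection(query, n) for n in negatives), default=0.0
    )
    return best_pos > best_neg


# Synthetic example: the user marks a mostly red region as a positive
# example of some label and a mostly blue region as a negative example.
positive = color_histogram([(200, 10, 10)] * 10)
negative = color_histogram([(10, 10, 200)] * 10)

# An unseen region that is 80% red and 20% blue.
unseen = color_histogram([(220, 30, 20)] * 8 + [(10, 10, 200)] * 2)

print(infer_label(unseen, [positive], [negative]))
```

In an interactive setting, each new positive or negative example from the user would simply be appended to the corresponding list before labels are inferred again for the remaining, unseen regions.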

NOTE: OUR LICENSE AGREEMENT TO USE IMAGERY FROM THE TV SHOW SEINFELD IS RESTRICTED. THE IMAGES IN THE ON-LINE VERSION OF THIS PAPER HAVE BEEN INTENTIONALLY DISTORTED. IF YOU WOULD LIKE A FULL VERSION OF THE PAPER WITH UNDISTORTED IMAGES, PLEASE REFER TO THE ARTICLE AS PUBLISHED IN THE JOURNAL OR SEND AN EMAIL REQUEST TO THE LAB.

[keywords: computer assisted learning, video pattern recognition, video annotation, SOCIETY OF MODELS, FOUREYES, content-based retrieval.]

