Photobook

Photobook is a tool for querying image databases by image content. It works by comparing features associated with images, not the images themselves. These features are in turn the parameter values of particular models fitted to each image. The models are commonly color, texture, and shape, though Photobook will work with features from any model. Features are compared using one of a library of matching algorithms that Photobook provides. In version 5, these include Euclidean, Mahalanobis, divergence, vector space angle, histogram, Fourier peak, and wavelet tree distances, as well as any linear combination of these. Version 6 allows user-defined matching algorithms via dynamic code loading.
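To make the idea of feature matching concrete, here is a minimal sketch in Python of two of the distances named above (Euclidean and vector space angle), a linear combination of them, and a ranking of a small database by distance to a query. This is not Photobook's code or API: the function names, the weights, and the three-element feature vectors are assumptions chosen for illustration.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def vector_angle(a, b):
    """Angle in radians between two nonzero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # Clamp so rounding error cannot push the cosine outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norm)))

def combine(weighted_metrics):
    """Build a metric that is a weighted sum (linear combination) of others."""
    def combined(a, b):
        return sum(w * metric(a, b) for w, metric in weighted_metrics)
    return combined

def rank(query, database, metric):
    """Return image names sorted from most to least similar to the query."""
    return sorted(database, key=lambda name: metric(query, database[name]))

# Hypothetical precomputed features, one vector per image (assumed values).
features = {
    "brick": [1.0, 0.0, 0.0],
    "bark": [0.9, 0.1, 0.0],
    "water": [0.0, 0.0, 1.0],
}

# Assumed weights: equal parts Euclidean distance and vector angle.
metric = combine([(0.5, euclidean), (0.5, vector_angle)])
ranking = rank(features["brick"], features, metric)
print(ranking)
```

The point of the sketch is that the query never touches pixel data: only the stored feature vectors are compared, so swapping in a different metric or a different weighting changes the ranking without recomputing any features.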

Since no image model is optimal for all tasks, and it is rarely clear which models are appropriate for a given task, Photobook includes FourEyes, an interactive learning agent which selects and combines models based on examples from the user. This makes Photobook different from tools like QBIC, Virage, SWIM, and CORE, which all support search on various features but offer little assistance in actually choosing one for a given task. FourEyes, by contrast, lets users express their intent directly by giving examples.

Example uses of Photobook:

Photobook is available free by FTP. It runs under the UNIX and Linux operating systems. The distribution contains very little feature extraction code; you have to provide that yourself. In particular, no face recognition code is provided. For face recognition code, you should look in /pub/eigenfaces/ or get face-recognition.tar.Z. Send mail to Alex Pentland for more face recognition info. If you are looking for heavy-duty image retrieval software that runs out of the box, check out the commercial products listed in the Northumbria report.

Publications (also see FourEyes):

Photobook: Tools for Content-Based Manipulation of Image Databases
A. Pentland, R. Picard, and S. Sclaroff,
SPIE Storage and Retrieval of Image & Video Databases II, Feb 1994
TR #255

The Web demo is no longer available, since I am no longer at MIT to maintain it.

Demo databases:

VisTex
A photograph collection of 365 natural textures, cut into blocks of four. Histogram: Euclidean distance between 3*256 color bins. The images were transformed into the Ohta color space (SVD of color cube) beforehand. SAR: Mahalanobis distance between 15 2D noncausal autoregressive parameters, estimated via least-squares. The per-image covariance matrices were estimated from the parameters of multiple, overlapping windows.
Faces
Normalized, front-on photos of 7561 people. EV: Euclidean distance between the top 20 eigenface parameters, estimated via KLT from 100 randomly chosen training images.
FERET
US Army FERET Database of 310 people. Norm-ev: automatic face processing.
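
The metrics used by the demo databases above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind the demos: pixels are taken to be triples of integers in 0..255 (the Ohta transform is not performed here), a single covariance matrix is passed in by the caller (how the per-image covariances are paired with a query is not specified above), and the eigenface mean and basis are assumed to have been estimated already. All function names are hypothetical.

```python
import math

def color_histogram(pixels):
    """Concatenate three 256-bin channel histograms into one 768-bin vector."""
    hist = [0] * (3 * 256)
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            hist[channel * 256 + value] += 1
    return hist

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in reversed(range(n)):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x

def mahalanobis(a, b, covariance):
    """sqrt((a - b)^T C^-1 (a - b)) for an invertible covariance matrix C."""
    diff = [x - y for x, y in zip(a, b)]
    weighted = solve(covariance, diff)  # C^-1 * diff without forming C^-1
    return math.sqrt(sum(d * w for d, w in zip(diff, weighted)))

def project(image, mean, basis):
    """Eigenface-style parameters: centered image dotted with each basis vector."""
    centered = [p - m for p, m in zip(image, mean)]
    return [sum(c * b for c, b in zip(centered, vec)) for vec in basis]

# Tiny made-up inputs, just to exercise each function.
image_a = [(0, 0, 0), (255, 255, 255)]
image_b = [(0, 0, 0), (0, 0, 0)]
hist_distance = euclidean(color_histogram(image_a), color_histogram(image_b))
sar_distance = mahalanobis([2.0, 0.0], [0.0, 0.0], [[4.0, 0.0], [0.0, 1.0]])
params = project([3.0, 4.0], [1.0, 1.0], [[1.0, 0.0], [0.0, 1.0]])
print(hist_distance, sar_distance, params)
```

With an identity covariance matrix the Mahalanobis distance reduces to the Euclidean distance; a larger variance along one parameter shrinks the contribution of differences in that parameter, which is why it suits features whose components vary on different scales.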

Other retrieval pages:


Thomas P. Minka
Last modified: Thu Mar 14 13:35:54 EST 2002