# Kronos: Model-Based Head Tracking

### What It Does

Kronos is a system that tracks the rigid motion of heads in 3D from a
single 2D camera view. It accurately recovers the 3D translations and
rotations of the head (see the Ground Truth Sequence section below)
and remains stable over hundreds of frames.

### How It Does It

Kronos first automatically fits a 3D ellipsoid to the head in the
first frame of the sequence using the feature positions produced by
the modular eigenspaces work of Baback Moghaddam and Alex P. Pentland.

To compute the motion of the model from the current frame to the next,
Kronos first computes the optical flow between the two frames. The six
rigid parameters of the ellipsoid (three rotations, three translations)
are then perturbed about their current values. The "model flow" is
defined as the flow that results from moving the model from its current
parameters to a given set of perturbed parameters. A robust error norm
is used to compare this model flow with the measured optical flow. The
set of parameters with the (locally) smallest error is chosen as the
model parameters for the next frame. This process is then repeated
for each subsequent pair of frames.
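
The per-frame update described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the Kronos implementation: the pinhole projection and its `focal` value, the rotation order (z, then y, then x), the Geman-McClure form of the robust norm, the greedy one-parameter-at-a-time search, and all function names are hypothetical choices made for the example. It also assumes the measured optical flow has already been sampled at the projected model points and ignores visibility.

```python
import numpy as np

def rotation(alpha, beta, gamma):
    """Rotation by alpha about z, beta about y, gamma about x (assumed order)."""
    ca, sa = np.cos(alpha), np.sin(alpha)
    cb, sb = np.cos(beta), np.sin(beta)
    cg, sg = np.cos(gamma), np.sin(gamma)
    rz = np.array([[ca, -sa, 0.0], [sa, ca, 0.0], [0.0, 0.0, 1.0]])
    ry = np.array([[cb, 0.0, sb], [0.0, 1.0, 0.0], [-sb, 0.0, cb]])
    rx = np.array([[1.0, 0.0, 0.0], [0.0, cg, -sg], [0.0, sg, cg]])
    return rz @ ry @ rx

def project(points, params, focal=500.0):
    """Pose (N, 3) model points with the six rigid parameters, then project."""
    params = np.asarray(params, dtype=float)
    posed = points @ rotation(*params[:3]).T + params[3:]
    return focal * posed[:, :2] / posed[:, 2:3]  # assumed pinhole camera

def model_flow(points, current, candidate):
    """Image motion of each model point when moving from current to candidate."""
    return project(points, candidate) - project(points, current)

def robust_error(model, observed, sigma=1.0):
    """Geman-McClure error (an assumed choice of robust norm)."""
    r2 = np.sum((model - observed) ** 2, axis=1)
    return float(np.sum(r2 / (r2 + sigma ** 2)))

def track_step(points, current, observed, steps, sweeps=20):
    """Greedy local search: perturb one parameter at a time, keep improvements."""
    current = np.asarray(current, dtype=float)
    best = current.copy()
    best_err = robust_error(model_flow(points, current, best), observed)
    for _ in range(sweeps):
        improved = False
        for i in range(6):
            for sign in (1.0, -1.0):
                trial = best.copy()
                trial[i] += sign * steps[i]
                err = robust_error(model_flow(points, current, trial), observed)
                if err < best_err:
                    best, best_err, improved = trial, err, True
        if not improved:
            break  # local minimum at this step size
    return best, best_err
```

A caller would pass sampled ellipsoid surface points as `points`, the previous frame's parameters as `current`, and the measured flow at those points as `observed`; the returned parameters then become `current` for the next pair of frames.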

The images at the top of this page show several key frames from a
hundred-frame sequence. The first row shows the original input
frames. The second row shows the ellipsoid model, posed with the
current estimated parameters, superimposed on the original frames.

### Who Developed It

This research was done by Sumit Basu, Irfan A. Essa, and Alex P. Pentland.

### Acknowledgements

This material is based upon work supported in part by a National
Science Foundation Graduate Fellowship. We also gratefully
acknowledge our corporate sponsors, especially British Telecom, which
has worked closely with us on parts of this project in terms of both
research and funding.

### Where To Get More Information

Download Vismod Technical Report #362, *Motion Regularization for
Model-based Head-Tracking*, from our
tech-reports page.

### Ground Truth Sequence

To demonstrate the accuracy of our system, we generated a sequence for
which the rigid parameters of the head were known exactly for each
frame. Using computer animation, a texture-mapped head was moved
around on a real background. This head was then tracked using the
ellipsoidal model. The head was also tracked using a 2D planar model
to demonstrate the advantages of the full 3D model. As you can see
in the sequences, while the point-to-point correspondence (i.e., a
point on the mesh to a point on the face) is good for both models, the
3D parameters are much more accurate for the ellipsoidal model (see
the plots). Note the angles (alpha, beta, and gamma) in particular.

Click on the following to see the corresponding QuickTime MOV movies:

- the original synthetic sequence
- tracking with the ellipsoidal model
- tracking with the planar model

Click here to see plots of each of the
following rigid parameter values for the original sequence, the
ellipsoidal model, and the planar model:

- alpha (rotation about the object's z axis)
- beta (rotation about the object's y axis)
- gamma (rotation about the object's x axis)
- x (translation along the global x axis)
- y (translation along the global y axis)
- z (translation along the global z axis)
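
One way to put a number on such a comparison is a per-parameter RMS error between the known and the tracked trajectories. The sketch below is an illustration only: the function names, the `(frames, 6)` array layout in the order listed above, and the use of radians for the angles are assumptions, and no values from the actual sequences are reproduced here.

```python
import numpy as np

PARAM_NAMES = ("alpha", "beta", "gamma", "x", "y", "z")

def wrap_angle(a):
    """Wrap angles in radians to the interval [-pi, pi)."""
    return (np.asarray(a, dtype=float) + np.pi) % (2 * np.pi) - np.pi

def rms_errors(truth, tracked):
    """Per-parameter RMS error between two (frames, 6) trajectories."""
    truth = np.asarray(truth, dtype=float)
    tracked = np.asarray(tracked, dtype=float)
    diff = tracked - truth
    # Angle differences are wrapped so that 2*pi offsets do not count as error.
    diff[:, :3] = wrap_angle(diff[:, :3])
    rms = np.sqrt(np.mean(diff ** 2, axis=0))
    return dict(zip(PARAM_NAMES, rms))
```

Calling `rms_errors` once with the ellipsoidal-model trajectory and once with the planar-model trajectory would give two dictionaries that can be compared parameter by parameter.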

### Real Sequences

Click on the following images to see QuickTime MOV movies of the
corresponding sequences (with the ellipsoidal model superimposed on
each frame).

- Sumit: 30 FPS, HandyCam, 320x240 images
- Irfan: 30 FPS, HandyCam, 320x240 images
- Chris: 30 FPS, HandyCam, 320x240 images
- Yaser: 15 FPS, unknown camera type, 280x210 images
- Sumit: 5 FPS, IndyCam, 90x90 images
