Real-time tracking was tested on the live video sequence shown in Figure 11. Roughly 2000 frames were tracked without feature loss (over one minute of real-time tracking). The filtered tracking windows are shown projected onto the face, and the normalized mug-shot (after 3D warping and illumination correction) is shown at the bottom of Figure 11.
As can be seen, the subject undergoes large in-plane and out-of-plane rotations about all axes as well as partial occlusion (in frame 827). Out-of-plane rotations of over 45 degrees are tolerated without feature loss. Even though almost half of the correlation-based trackers may be occluded under large out-of-plane rotations, the global EKF filtering maintains tracking using the remaining visible features. Unless the motion is very jerky or the out-of-plane rotation is extreme, the system maintains tracking and does not exhibit instability. The system has been tested on multiple subjects from live video streams, and tracking performance is consistent.
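This occlusion handling can be illustrated with a minimal sketch: a standard EKF correction step restricted to the trackers whose normalized correlation score indicates they are still visible. The tracker structure, the score threshold, and the function names below are assumptions for illustration, not the system's actual implementation.

```python
import numpy as np

def ekf_update_visible(x, P, trackers, R_feature, corr_thresh=0.7):
    """EKF correction using only the feature trackers that appear visible.

    x, P        : current state estimate and covariance
    trackers    : list of dicts with keys 'z' (2D measurement), 'h' (callable
                  predicting the measurement from x), 'H' (callable returning
                  the 2xN measurement Jacobian at x), and 'score' (normalized
                  correlation of the tracking window)
    R_feature   : 2x2 measurement noise covariance for a single feature
    corr_thresh : correlation score below which a feature is treated as
                  occluded and skipped (illustrative value)
    """
    visible = [t for t in trackers if t['score'] >= corr_thresh]
    if not visible:
        return x, P  # no usable measurements; rely on the prediction alone

    # Stack measurements, predicted measurements, and Jacobians
    # for the visible subset of features only.
    z = np.concatenate([t['z'] for t in visible])
    h = np.concatenate([t['h'](x) for t in visible])
    H = np.vstack([t['H'](x) for t in visible])
    R = np.kron(np.eye(len(visible)), R_feature)  # block-diagonal noise

    # Standard EKF correction restricted to the visible features.
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - h)
    P = (np.eye(P.shape[0]) - K @ H) @ P
    return x, P
```

Because the state (pose and structure) is shared across all features, the features that remain visible continue to constrain the estimate while the occluded ones are simply left out of the update.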
Figure 12(a) displays the typical residual correlation error of a tracking window. This noisy behaviour is filtered, however, and a stable estimate of the depth structure is obtained, as shown in Figure 12(b). The EKF converges quickly to the true underlying 3D geometry despite the noisy feature tracking. We also measured the SSD residual between the initial mug-shot (at frame 0) and the current normalized face. Figure 12(d) displays the DFFS value over the sequence, which is used as a cue to stop tracking (when the DFFS is too large). In this sequence, the threshold was set to a generous value of 0.5, and face detection was not re-invoked since tracking did not fail. However, if the DFFS value were to exceed 0.5, tracking would stop and detection would search for a new face.
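As a rough illustration of this stopping criterion, the sketch below compares a residual between the initial normalized mug-shot and the current normalized face against the threshold. The function name, the use of a mean-squared (SSD-style) residual as a stand-in for the DFFS measure, and the image normalization are assumptions, not the paper's exact computation.

```python
import numpy as np

def tracking_has_failed(mugshot0, normalized_face, dffs_threshold=0.5):
    """Return True when the residual between the initial normalized mug-shot
    (frame 0) and the current normalized face exceeds the threshold,
    signalling that tracking should stop and face detection should re-run.

    Both images are assumed to be equally sized float arrays scaled to [0, 1];
    the 0.5 threshold mirrors the generous value used for this sequence.
    """
    diff = mugshot0.astype(np.float64) - normalized_face.astype(np.float64)
    residual = np.mean(diff ** 2)  # SSD-style residual, normalized by pixel count
    return residual > dffs_threshold
```

In the tracking loop, a True result would gate a fall-back to the face detector rather than continuing to track the drifted features.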