By projecting a new mug-shot into the span of these eigenvectors, we can compute its coefficients in this new basis as well as the residual error. This representation also lets us approximate its distance to the training set of faces (distance to face-space, or DFFS), i.e. how 'face-like' it is [8]. The training set of faces is mapped into this eigenspace and the distribution of the coefficients and residuals is modeled as a Gaussian density. The likelihood that a new data point fits this model is then evaluated under this Gaussian. This gives us a measure of the 'faceness' of a given mug-shot (or, conversely, of an image with 4 anchor points as it is warped into a mug-shot).
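The projection and scoring described above can be sketched as follows. This is a minimal illustration, not the implementation used here: the function names, the choice of NumPy, and the particular form of the Gaussian score (in-space Mahalanobis term plus the residual scaled by the average discarded eigenvalue `rho`, with constant terms dropped) are assumptions made for the example.

```python
import numpy as np

def fit_face_space(train, k):
    """Fit a k-dimensional eigenspace to vectorized mug-shots.

    train: (n_samples, n_pixels) array, one mug-shot per row.
    Returns the mean face, the top-k eigenvectors (as rows), their
    eigenvalues, and rho, the mean of the discarded eigenvalues
    (an assumed model for the residual variance).
    """
    mean = train.mean(axis=0)
    centered = train - mean
    # Rows of vt are the eigenvectors of the sample covariance matrix.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    eigvals = s ** 2 / (train.shape[0] - 1)
    rest = eigvals[k:]
    rho = float(rest.mean()) if rest.size else 1e-12
    return mean, vt[:k], eigvals[:k], rho

def project(x, mean, basis):
    """Return the eigenspace coefficients of x and its squared DFFS."""
    d = x - mean
    coeffs = basis @ d
    residual = d - basis.T @ coeffs      # part of x outside face-space
    return coeffs, float(residual @ residual)

def log_faceness(x, mean, basis, eigvals, rho):
    """Gaussian log-likelihood of x up to an additive constant."""
    coeffs, dffs = project(x, mean, basis)
    in_space = float(np.sum(coeffs ** 2 / eigvals))
    return -0.5 * (in_space + dffs / rho)
```

A higher `log_faceness` value means the mug-shot is more face-like under this assumed model; comparing values across candidate normalizations only requires the terms kept here.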
Now refer to Figure 7(a). Up to this point, detection should have recovered the eyes, the mouth and the vertical height of the nose. However, the exact horizontal position of the nose on the face is still uncertain. We therefore attempt 12 different normalizations and K-L projections along the horizontal line across the bottom of the nose. The 12 candidate nose anchor points along this line generate 12 normalized mug-shots and their 'distances to face-space' (DFFS). These are shown as we test for a nose at each point on the horizontal line (Figure 8). Face 0 is generated by setting the nose anchor point all the way to the left of the nose-bottom line and Face 12 is generated by the anchor point at the right tip of the line. The normalized face vector with the highest 'faceness' probability (i.e. minimal DFFS) corresponds to the best possible nose localization.
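The search over candidate nose positions can be sketched as below. The helper names are hypothetical, and `normalize` (the warp of the image into a mug-shot given the anchor points) and `dffs` (the distance-to-face-space score) are placeholders supplied by the caller rather than functions defined in this paper.

```python
import numpy as np

def candidate_nose_points(x_left, x_right, y, n):
    """Evenly spaced candidate anchors along the nose-bottom line."""
    xs = np.linspace(x_left, x_right, n)
    return [(float(x), float(y)) for x in xs]

def best_nose_anchor(image, other_anchors, candidates, normalize, dffs):
    """Pick the candidate whose normalized mug-shot has minimal DFFS.

    normalize(image, other_anchors, nose) -> mug-shot  (assumed callable)
    dffs(mugshot) -> float                             (assumed callable)
    Returns the winning anchor and the score of every candidate.
    """
    scores = [dffs(normalize(image, other_anchors, nose))
              for nose in candidates]
    best = int(np.argmin(scores))
    return candidates[best], scores
```

The number of candidates `n` is left as a parameter so the same routine can be run with a coarser or finer sampling of the line.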
The final positions of the eyes, nose and mouth are shown in Figure 7(b). If time is not critical, we suggest using search or optimization techniques to refine these positions by searching locally for the 3D normalization that minimizes the distance to face-space.
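One simple way to carry out such a local refinement is a greedy coordinate search over the anchor positions, sketched below. This is only an illustration of the idea: the paper does not specify a search method, and `cost` is a hypothetical callable that would normalize the image with the given anchors and return the resulting distance to face-space.

```python
import numpy as np

def refine_anchors(anchors, cost, step=1.0, max_iter=100):
    """Greedy local search that perturbs one coordinate at a time.

    anchors: (n_points, 2) array of initial anchor positions.
    cost(anchors) -> float, lower is better (assumed callable).
    Stops when a full sweep yields no improvement or after max_iter sweeps.
    """
    best = np.asarray(anchors, dtype=float).copy()
    best_cost = cost(best)
    for _ in range(max_iter):
        improved = False
        for idx in np.ndindex(best.shape):
            for delta in (-step, step):
                trial = best.copy()
                trial[idx] += delta
                c = cost(trial)
                if c < best_cost:        # keep only strict improvements
                    best, best_cost, improved = trial, c, True
        if not improved:
            break
    return best, best_cost
```

A search of this kind only finds a local minimum near the detected positions, which matches its role as an optional refinement step.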
The time required for detecting facial feature points is of the order of 1 second. Having found a face and facial feature points that meet a threshold on our 'faceness' measure, we can initialize the tracking system appropriately. Note that if the face detector were slower than 0.5 to 1 Hz, the tracking could not be initialized properly, because the face would probably have moved away from the detected location while the detection was being computed.