
Eigenspace Distance Measures on 3D Warped Faces

A database of colour face images was collected and, for each image, the locations of the facial features were manually identified. These loci were then used to generate normalized mug-shots as explained above. In addition, the loci were perturbed with random spatial noise to generate multiple mug-shots of each face with slightly misaligned feature locations, making the eigenspace slightly less sensitive to precise feature localization. A colour eigenspace of these normalized mug-shots was constructed; the mean face and the first 4 eigenfaces are shown in Figure 6.


  
Figure 6: The Mean Face and the First 4 Eigenfaces
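The following is a minimal sketch of this jittering and eigenspace construction, assuming numpy and a hypothetical warp_to_mugshot(image, loci) routine that performs the 3D normalization described in the previous section; all names and parameters are illustrative.

import numpy as np

def jittered_mugshots(image, loci, n_jitter=10, sigma=1.5, rng=None):
    """Warp one face into several mug-shots with perturbed feature loci."""
    rng = rng if rng is not None else np.random.default_rng(0)
    shots = [warp_to_mugshot(image, loci)]  # warp_to_mugshot is hypothetical
    for _ in range(n_jitter):
        noisy = loci + rng.normal(0.0, sigma, loci.shape)  # pixel-level noise
        shots.append(warp_to_mugshot(image, noisy))
    return shots

def build_eigenspace(mugshots, k=4):
    """Mean face plus the top k eigenfaces of the normalized mug-shots."""
    X = np.stack([m.ravel() for m in mugshots]).astype(float)
    mean_face = X.mean(axis=0)
    U, S, Vt = np.linalg.svd(X - mean_face, full_matrices=False)
    return mean_face, Vt[:k]  # eigenfaces as rows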

By projecting a new mug-shot into the span of these eigenvectors, we compute its coefficients in this basis as well as the residual error. From this representation we also approximate its distance to the training set of faces (the distance from face-space, or DFFS), i.e. how 'face-like' it is [8]. The training set of faces is mapped into the eigenspace and the distribution of the coefficients and residuals is modeled as a Gaussian density. The likelihood of a new data point under this Gaussian gives a measure of the 'faceness' of a given mug-shot (or, equivalently, of an image with 4 anchor points once it is warped into a mug-shot).
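A minimal sketch of the projection, the residual (DFFS) and a Gaussian 'faceness' score follows, reusing mean_face and the eigenface rows V from the sketch above; for brevity it models only the coefficients as Gaussian, whereas the text models coefficients and residuals jointly.

def project(mugshot, mean_face, V):
    """Coefficients in face-space and the distance from face-space."""
    x = mugshot.ravel().astype(float) - mean_face
    coeffs = V @ x                    # coordinates in the eigenface basis
    residual = x - V.T @ coeffs       # component outside face-space
    return coeffs, np.linalg.norm(residual)  # (coefficients, DFFS)

def fit_gaussian(train_coeffs):
    """Diagonal Gaussian over the training-set coefficients."""
    mu = train_coeffs.mean(axis=0)
    var = train_coeffs.var(axis=0) + 1e-9  # guard against zero variance
    return mu, var

def log_faceness(coeffs, mu, var):
    """Log-likelihood of the coefficients under the Gaussian model."""
    return -0.5 * np.sum((coeffs - mu) ** 2 / var + np.log(2 * np.pi * var))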


  
Figure 7: (a) Initial Localization; (b) Final Localization

Now, refer to Figure 7(a). At this stage, detection has recovered the eyes, the mouth and the vertical height of the nose; however, the exact horizontal position of the nose on the face is still uncertain. Thus, we attempt 12 different normalizations and K-L projections along the horizontal line across the bottom of the nose. The 12 candidate nose anchor points along this line generate 12 normalized mug-shots and their corresponding distances from face-space, shown in Figure 8 as a nose is tested at each point on the horizontal line. Face 0 is generated by setting the nose anchor point at the far left of the nose-bottom line and Face 12 by setting it at the right tip of the line. The normalized face vector with the highest 'faceness' probability corresponds to the best possible nose localization (i.e. minimal DFFS).


  
Figure 8: The 3D Normalized Faces for Various Trial Nose Positions and their Corresponding DFFS
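The horizontal search can be sketched as follows, reusing warp_to_mugshot and project from above; the anchor layout (three known loci plus one trial nose point) is an assumption made for illustration.

def localize_nose(image, known_loci, y_nose, x_left, x_right,
                  mean_face, V, n_candidates=12):
    """Pick the trial nose position whose mug-shot has minimal DFFS."""
    best_x, best_dffs = None, np.inf
    for x in np.linspace(x_left, x_right, n_candidates):
        loci = np.vstack([known_loci, [x, y_nose]])  # 4 anchor points
        _, dffs = project(warp_to_mugshot(image, loci), mean_face, V)
        if dffs < best_dffs:
            best_x, best_dffs = x, dffs
    return best_x, best_dffs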

The final positions of the eyes, nose and mouth are shown in Figure 7(b). If time is not critical, we suggest using search or optimization techniques to refine these positions by searching locally for the 3D normalization that minimizes the distance from face-space.
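One way such a refinement might look, again only a sketch and not the implementation used here: treat the four anchor points as an 8-dimensional vector and minimize the DFFS with a derivative-free local optimizer.

from scipy.optimize import minimize

def refine_loci(image, loci0, mean_face, V):
    """Locally adjust the 4 anchor points to minimize DFFS."""
    def dffs_of(flat):
        shot = warp_to_mugshot(image, flat.reshape(4, 2))  # hypothetical warp
        return project(shot, mean_face, V)[1]
    res = minimize(dffs_of, loci0.ravel(), method="Nelder-Mead")
    return res.x.reshape(4, 2)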

The time required for detecting facial feature points is of the order of 1 second. Having found a face and facial feature points that meet a threshold on our 'faceness' measure, we can initialize the tracking system appropriately. Note that if the face detector ran slower than 0.5 to 1 Hz, the tracking could not be initialized properly, because the face would probably have moved away from the detected location by the time the detection was computed.

