Human skin forms a dense manifold in color space, which makes it an easy feature to detect in images [10]. We obtain training samples of skin from images of several individuals with varying skin tones under varying illumination conditions. Each pixel in this sample set is a three-element vector, [R G B]. We cluster these pixels using Expectation Maximization (EM) to fit a probability distribution model for skin colors. The model is a mixture of Gaussians, and cross-validation is used to determine the appropriate number of Gaussians for the EM algorithm. The probability distribution model we used is shown in Figure 2 and is described by Equation 1, whose argument is an (R,G,B) vector.
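For reference, a Gaussian mixture density over RGB vectors has the general form sketched below. The symbols x (an RGB vector), K (number of Gaussians), pi_k (mixing weights), mu_k (means), and Sigma_k (covariance matrices) are notation assumed here for illustration rather than taken from Equation 1, and no fitted parameter values are implied.

```latex
p(\mathbf{x}) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k),
\qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1,

\mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}, \boldsymbol{\Sigma})
  = \frac{1}{(2\pi)^{3/2} \, |\boldsymbol{\Sigma}|^{1/2}}
    \exp\!\left( -\tfrac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^{\top}
    \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu}) \right)
```

The exponent 3/2 in the normalizing constant reflects the three color channels. EM alternates between assigning each training pixel a soft responsibility under each Gaussian and re-estimating the weights, means, and covariances from those responsibilities.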
When a new image is acquired, the likelihood of each pixel is evaluated under this model, and any pixel whose likelihood exceeds a threshold is labeled as skin. A connected component analysis then groups the skin pixels into regions. This process is demonstrated in Figure 3. The largest skin blob is processed further to search for facial features. The smaller skin blobs can also be considered in case the face is not the largest skin-colored object in the scene.
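The thresholding and connected component steps can be sketched as follows. This is a minimal illustration, not the implementation used here: it assumes the per-pixel likelihoods have already been computed into a 2-D list, uses 4-connectivity (the connectivity rule is not specified in the text), and the names `label_skin_regions` and `largest_region` are hypothetical.

```python
from collections import deque


def label_skin_regions(likelihood, threshold):
    """Threshold a 2-D likelihood map and label 4-connected skin regions.

    Returns (labels, sizes): labels is a grid of ints where 0 means
    non-skin, and sizes maps each region label to its pixel count.
    Assumption: 4-connectivity; the source does not state which rule is used.
    """
    rows = len(likelihood)
    cols = len(likelihood[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]
    sizes = {}
    next_label = 1

    for r in range(rows):
        for c in range(cols):
            # Skip non-skin pixels and pixels that already belong to a region.
            if likelihood[r][c] <= threshold or labels[r][c]:
                continue
            # Breadth-first flood fill from this unlabeled skin pixel.
            labels[r][c] = next_label
            queue = deque([(r, c)])
            count = 0
            while queue:
                y, x = queue.popleft()
                count += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not labels[ny][nx]
                            and likelihood[ny][nx] > threshold):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            sizes[next_label] = count
            next_label += 1

    return labels, sizes


def largest_region(sizes):
    """Return the label of the largest region, or None if there is none."""
    return max(sizes, key=sizes.get) if sizes else None
```

Sorting `sizes` by pixel count instead of taking only the maximum would give the ordered list of candidate blobs needed when the face is not the largest skin-colored object.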