Before the system attempts to locate people in a scene, it must learn the
scene. To accomplish this, Pfinder begins by acquiring a sequence of
video frames that do not contain a person. Typically this sequence is
relatively long, a second or more, in order to obtain a good estimate of
the color covariance associated with each image pixel. For computational
efficiency, color models are built in both the standard (Y,U,V) and
brightness-normalized color spaces.
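
The scene-learning step described above can be illustrated with a minimal sketch that estimates a mean color and a color covariance for each pixel from a person-free sequence. This is an illustration under stated assumptions, not the system's actual implementation: the function names (`pixel_statistics`, `learn_scene`, `brightness_normalize`), the nested-list frame layout, the unbiased (n - 1) covariance estimate, and the choice of (U/Y, V/Y) as the brightness-normalized space are all assumptions made for the example. It uses plain Python to stay dependency-free; a practical version would use an array library.

```python
def pixel_statistics(samples):
    """Return (mean, covariance) for a list of equal-length color samples.

    Assumption: the unbiased estimator (divide by n - 1) is used; the
    source text does not say which estimator the system uses.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two frames to estimate covariance")
    dim = len(samples[0])
    mean = [sum(s[k] for s in samples) / n for k in range(dim)]
    cov = [[0.0] * dim for _ in range(dim)]
    for s in samples:
        d = [s[k] - mean[k] for k in range(dim)]
        for i in range(dim):
            for j in range(dim):
                cov[i][j] += d[i] * d[j]
    cov = [[c / (n - 1) for c in row] for row in cov]
    return mean, cov


def learn_scene(frames):
    """Build a per-pixel (mean, covariance) model from person-free frames.

    Hypothetical layout: each frame is a list of rows, and each row is a
    list of (Y, U, V) tuples.
    """
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [pixel_statistics([f[r][c] for f in frames]) for c in range(width)]
        for r in range(height)
    ]


def brightness_normalize(yuv, eps=1e-6):
    """Map (Y, U, V) to (U/Y, V/Y).

    Assumption: this is the brightness-normalized space; eps is a
    hypothetical guard against division by zero for very dark pixels.
    """
    y, u, v = yuv
    y = max(y, eps)
    return (u / y, v / y)
```

Calling `learn_scene` on about a second of video (for example, 30 frames) yields one mean vector and one 3x3 covariance matrix per pixel. Applying `brightness_normalize` to every pixel first and then calling `learn_scene` would give the corresponding model in the normalized space, since `pixel_statistics` works for samples of any dimension.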