In the open-loop system, the vision system uses a Maximum Likelihood
(ML) framework to label individual pixels in the scene.
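As a minimal sketch of such an ML labeling rule, assume that each class (for example hand, head, or background) is described by a Gaussian model over a per-pixel feature vector. The Gaussian form, the function names, and the toy numbers are assumptions made for illustration and are not taken from the original system.

```python
import numpy as np

def gaussian_log_likelihood(x, mean, cov):
    """Log density of a multivariate Gaussian evaluated at each row of x."""
    d = mean.shape[0]
    diff = x - mean                                  # shape (n_pixels, d)
    cov_inv = np.linalg.inv(cov)
    _, logdet = np.linalg.slogdet(cov)
    mahal = np.einsum("ni,ij,nj->n", diff, cov_inv, diff)
    return -0.5 * (mahal + logdet + d * np.log(2.0 * np.pi))

def ml_labels(features, means, covs):
    """Label each pixel with the class whose model gives the highest likelihood.

    features: (n_pixels, d) observation vectors
    means, covs: per-class Gaussian parameters (an assumed class model)
    """
    scores = np.stack(
        [gaussian_log_likelihood(features, m, c) for m, c in zip(means, covs)],
        axis=1,
    )
    return np.argmax(scores, axis=1)

# Toy example: two hypothetical classes and three pixels.
means = [np.array([0.0, 0.0]), np.array([5.0, 5.0])]
covs = [np.eye(2), np.eye(2)]
pixels = np.array([[0.2, -0.1], [4.8, 5.3], [2.0, 2.0]])
labels = ml_labels(pixels, means, covs)
```

Working in log-likelihoods avoids numerical underflow and leaves the arg max unchanged, because the logarithm is monotonic.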
To close the loop, we need to incorporate information from the 3-D model. Given the current state of the model, it is possible to compute the state of the individual link that corresponds to a specific tracked feature (say, the hand). Then, given a model of the camera, it is possible to calculate the perspective projection of that link state into 2-D.
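The projection step can be illustrated with an ideal pinhole camera. This is a sketch under stated assumptions: the link position is already expressed in camera coordinates, lens distortion is ignored, and the focal length, principal point, and hand position are hypothetical values rather than parameters of the original system.

```python
import numpy as np

def project_point(point_3d, focal_length, principal_point=(0.0, 0.0)):
    """Perspective-project a 3-D point (camera frame) onto the image plane."""
    x, y, z = point_3d
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    cx, cy = principal_point
    # Pinhole model: image coordinates scale inversely with depth.
    return np.array([focal_length * x / z + cx,
                     focal_length * y / z + cy])

# Hypothetical hand position in metres, expressed in the camera frame.
hand_3d = np.array([0.2, -0.1, 2.0])
hand_2d = project_point(hand_3d, focal_length=500.0,
                        principal_point=(320.0, 240.0))
```

If the model state is kept in world coordinates, a rigid transform into the camera frame would have to be applied before this step.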
Since the vision system uses a stochastic framework, it is necessary
to represent this link projection as a statistical model.
Integrating this information into the
2-D statistical decision framework results in a Maximum A
Posteriori (MAP) decision rule.
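To make the difference between the ML and MAP rules concrete, the sketch below treats each projected link as a 2-D Gaussian over image position and adds its log density, as a prior, to the per-class log-likelihood of each pixel. The Gaussian form of the prior, the function names, and all numbers are assumptions chosen for illustration, not the formulation used in the original system.

```python
import numpy as np

def log_gaussian_2d(xy, mean, cov):
    """Log density of a 2-D Gaussian evaluated at each row of xy."""
    diff = xy - mean
    cov_inv = np.linalg.inv(cov)
    _, logdet = np.linalg.slogdet(cov)
    mahal = np.einsum("ni,ij,nj->n", diff, cov_inv, diff)
    return -0.5 * (mahal + logdet + 2.0 * np.log(2.0 * np.pi))

def map_labels(log_likelihoods, pixel_xy, projected_means, projected_covs):
    """MAP labeling: arg max over classes of log-likelihood plus log prior.

    log_likelihoods: (n_pixels, n_classes) appearance scores from the 2-D system
    pixel_xy:        (n_pixels, 2) image coordinates of each pixel
    projected_means, projected_covs: assumed Gaussian model of each projected link
    """
    log_priors = np.stack(
        [log_gaussian_2d(pixel_xy, m, c)
         for m, c in zip(projected_means, projected_covs)],
        axis=1,
    )
    return np.argmax(log_likelihoods + log_priors, axis=1)

# One ambiguous pixel: appearance slightly favours class 1 ("head"),
# but the pixel lies next to the projected position of class 0 ("hand").
log_likelihoods = np.array([[-3.0, -2.5]])
pixel_xy = np.array([[100.0, 200.0]])
projected_means = [np.array([102.0, 198.0]), np.array([300.0, 50.0])]
projected_covs = [100.0 * np.eye(2), 100.0 * np.eye(2)]

ml_choice = np.argmax(log_likelihoods, axis=1)
map_choice = map_labels(log_likelihoods, pixel_xy,
                        projected_means, projected_covs)
```

In this toy case the ML rule picks class 1 from appearance alone, while the MAP rule picks class 0 because the model-derived prior strongly favours the nearby link.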