In the open-loop system, the vision system labels individual pixels in
the scene using a Maximum Likelihood (ML) framework.
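
In generic notation, an ML pixel-labeling rule of this kind can be written as follows. The symbols are assumptions made for illustration and need not match those used elsewhere in this work: o(x, y) stands for the feature vector observed at pixel (x, y), and Omega_k for the statistical model of the k-th class.

```latex
\hat{k}(x, y) \;=\; \arg\max_{k} \; P\bigl(\mathbf{o}(x, y) \mid \Omega_k\bigr)
```

Each pixel is assigned to the class whose model gives the observed features the highest likelihood. No information from the 3-D model enters the decision at this stage.
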
To close the loop, we need to incorporate information from the 3-D model.
Given the current state of the model, it is possible to compute the state
of the individual link that corresponds to a specific tracked feature (say
the hand). Then, given a model of the camera, it is possible to calculate
the perspective projection of that link state into 2-D. Since the vision
system uses a stochastic framework, this link projection must be
represented as a statistical model.
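
The projection step and its statistical representation can be sketched as below. This is a minimal illustration under several assumptions not stated in the text: the link state is reduced to a 3-D position in the camera frame, the camera is an ideal pinhole with a single focal length, and the statistical model is a Gaussian whose covariance is propagated through a first-order linearization of the projection. All function names and numeric values are hypothetical.

```python
import numpy as np

def project_point(p_cam, focal_length):
    """Pinhole perspective projection of a 3-D point (camera frame) into 2-D."""
    x, y, z = p_cam
    return focal_length * np.array([x / z, y / z])

def projection_jacobian(p_cam, focal_length):
    """2x3 Jacobian of project_point with respect to the 3-D point."""
    x, y, z = p_cam
    return focal_length * np.array([
        [1.0 / z, 0.0, -x / z**2],
        [0.0, 1.0 / z, -y / z**2],
    ])

def project_gaussian(mean_3d, cov_3d, focal_length):
    """Linearized propagation of a 3-D Gaussian into the image plane.

    Returns the 2-D mean and 2x2 covariance of the projected link.
    """
    mean_2d = project_point(mean_3d, focal_length)
    J = projection_jacobian(mean_3d, focal_length)
    cov_2d = J @ cov_3d @ J.T
    return mean_2d, cov_2d

# Hypothetical hand link: position in metres, with more uncertainty in depth.
hand_mean = np.array([0.2, -0.1, 2.0])
hand_cov = np.diag([0.01, 0.01, 0.04])

mean_2d, cov_2d = project_gaussian(hand_mean, hand_cov, focal_length=500.0)
print(mean_2d)
print(cov_2d)
```

The resulting 2-D mean and covariance describe where in the image the link is expected to appear, which is the form needed to combine it with a pixel-level statistical decision.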
Integrating this information into the 2-D statistical decision framework
yields a Maximum A Posteriori (MAP) decision rule.
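
The difference between the two decision rules can be illustrated with a small sketch. It assumes Gaussian appearance models for each class and a Gaussian image-plane prior for each class derived from the projected link; neither choice is confirmed by the text, and the class dictionaries, field names, and numbers are hypothetical. In the log domain, the MAP score is the ML log-likelihood plus the log-prior.

```python
import numpy as np

def gaussian_log_density(x, mean, cov):
    """Log of a multivariate Gaussian density evaluated at each row of x."""
    x = np.atleast_2d(x)
    d = x - mean
    cov_inv = np.linalg.inv(cov)
    maha = np.einsum("ni,ij,nj->n", d, cov_inv, d)  # squared Mahalanobis distance
    _, logdet = np.linalg.slogdet(cov)
    k = mean.shape[0]
    return -0.5 * (maha + logdet + k * np.log(2.0 * np.pi))

def label_pixels(features, positions, classes, use_prior):
    """Return, for each pixel, the index of the highest-scoring class.

    use_prior=False gives the ML rule (appearance likelihood only).
    use_prior=True gives the MAP rule (likelihood plus model-derived prior).
    """
    scores = []
    for c in classes:
        s = gaussian_log_density(features, c["feat_mean"], c["feat_cov"])
        if use_prior:
            s = s + gaussian_log_density(positions, c["prior_mean"], c["prior_cov"])
        scores.append(s)
    return np.argmax(np.stack(scores, axis=0), axis=0)

# Two hypothetical classes with nearly identical appearance (e.g. two hands)
# but different predicted image locations from the projected 3-D model.
classes = [
    {"feat_mean": np.array([0.50, 0.50]), "feat_cov": 0.01 * np.eye(2),
     "prior_mean": np.array([50.0, -25.0]), "prior_cov": 100.0 * np.eye(2)},
    {"feat_mean": np.array([0.52, 0.50]), "feat_cov": 0.01 * np.eye(2),
     "prior_mean": np.array([-50.0, -25.0]), "prior_cov": 100.0 * np.eye(2)},
]

# Two pixels with the same observed features at different image positions.
features = np.array([[0.50, 0.50], [0.50, 0.50]])
positions = np.array([[48.0, -24.0], [-52.0, -20.0]])

ml_labels = label_pixels(features, positions, classes, use_prior=False)
map_labels = label_pixels(features, positions, classes, use_prior=True)
print(ml_labels, map_labels)
```

In this example the appearance models alone cannot separate the two pixels, so the ML rule gives both the same label, whereas the MAP rule assigns the second pixel to the class whose projected link lies nearby.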