When we began work on recognition of play
labels, we started by developing a system to automatically determine which players were
which given a starting input of object positions. This is an interesting problem because
every player's position type depends upon the position type of other agents, and noise in
the input data can make it difficult to detect seemingly simple concepts like
"behind" and "near." The formation labeling problem seemed a good
place to start because it did not require temporal reasoning, like action recognition
does, but it required a system that could deal with the "chicken and egg"
problem: how do I know who the quarterback (QB) is if I don't know who the center is, and
how can I figure out who the center is without knowing who the quarterback is?
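As a rough illustration of why "behind" and "near" are harder than they sound, the sketch below defines both relations with explicit tolerances so that small position errors do not flip the answer. The tolerance values, the coordinate convention (y decreases moving into the offensive backfield), and the function names are assumptions made for this example; they are not taken from the system described here.

```python
import math

# Hypothetical tolerances in yards, chosen only for illustration.
NEAR_RADIUS = 2.0
LATERAL_SLACK = 1.0


def near(a, b, radius=NEAR_RADIUS):
    """Return True if points a and b, given as (x, y), are within `radius`."""
    return math.dist(a, b) <= radius


def behind(a, b, lateral_slack=LATERAL_SLACK):
    """Return True if a is behind b.

    Assumes y decreases moving into the offensive backfield, and allows
    a small sideways offset so noisy positions still count as aligned.
    """
    return a[1] < b[1] and abs(a[0] - b[0]) <= lateral_slack


center = (0.0, 0.0)
qb = (0.3, -1.2)  # slightly off-axis, as hand-marked or noisy data might be
```

With these toy values, `behind(qb, center)` and `near(qb, center)` both hold even though the quarterback is not exactly in line with the center. A strict equality test on x would reject the same pair.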
Initially the recognition work, motivated by
the work of Strat and Fischler, progressed in the direction of finding a consistent
interpretation of the scene using a non-probabilistic rule-based system. A context-based
recognition system modeled on Strat's CONDOR system was coded to try to recognize the
offensive starting formation given the (x, y) positions of the players on the field. The
left image below shows the input data. The system uses context sets and a rule base of several
hundred rules about relative spatial relationships and rules about football formation
configurations to gradually build a consistent interpretation of the data. The data-driven
process uses one rule base to propose hypotheses, a second rule base to rank all
hypotheses of the same type, and a third rule base to check whether a given hypothesis is
consistent with the current interpretation. In the middle image below the system has found
a hypothesis for the line of scrimmage position. Eventually a consistent (and correct)
labeling of the formation is found, shown in the right image. Other consistent labelings
are also found, but the one shown explains all of the given data.
The input object positions.
One hypothesis for the location of the LOS.
A final consistent labeling of the player positions.
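The propose, rank, and check cycle described above can be sketched in a few lines. This is a toy under stated assumptions, not the system's actual rule base: the four players, the label set, the three small rule functions, and the use of a simple backtracking search are all invented for illustration, whereas the real system used context sets and several hundred rules.

```python
# Toy positions (x, y); y decreases moving into the offensive backfield.
# Player ids, coordinates, and labels are hypothetical.
PLAYERS = {"p1": (0.1, 0.0), "p2": (0.0, -1.0), "p3": (-1.5, 0.0), "p4": (1.5, 0.0)}
LABELS = ["C", "QB", "LG", "RG"]


def propose(label, interp):
    """Rule base 1: propose every still-unlabeled player as a candidate."""
    used = set(interp.values())
    return [p for p in PLAYERS if p not in used]


def rank(label, candidates):
    """Rule base 2: order the candidates for one label, best first."""
    if label in ("C", "QB"):
        # Prefer players closest to the middle of the formation.
        return sorted(candidates, key=lambda p: abs(PLAYERS[p][0]))
    return sorted(candidates)


def consistent(label, player, interp):
    """Rule base 3: check a hypothesis against the current interpretation."""
    trial = {**interp, label: player}
    if "C" in trial and "QB" in trial:
        cx, cy = PLAYERS[trial["C"]]
        qx, qy = PLAYERS[trial["QB"]]
        if not (qy < cy and abs(qx - cx) <= 1.0):  # QB directly behind C
            return False
    if "C" in trial and "LG" in trial:
        if not PLAYERS[trial["LG"]][0] < PLAYERS[trial["C"]][0]:
            return False
    if "C" in trial and "RG" in trial:
        if not PLAYERS[trial["RG"]][0] > PLAYERS[trial["C"]][0]:
            return False
    return True


def label_formation(labels=LABELS, interp=None):
    """Extend the interpretation one label at a time, backtracking on failure."""
    interp = interp or {}
    if len(interp) == len(labels):
        return interp
    label = labels[len(interp)]
    for player in rank(label, propose(label, interp)):
        if consistent(label, player, interp):
            result = label_formation(labels, {**interp, label: player})
            if result is not None:
                return result
    return None
```

In this toy, the ranking first guesses that `p2` is the center because it is the most central player. No remaining player can then be a quarterback behind it, so the search backs up and tries `p1` as center, after which `p2` fits as quarterback. That is one simple way around the "chicken and egg" dependency: commit to a ranked guess, and retract it when a later label cannot be made consistent.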
This formation labeling scheme
does not include a temporal representation. Initially the intent was to modify this system
for general play recognition, but further evaluation of the system's representation
problems led to the current approach described in the Action Recognition section. The
formation system demonstrates that it is possible to label the formations given the
approximate starting locations of the players. A nice extension to this work would be to
automatically detect the starting positions (as shown in the left image) from images of
starting formations, and use this system to make that challenging visual processing task
easier. The positions used for this system were obtained manually.