Computers Watching Football - Formation Labeling

When we began work on recognition of play labels, we started by developing a system to automatically determine which players were which, given a starting input of object positions. This is an interesting problem because every player's position type depends upon the position types of the other agents, and noise in the input data can make it difficult to detect seemingly simple concepts like "behind" and "near." The formation labeling problem seemed a good place to start because it does not require temporal reasoning, as action recognition does, but it does require a system that can deal with the "chicken and egg" problem: how do I know who the quarterback (QB) is if I don't know who the center is, and how can I figure out who the center is without knowing who the quarterback is?
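To make the difficulty with "behind" and "near" concrete, here is a minimal sketch of tolerance-based spatial predicates over (x, y) positions. It is only an illustration, not the rules used in the system: the function names, the tolerance values, and the convention that the offense faces the +y direction are all assumptions.

```python
import math

# Hypothetical tolerances (in yards); the original system's values are not given.
NEAR_RADIUS = 2.0
LATERAL_SLACK = 1.0


def near(a, b, radius=NEAR_RADIUS):
    """True if points a and b lie within `radius` of each other."""
    return math.dist(a, b) <= radius


def behind(a, b, lateral_slack=LATERAL_SLACK):
    """True if a is behind b.

    Assumes the offense faces +y, so "behind" means a smaller y value.
    The lateral slack lets noisy, slightly off-axis positions still count.
    """
    return a[1] < b[1] and abs(a[0] - b[0]) <= lateral_slack


center = (0.0, 0.0)
qb = (0.3, -1.2)  # slightly off-axis, as noisy position data might be

print(near(qb, center), behind(qb, center))
```

With exact comparisons (zero slack), the slightly misplaced quarterback above would not count as "behind" the center, which is the kind of brittleness that noisy input causes.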

Initially the recognition work, motivated by the work of Strat and Fischler [1991], progressed in the direction of finding a consistent interpretation of the scene using a non-probabilistic rule-based system. A context-based recognition system modeled on Strat's CONDOR system was coded to try to recognize the offensive starting formation given the (x, y) positions of the players on the field. The left image below shows the data. The system uses context sets and a rule base of several hundred rules about relative spatial relationships and football formation configurations to gradually build a consistent interpretation of the data. The data-driven process uses one rule base to propose hypotheses, a second rule base to rank all hypotheses of the same type, and a third rule base to check whether a given hypothesis is consistent with the current interpretation. In the middle image below, the system has found a hypothesis for the line of scrimmage position. Eventually a consistent (and correct) labeling of the formation is found, shown in the right image. Other consistent labelings are also found, but the one shown explains all of the given data.


Left: the input object positions. Middle: one hypothesis for the location of the line of scrimmage (LOS). Right: a final consistent labeling of the player positions.
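The propose, rank, and check cycle described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the CONDOR-style system itself: it uses three hand-written rules instead of several hundred, three made-up player positions, a greedy loop with no backtracking, and hypothetical names (`propose`, `consistent`, `label_formation`). It also assumes the offense faces the +y direction.

```python
import math

# Toy (x, y) positions, made up for illustration. Smaller y is deeper in the backfield.
players = {"p1": (0.0, 0.0), "p2": (0.2, -1.0), "p3": (0.1, -5.0)}


def propose(players, interp):
    """Proposal rules: yield (player, label, score) hypotheses for unlabeled players.

    The score plays the role of the ranking rules: higher is better.
    """
    labeled = {label: p for p, label in interp.items()}
    for p, (x, y) in players.items():
        if p in interp:
            continue
        if "C" not in labeled:
            # Context-free rule: the frontmost player is the best center candidate.
            yield (p, "C", y)
        elif "QB" not in labeled:
            # Context rule: once a center is known, prefer the player closest to it.
            yield (p, "QB", -math.dist((x, y), players[labeled["C"]]))
        else:
            yield (p, "RB", 0.0)


def consistent(hyp, players, interp):
    """Consistency rules: reject a hypothesis that contradicts the interpretation."""
    p, label, _ = hyp
    if label in interp.values():
        return False  # toy simplification: each label is used at most once
    if label in ("QB", "RB"):
        labeled = {lab: q for q, lab in interp.items()}
        # Backfield players must be behind the center.
        return players[p][1] < players[labeled["C"]][1]
    return True


def label_formation(players):
    """Greedily grow an interpretation: propose, rank, then accept the best
    hypothesis that passes the consistency check."""
    interp = {}
    while len(interp) < len(players):
        ranked = sorted(propose(players, interp), key=lambda h: h[2], reverse=True)
        accepted = next((h for h in ranked if consistent(h, players, interp)), None)
        if accepted is None:
            break  # no consistent hypothesis left; a fuller system would backtrack
        interp[accepted[0]] = accepted[1]
    return interp


print(label_formation(players))
```

The sketch shows how the "chicken and egg" problem is broken: the center is proposed from a context-free rule, and only then do the quarterback rules, which depend on the center, become usable. Unlike the system described here, this greedy loop returns a single labeling and does not enumerate alternative consistent labelings.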

This formation labeling scheme does not include a temporal representation. Initially the intent was to modify this system for general play recognition, but further evaluation of the system's representation problems led towards the current approach described in the Action Recognition section. The formation system demonstrates that it is possible to label the formations given the approximate starting locations of the players. A nice extension to this work would be to automatically detect the starting positions (as shown in the left image) from images of starting formations, and to use this system to make that challenging visual processing task easier. The positions used for this system were obtained manually.