Vision and Modeling Group

MIT Media Laboratory


Computers Watching Football


Video Annotation

The Scenario

The coach is concerned that, although the team has won 9 of its last 10 games, it has done so on the strength of the field goal kicker. The offense has been unable to move the ball reliably once inside the "red zone," the area of the field inside the opponent's 20-yard line.

In preparation for the big game next week, the head coach asks the Video Athletic Coordinator (VAC) for some compilation tapes. The tapes should contain every offensive play over the last three years in which the ball is within the red zone, it is 2nd or 3rd down, there is more than 5 yards to go, and the team executed a running play. Furthermore, since he is going to use the tapes to review particular plays with the team, he'd like to separate the draw plays from the sweeps from the traps. The VAC types the necessary information into the computer, and a short while later hands the coach the separate tapes he asked for.

Football Annotation

The coaching scenario above is not fiction. Every university with a significant football program, as well as every professional football team, has a Video Athletic Coordinator and a video database of all the games played. The video is recorded from a high vantage point by a cameraman tasked with keeping all the players in the field of view. Using specialized database software, the VAC manually annotates every play, recording attributes such as yard line, down, yards to go, formation, type of play executed, and result. These descriptions, along with timecode information, are used to automatically edit input tapes into the necessary compilations. The NFL has recently converted to all-digital media so that the video can be accessed and viewed directly from a computer.
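To make the coach's request concrete, here is a minimal sketch of how annotated plays could be filtered and grouped into per-tape edit lists. It is not the software the VAC actually uses: the `PlayAnnotation` record, its field names, the `RUNNING_PLAYS` set, and the timecode strings are all hypothetical, chosen only to mirror the attributes listed above. Treating a ball exactly on the 20-yard line as inside the red zone is also an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass

# Illustrative subset of running-play labels, taken from the scenario.
RUNNING_PLAYS = {"draw", "sweep", "trap"}


@dataclass(frozen=True)
class PlayAnnotation:
    """One manually annotated play (hypothetical schema)."""
    timecode_in: str    # where the play starts on the source tape
    timecode_out: str   # where the play ends on the source tape
    yards_to_goal: int  # distance from the ball to the opponent's goal line
    down: int
    yards_to_go: int
    formation: str
    play_type: str
    result: str


def matches_scenario(play: PlayAnnotation) -> bool:
    """Red zone, 2nd or 3rd down, more than 5 yards to go, running play."""
    return (
        play.yards_to_goal <= 20  # assumption: the 20-yard line itself counts
        and play.down in (2, 3)
        and play.yards_to_go > 5
        and play.play_type in RUNNING_PLAYS
    )


def compile_edit_lists(plays):
    """Group timecode ranges of matching plays by play type, one list per tape."""
    tapes = defaultdict(list)
    for play in plays:
        if matches_scenario(play):
            tapes[play.play_type].append((play.timecode_in, play.timecode_out))
    return dict(tapes)


# Made-up sample annotations, for illustration only.
plays = [
    PlayAnnotation("00:12:03:10", "00:12:11:02", 14, 2, 8, "I-formation", "draw", "gain of 3"),
    PlayAnnotation("00:31:40:00", "00:31:47:15", 9, 3, 6, "shotgun", "sweep", "touchdown"),
    PlayAnnotation("00:45:02:05", "00:45:09:20", 45, 2, 7, "I-formation", "trap", "gain of 1"),
    PlayAnnotation("01:02:18:00", "01:02:25:12", 12, 1, 10, "shotgun", "draw", "no gain"),
]
edit_lists = compile_edit_lists(plays)
```

In this sample, the third play is excluded because it starts outside the red zone and the fourth because it is a 1st-down play, so only one draw and one sweep reach the edit lists. The query itself is simple; the expensive part is producing the annotations it runs over.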

Video Annotation

Video annotation is the task of generating such descriptions. It differs from conventional computer vision image understanding in that one is primarily interested in what is happening in a scene, as opposed to what is in the scene. The goal is to describe the behavior or action that takes place in a manner relevant to the domain. In the "football domain," we would like to build a computer system that will annotate video automatically or provide a semi-automatic process for the VAC.

Video annotation is a problem that will become much more important in the next few years as video databases grow and methods must be developed for automatic database summary, analysis, and retrieval. Other annotation problems being studied in the Vision and Modeling Group of the MIT Media Lab include dance steps and human gesture.

We have chosen to study the automatic annotation of football plays for four reasons: (1) football has a known descriptive language, (2) football has a rich set of domain rules and domain expectations, (3) football annotation is a real-world problem, and (4) it's fun.

The descriptive language is the football playbook. Players, coaches, and fans have developed a categorization system that includes virtually all possible plays. The classification problem is difficult, however, because distinctions between play types can be subtle and player movement varies significantly between plays within the same category. Fortunately, a football game is governed by the rules of the game and by expected events. These rules and likelihoods must be used to identify the key events in a play, which in turn can be used to assign the most appropriate play label.

Automatic or semi-automatic video annotation is not the prototypical computer vision problem, but it will become increasingly important as access to video databases increases.

Annotation Input

An automatic football annotation system must have some input data upon which to make a preliminary play hypothesis. In the football annotation problem, we are using player trajectories. In the first stage of our annotation project, we have implemented a computer vision football-player tracker that uses contextual knowledge to track football players as they move around a field.


A football play annotation system has many uses, from generating "chalkboard" diagrams for network newscasters (better than the current chalkboard systems) to home "call-the-play" television sets that score living room coaches on how well they predict each play. The most direct application is to aid the VAC in the tedious play annotation task and to provide better data to professional, college, and high school coaches.


Last modified: April 06, 1999