| Video Annotation
 
 The Scenario
 
  The coach is concerned that although the team has won 9 of its last 10
    games, it has done so on the strength of the field goal kicker. The offense has been
    unable to reliably move the ball once inside the "red zone," the area of the
    field inside the opponent's 20-yard line. 
 In preparation for the big game next week, the head coach asks the Video Athletic
    Coordinator (VAC) for some compilation tapes. The tapes should contain every offensive
    play over the last three years in which the ball is within the red zone, it is 2nd or 3rd
    down, there is more than 5 yards to go, and the team executed a running play. Furthermore,
    since he is going to use the tapes to review particular plays with the team, he'd like to
    separate the draw plays from the sweeps from the traps. The VAC types the necessary
    information into the computer, and a short while later hands the coach the separate tapes
    he asked for.
 Football Annotation
 The coaching scenario above is not fiction. Every university
    with a significant football program as well as every professional football team has a
    Video Athletic Coordinator and a video database of all the games played. The video is
    recorded using a camera with a high vantage point that is controlled by a cameraman tasked
    with keeping all the players in the field of view. Using specialized database software,
    the VAC manually annotates every play, recording attributes such as yard line, down, yards
    to go, formation, type of play executed, and result. These descriptions, along with
    timecode information, are used to automatically edit input tapes into the necessary
    compilations. The NFL has recently converted to all-digital media so that the video
    can be accessed and viewed directly from a computer.
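 
 As a rough illustration of how such annotations support the coach's request in the
    scenario, the sketch below filters a list of annotated plays and groups the matching
    timecode ranges by type of run. It is not the VAC's actual database software: the
    Play record, its field names, and the red_zone_runs helper are hypothetical, and the
    sketch assumes every record is already one of the team's offensive plays from the
    last three years.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Play:
    """One manually annotated play. All field names are hypothetical."""
    game_id: str
    yard_line: int     # yards from the opponent's goal line
    down: int
    yards_to_go: int
    play_type: str     # e.g. "run" or "pass"
    run_kind: str      # e.g. "draw", "sweep", "trap"; "" for non-runs
    timecode_in: str   # start of the play on the source tape
    timecode_out: str  # end of the play on the source tape


def red_zone_runs(plays):
    """Return {run_kind: [(timecode_in, timecode_out), ...]} for the plays
    the coach asked for: red zone, 2nd or 3rd down, more than 5 yards to
    go, running play."""
    groups = defaultdict(list)
    for p in plays:
        in_red_zone = p.yard_line <= 20  # assumption: the 20 itself counts
        if (in_red_zone and p.down in (2, 3)
                and p.yards_to_go > 5 and p.play_type == "run"):
            groups[p.run_kind].append((p.timecode_in, p.timecode_out))
    return dict(groups)
```

 Each list of timecode ranges corresponds to one compilation tape, so the draws, sweeps,
    and traps come out already separated.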
 
 Video Annotation
 
 Video annotation is the task of generating such descriptions.
    It differs from conventional computer vision image understanding in that one is
    primarily interested in what is happening in a scene, as opposed to what is in the scene.
    The goal is to describe the behavior or action that takes place in a manner relevant to
    the domain. In the "football domain," we would like to build a computer system
    that will annotate video automatically or provide a semi-automatic process
    for the VAC.
 
 Video annotation is a problem that will become much more important in the next few years
    as video databases begin to grow and methods must be developed for automatic database
    summary, analysis, and retrieval. Other annotation problems being studied in the Vision
    and Modeling Group of the MIT Media Lab include dance steps and human gesture.
 
 We have chosen to study the automatic annotation of football plays for four reasons: (1)
    football has a known descriptive language, (2) football has a rich set of domain rules
    and domain expectations, (3) football annotation is a real-world problem, and (4) it's
    fun.
 
 
  The descriptive language is the football playbook. Players, coaches, and fans have developed a
    categorization system that includes virtually all possible plays. The classification
    problem is difficult, however, because distinctions between play types can be subtle and
    there is significant variation in player movement between plays within the
    same category. Fortunately, a football game is governed by the rules of the game and
    expected events. These rules and likelihoods must be used to identify the key events in a
    play that can be used to assign the most appropriate play label. 
 Automatic or semi-automatic video annotation is not the prototypical computer vision
    problem, but it will become increasingly important as access to video databases increases.
 
 Annotation Input
 
 An automatic football annotation system must have some input
    data on which to base a preliminary play hypothesis. In the football annotation problem,
    we are using player trajectories. In the first stage of our annotation project, we have
    implemented a computer vision football-player tracker that uses contextual knowledge to
    track football players as they move around a field.
 
 Applications
 
 A football play annotation system has many uses, from generating
    "chalkboard" diagrams for network newscasters (better than the current
    chalkboard systems) to home "call-the-play" television sets that score living
    room coaches on how well they predict each play. The most direct application is to aid the
    VAC in the tedious play annotation task and to provide better data to professional,
    college, and high school coaches.
 |