A Synthetic Character with Speech and Vision
Toco the Toucan is a synthetic creature created at the MIT Media Laboratory. Toco combines speech recognition, computer vision, machine learning, and behavior-based animation to create an autonomous character who interacts with people using natural speech and gesture.

Speech Recognition and Learning

Toco employs speaker-independent phoneme recognition, based on hidden Markov models and artificial neural networks, to recognize spoken utterances. Using machine learning techniques, Toco can acquire new words on the fly and use them in later interactions. A minimal sketch of such a word-learning loop appears below.

Computer Vision

Toco uses statistical models of human skin color to track a person's hands in real time, allowing him to understand simple hand gestures such as pointing. The current vision system uses a two-camera configuration and triangulation to recover 3-D depth information. Illustrative sketches of skin-color classification and stereo triangulation appear below.

Behavior-Based Animation

Toco's internal control mechanisms are structured as a loose hierarchy of simple behaviors that interact with perceptual events and internal state variables to produce unpredictable, life-like behavior; one possible arbitration loop is sketched below as well.
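The word-learning behavior can be illustrated with a small sketch. This is a minimal illustration, not Toco's actual implementation: it assumes the HMM/neural-network front end already emits a phoneme string per utterance, and it substitutes simple edit-distance matching for whatever acoustic distance the real system uses. The names Lexicon and recognize_or_learn are hypothetical.

    def edit_distance(a, b):
        """Levenshtein distance between two phoneme sequences."""
        prev = list(range(len(b) + 1))
        for i, pa in enumerate(a, 1):
            cur = [i]
            for j, pb in enumerate(b, 1):
                cur.append(min(prev[j] + 1,                  # deletion
                               cur[j - 1] + 1,               # insertion
                               prev[j - 1] + (pa != pb)))    # substitution
            prev = cur
        return prev[-1]

    class Lexicon:
        """Grows a vocabulary from phoneme strings heard at runtime."""
        def __init__(self, max_dist=2):
            self.words = {}          # name -> phoneme tuple
            self.max_dist = max_dist

        def recognize_or_learn(self, phonemes, name=None):
            """Return the closest known word, or learn the utterance as new."""
            best = min(self.words.items(),
                       key=lambda kv: edit_distance(kv[1], phonemes),
                       default=None)
            if best and edit_distance(best[1], phonemes) <= self.max_dist:
                return best[0]
            new_name = name or f"word{len(self.words)}"
            self.words[new_name] = tuple(phonemes)
            return new_name

Under this scheme an utterance close enough to a stored phoneme template is recognized as that word; anything else is stored as a new vocabulary entry and recognized on later hearings.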
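Skin-color tracking of the kind described can be sketched as follows, assuming a Gaussian model over normalized red-green chroma; the statistics and threshold below are illustrative placeholders, not Toco's calibrated values.

    import numpy as np

    # Illustrative skin-chroma statistics (mean and covariance of [r, g]),
    # as might be estimated from hand-labeled training pixels.
    SKIN_MEAN = np.array([0.45, 0.30])
    SKIN_COV = np.array([[0.010, -0.002],
                         [-0.002, 0.005]])
    SKIN_COV_INV = np.linalg.inv(SKIN_COV)

    def skin_mask(image, threshold=4.0):
        """Boolean mask of likely-skin pixels in an RGB image.

        Pixels map to normalized chroma r = R/(R+G+B), g = G/(R+G+B),
        then classify by Mahalanobis distance to the skin model.
        """
        rgb = image.astype(np.float64)
        total = rgb.sum(axis=2) + 1e-6             # avoid divide-by-zero
        chroma = rgb[..., :2] / total[..., None]   # (r, g) per pixel
        diff = chroma - SKIN_MEAN
        # Squared Mahalanobis distance per pixel.
        d2 = np.einsum('...i,ij,...j->...', diff, SKIN_COV_INV, diff)
        return d2 < threshold

    def hand_centroid(image):
        """Estimate a hand position as the centroid of skin-colored pixels."""
        ys, xs = np.nonzero(skin_mask(image))
        if len(xs) == 0:
            return None
        return xs.mean(), ys.mean()

Normalized chroma discards overall brightness, which is one common reason skin-color models of this family hold up under varying illumination.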
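For a rectified two-camera rig (parallel optical axes), triangulation reduces to Z = f * B / d, where d is the horizontal disparity in pixels, f the focal length in pixels, and B the baseline between the cameras. A minimal sketch, with an illustrative focal length and baseline rather than Toco's actual camera calibration:

    def depth_from_disparity(x_left, x_right, focal_px=800.0, baseline_m=0.12):
        """Depth of a point seen at column x_left / x_right in the two images.

        For rectified cameras: Z = f * B / d. The default focal length and
        baseline are illustrative values only.
        """
        disparity = x_left - x_right
        if disparity <= 0:
            raise ValueError("point must have positive disparity")
        return focal_px * baseline_m / disparity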
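Behavior-based control of this kind can be sketched as behaviors that each score their own relevance from current percepts and internal state, with the highest-scoring one acting on each tick. The behaviors, state variables, and jitter term below are illustrative stand-ins under that assumption, not Toco's actual repertoire or arbitration rule.

    import random

    class Behavior:
        def __init__(self, name, relevance, action):
            self.name = name
            self.relevance = relevance   # (percepts, state) -> float
            self.action = action         # (state) -> None, mutates state

    def tick(behaviors, percepts, state):
        """Run one arbitration cycle: the most relevant behavior acts."""
        # A small random jitter keeps arbitration from being fully
        # deterministic, one cheap source of life-like variation.
        winner = max(behaviors, key=lambda b: b.relevance(percepts, state)
                     + random.uniform(0.0, 0.05))
        winner.action(state)
        state["boredom"] = min(1.0, state["boredom"] + 0.1)  # drives drift
        return winner.name

    # Illustrative behaviors and internal state.
    state = {"boredom": 0.0}
    behaviors = [
        Behavior("attend_to_person",
                 lambda p, s: 1.0 if p.get("person_visible") else 0.0,
                 lambda s: s.update(boredom=0.0)),
        Behavior("idle_preen",
                 lambda p, s: s["boredom"],
                 lambda s: s.update(boredom=max(0.0, s["boredom"] - 0.5))),
    ]

Because arbitration depends on drifting internal drives as well as percepts, the same stimulus can provoke different responses at different times, which is one route to the unpredictable, life-like quality described above.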
Credits

Deb Roy (dkroy@media.mit.edu), Project Lead, Speech Recognition
Tony Jebara (jebara@media.mit.edu), Computer Vision
Michal Hlavac (hlavac@media.mit.edu), Creature Architecture
Bill Tomlinson (badger@media.mit.edu), Graphics Animation
Christopher Wren (wren@media.mit.edu), Computer Vision Systems
Prof. Alex Pentland (sandy@media.mit.edu), Faculty Advisor