We trained the system using newsgroup and web data (an
average of 150,000 words per topic) and attempted to recover the
currently active topic out of the twelve candidates. As depicted in
Fig. 2, the speakers discussed three topics in
the following order: 'intlcourtofjustice', 'talk.religion.misc', and
'alt.jobs'. About 100 words per topic were uttered and the system
converged to the correct topics. Only the transitions caused some
confusion as the speakers migrated from one subject to another (this
could be reduced by varying the parameter
which was set to 0.95). If
transition errors are counted, the system has an
accuracy of
.
Naturally, in steady state, the system correctly
identified all three topics.
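The lag at topic transitions can be illustrated with a minimal sketch. The text does not specify the classifier or the exact role of the parameter set to 0.95, so the sketch assumes a smoothed unigram model per topic and treats 0.95 as an exponential forgetting factor applied to running log-likelihood scores. The names `train_topic_models`, `track_topics`, and `decay` are hypothetical and used only for illustration.

```python
import math
from collections import Counter


def train_topic_models(corpora, smoothing=1.0):
    """Build one smoothed unigram model per topic.

    corpora: dict mapping topic name -> list of training words.
    Returns a dict mapping topic -> (log-probability table, log-probability
    assigned to words never seen in any training corpus).
    """
    vocab = {w for words in corpora.values() for w in words}
    models = {}
    for topic, words in corpora.items():
        counts = Counter(words)
        # Reserve one extra slot of probability mass for unseen words.
        total = len(words) + smoothing * (len(vocab) + 1)
        logp = {w: math.log((counts[w] + smoothing) / total) for w in vocab}
        unseen = math.log(smoothing / total)
        models[topic] = (logp, unseen)
    return models


def track_topics(word_stream, models, decay=0.95):
    """Return the best-scoring topic after each incoming word.

    ASSUMPTION: `decay` plays the role of the 0.95 parameter in the text,
    interpreted here as a forgetting factor on the accumulated scores.
    """
    scores = {topic: 0.0 for topic in models}
    decisions = []
    for word in word_stream:
        for topic, (logp, unseen) in models.items():
            # Old evidence fades geometrically; new evidence is added in full.
            scores[topic] = decay * scores[topic] + logp.get(word, unseen)
        decisions.append(max(scores, key=scores.get))
    return decisions


if __name__ == "__main__":
    # Toy corpora standing in for the newsgroup and web training data.
    corpora = {
        "law": ["court", "judge", "treaty", "court"],
        "jobs": ["resume", "salary", "hiring", "salary"],
    }
    models = train_topic_models(corpora)
    stream = ["court", "judge", "treaty", "salary", "hiring", "resume", "salary"]
    print(track_topics(stream, models))
```

Under this assumption, a smaller `decay` discards old evidence faster, so the tracker switches topics sooner after a transition but reacts more strongly to individual off-topic words; a value closer to 1 has the opposite effect.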
After the topic is detected, the most appropriate prompt is determined and shown to the users on the large-screen display (see Fig. 1). The video camera is used to evaluate how ``smoothly'' the conversation is progressing and whether the users are searching for prompts. We use the detection of a full frontal view of a user as a cue that the user is requesting assistance.
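The gating of prompts on the visual cue can be sketched as follows. This is only an illustration: the function `prompt_to_display`, the per-user boolean flags, and the prompt table are hypothetical, and the sketch assumes that a separate vision module already reports whether each user is seen in full frontal view in the current frame.

```python
from typing import Mapping, Optional, Sequence


def prompt_to_display(
    topic: str,
    frontal_view_detected: Sequence[bool],
    prompts: Mapping[str, str],
) -> Optional[str]:
    """Return the prompt for the detected topic, or None if no prompt is due.

    frontal_view_detected holds one flag per user for the current frame
    (ASSUMPTION: supplied by an external face-pose detector).
    """
    # A full frontal view of any user is taken as a request for assistance.
    if not any(frontal_view_detected):
        return None
    # No prompt is shown if none is stored for the detected topic.
    return prompts.get(topic)


if __name__ == "__main__":
    # Hypothetical prompt table keyed by topic label.
    prompts = {"alt.jobs": "What kind of work are you looking for?"}
    print(prompt_to_display("alt.jobs", [False, True], prompts))
    print(prompt_to_display("alt.jobs", [False, False], prompts))
```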