
Topic Classification


  
Figure 2: Plot of class probabilities.

After training data is collected and class models are built, the system begins receiving audio input from the speakers. A matching algorithm sequentially updates a conversation history ${\bf x}$, a 30,000-dimensional vector that counts the frequency of recently spoken words, weighted by their recency through a slow decay. At each step, after receiving a new word $word_k$, the history ${\bf x}$ is updated by decaying it and adding a count of one for the new word:

\begin{displaymath}
x_i^t = \alpha \, x_i^{t-1} + \delta(k, i) \qquad (2)
\end{displaymath}

where $\alpha$ is the decay parameter and $\delta(k, i)$ equals 1 if the new word $word_k$ is the word counted by $x_i$ (i.e., $i = k$) and 0 otherwise. Given the conversation history at time $t$, its class-conditional probability is computed as follows:

\begin{displaymath}
P({\bf x} \vert c) = \prod_i P(word_i \vert c)^{x_i} \qquad (3)
\end{displaymath}
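To make the update and scoring steps concrete, the following is a minimal sketch in Python (not the authors' implementation; the variable names, the decay value, and the use of log-space arithmetic are assumptions) that applies equation (2) to maintain the decayed history and equation (3), in log form, to score it against a single class model.

\begin{verbatim}
import numpy as np

# Hypothetical constants: the paper mentions a ~30,000-word vocabulary
# and a slowly decaying history; the exact decay value is not given.
VOCAB_SIZE = 30000
ALPHA = 0.995

def update_history(x, k, alpha=ALPHA):
    """Equation (2): decay every count, then add one for the new word k."""
    x = alpha * x
    x[k] += 1.0
    return x

def class_log_likelihood(x, log_word_probs):
    """Equation (3) in log space:
    log P(x|c) = sum_i x_i * log P(word_i|c),
    where log_word_probs[i] = log P(word_i|c) from the class model."""
    return float(np.dot(x, log_word_probs))
\end{verbatim}

Working with log-probabilities is an implementation convenience here: with a 30,000-word vocabulary the product in equation (3) would underflow if evaluated directly.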

This probability is converted into the posterior for topic c using Bayes' rule. The prior probabilities P(c) are scalars (one per topic class) estimated by cross-validation:

\begin{displaymath}
P(c \vert {\bf x}) = \frac{P({\bf x} \vert c) \, P(c)}{\sum\limits_{k=1}^{C} P({\bf x} \vert k) \, P(k)} \qquad (4)
\end{displaymath}

Fig. 2 shows the class probabilities for an ongoing conversation. After these probabilities are computed for each class, the most likely topic $c$ is selected and the corresponding feedback is given to the users as described below.
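Continuing the sketch above (again hypothetical; the log-space normalization is added only for numerical stability and is not specified in the paper), equation (4) can be evaluated for all C classes and the most likely topic obtained as the arg max of the posterior.

\begin{verbatim}
def topic_posteriors(x, class_log_word_probs, log_priors):
    """Equation (4): P(c|x) proportional to P(x|c) P(c),
    normalized over the C topic classes (computed in log space)."""
    log_joint = np.array([class_log_likelihood(x, lwp) + lp
                          for lwp, lp in zip(class_log_word_probs, log_priors)])
    log_joint -= log_joint.max()          # stabilize before exponentiating
    post = np.exp(log_joint)
    return post / post.sum()

# Illustrative usage with random stand-in class models:
# rng = np.random.default_rng(0)
# models = [np.log(rng.dirichlet(np.ones(VOCAB_SIZE))) for _ in range(4)]
# priors = np.log(np.full(4, 0.25))
# x = update_history(np.zeros(VOCAB_SIZE), k=42)
# best_topic = int(np.argmax(topic_posteriors(x, models, priors)))
\end{verbatim}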


Tony Jebara
2000-08-17