Behavior and Learning Workgroup @ MIT Media Laboratory



Here are some notes about the talks and some of the issues that were mentioned and discussed.
Follow the 'Talks' links for abstracts of the presentations.

If you want to add a note about your ideas from the meeting, please mail
and we will paste your comments into this area for future reference.

Date Topic -------------------------Notes&Comments-------------------------
06/27/00 @5:00pm "Autonomous Helicopter Control" with Eric Feron Eric Feron presented his latest work on learning control strategies for autonomous RC helicopter piloting from real human expert pilots. This includes various aerobatics, flight tests, theory, path planning, and collision avoidance situations. The main interest is to produce a command control center for robust hybrid control for aggressive autonomous vehicle motion planning.

Details include the recovery of 12 equations of motion from 12 state variables by sampling from real trajectories (primitives) and stitching these together. In particular, such primitives include trim trajectories that occur when controls are locked, such as circling or upward spiraling. The RC pilot generates control data in the form of procession, yaw, pitch and roll in various configurations. Initial work was done to simulate the control and flight trajectory in a simulator which was consistent with real piloting. The real helicopter is now being outfitted with a computer, an inertial unit, various dampening structures, etc., which must all remain within the 6.5 pound payload to allow aerobatic flight. Real experiments will begin in the coming months and will have to deal with important variables such as varying air density, pressure and temperature, which are the chaotic elements that largely vary helicopter dynamics (compared to wind, which has a much smaller effect on helicopters than on regular planes).
06/20/00 @5:30pm "Sparse Greedy Methods for Learning" with Alex Smola Alex Smola discussed a new approach for handling large data sets in regularization networks, SVMs and especially Gaussian Process Regression. The algorithm involves using random small subsets of the data to avoid large matrix inversion. Typically, the kernel matrix can have a dimension on the order of m=100,000, making O(m x m) storage very difficult and O(m^3) inversion impossible. Bounds on the approximation error of the iterated random subset method were proposed, and the method obtains speed improvements of orders of magnitude (e.g. 10-fold). These involved approximation bounds on quadratic forms as well as computation of approximation quality and approximation rates. In addition, the matrix need not be stored explicitly. The method was demonstrated on the Abalone data set, where the age of the abalone was regressed. This technique can also be used as an alternative to PCA. Alex demonstrated its use on handwritten digit data, where the basis was computed efficiently and corresponds closely to the PCA basis.
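
As a rough illustration of why working with a small subset avoids the m x m matrix, here is a minimal sketch of kernel regression restricted to n randomly chosen basis points. It is not the algorithm from the talk: it uses a single plain random subset rather than greedy selection, and the kernel width, noise level, jitter term and function names are all assumptions made for this example.

```python
import math
import random


def rbf(x, y, width=1.0):
    """Gaussian (RBF) kernel between two scalars (width is an assumed value)."""
    return math.exp(-((x - y) ** 2) / (2.0 * width ** 2))


def solve(a, b):
    """Solve the small dense system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_on_random_subset(xs, ys, n_basis, noise=0.01, jitter=1e-6, seed=0):
    """Fit f(x) = sum_j alpha_j k(x, b_j) using n_basis randomly chosen inputs b_j."""
    rng = random.Random(seed)
    basis = rng.sample(xs, n_basis)
    # m x n block of the kernel matrix; the full m x m matrix is never formed.
    k_mn = [[rbf(x, b) for b in basis] for x in xs]
    # Reduced n x n system: (K_nm K_mn + noise * K_nn + jitter * I) alpha = K_nm y
    lhs = [[sum(row[p] * row[q] for row in k_mn)
            + noise * rbf(basis[p], basis[q])
            + (jitter if p == q else 0.0)
            for q in range(n_basis)] for p in range(n_basis)]
    rhs = [sum(row[p] * y for row, y in zip(k_mn, ys)) for p in range(n_basis)]
    return basis, solve(lhs, rhs)


def predict(x, basis, alpha):
    """Evaluate the fitted function at a new input."""
    return sum(a * rbf(x, b) for a, b in zip(alpha, basis))


xs = [i / 50.0 for i in range(200)]          # 200 inputs on [0, 4)
ys = [math.sin(x) for x in xs]               # toy regression target
basis, alpha = fit_on_random_subset(xs, ys, n_basis=15)
print(predict(2.0, basis, alpha), math.sin(2.0))
```

Storage here is O(m x n) and the linear solve is O(n^3), with n much smaller than m, which is the general kind of saving the notes describe.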
06/13/00 @5pm "Using Principled Statistical Methods to Unravel the Genetic Regulatory Networks inside Cells" with Alexander Hartemink Alex presented some novel work in the analysis of gene regulatory networks in cells. By obtaining data of various gene expression levels, it is possible to compare various graphical models that represent their dependencies. Several Bayesian networks can be proposed as directed graphs (which explicate conditional independence in the genes' activity). These are then compared by computing their evidence in a Bayesian sense. The graphs represent probability tables with various conditional independencies. By integrating over all possible parameters, it is possible to determine the evidence for a given graph's configuration. This is a principled way to compare hypothesized models and Alex showed favorable results in describing the interaction of genes in the yeast Galactose cycle. More recently, Alex has explored interactions which are monotonic, positive or negative between the genes to specify the models more accurately.

One issue that was raised was the use of quantization of the gene expression levels. It evidently introduces some noise and data loss (since it is a binary quantization) and results may change if multi-valued quantization or scalar values were used.
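
To make the idea of comparing graphs by their evidence concrete, the following sketch scores two candidate structures over binary-quantized expression levels by integrating out the conditional probability tables in closed form. The uniform Dirichlet prior, the variable names `gene_a` and `gene_b`, and the toy data are assumptions for illustration, not details from the talk.

```python
import math
from collections import Counter


def log_evidence(data, parents, arity=2):
    """Log marginal likelihood of a discrete Bayesian network structure.

    data: list of dicts mapping variable name -> quantized state.
    parents: dict mapping each variable to a tuple of its parent variables.
    Assumes a uniform Dirichlet prior on every row of every probability table.
    """
    total = 0.0
    for var, pa in parents.items():
        # Count (parent configuration, child state) co-occurrences.
        joint = Counter((tuple(row[p] for p in pa), row[var]) for row in data)
        per_config = Counter()
        for (config, _state), n in joint.items():
            per_config[config] += n
        # Closed-form integral over the table parameters for each parent configuration.
        for n_config in per_config.values():
            total += math.lgamma(arity) - math.lgamma(n_config + arity)
        for n in joint.values():
            total += math.lgamma(n + 1)
    return total


# Toy data: gene_b is on exactly when gene_a is on.
data = [{"gene_a": 0, "gene_b": 0}] * 20 + [{"gene_a": 1, "gene_b": 1}] * 20
dependent = {"gene_a": (), "gene_b": ("gene_a",)}
independent = {"gene_a": (), "gene_b": ()}
print(log_evidence(data, dependent), log_evidence(data, independent))
```

On this toy data the graph with an edge from `gene_a` to `gene_b` receives the higher evidence. Changing `arity` is one way to explore the multi-valued quantization raised in the discussion.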
06/06/00 @5pm "Humanoid Robots" with Brian Scassellati Forthcoming
05/30/00 @5pm "Mathematical Models of the Perception of Facial Expressions of Emotion" with Alain Mignault Alain Mignault described his work in collaboration with Cambridge Basic Research (Nissan), Harvard (Nancy Etcoff), MIT (Alex Pentland), and McGill (Tony Marley). Faces carry much information, and one of the primary types of information is emotional expression. Since the work of Damasio, emotions have been more widely accepted as a necessary component of decision making. In addition, facial expression recognition is feasible across cultures.

Alain stressed the use of entropy, which measures the amount of information as well as the level of consensus in a sample population. Lackamn originally discussed how entropy is related to mean response time. We also note a parabolic relation between similarity and response time as well as a tighter relation with entropy. Also, in human subjects, a linear relation is noted between response time and entropy in categorization of images.
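
As a small worked example of entropy as a consensus measure, the sketch below computes the Shannon entropy of the labels a group of subjects assigns to one image. The label names and group sizes are made up for illustration.

```python
import math
from collections import Counter


def response_entropy(labels):
    """Shannon entropy (in bits) of the category labels given to one stimulus."""
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


full_consensus = ["happy"] * 8              # every subject agrees
split_vote = ["happy", "surprised"] * 4     # subjects split evenly
print(response_entropy(full_consensus), response_entropy(split_vote))
```

Full agreement gives zero entropy, and an even split between two labels gives one bit, so higher entropy corresponds to lower consensus.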

Alain models classification of facial expression imagery using PCA (Kohonen, Anderson, 1977) and a neural network. This architecture is used to classify expressions and has a response channel model. Classification achieves 70% accuracy. Alain also models similarity measures with a neural network that maps eigendistances to a human scale (1-8). Both architectures exhibit human-like responses and show similar relations between response times, entropy and similarity levels.
05/09/00 @5pm "Relating Human Actions and Intentions: A Look at Eye Movements" with Dario Salvucci Dario Salvucci discussed the use of Hidden Markov Models and the ACT-R framework for characterizing eye movements. One scenario is equation analysis, where the eye movements are tracked as they foveate on different components of the equation, revealing the patterns of the observer. Another scenario is eye-typing, where the eye location is used to type on a keyboard. The use of a hidden Markov model allows faster and more reliable prediction of the letter that is being foveated. Dario is interested in combining the flexibility of HMMs and their ability to process data with the power of the ACT-R framework and its ability to represent domain knowledge.

For more details, Dario's slides are available online in Powerpoint form:

04/18/00 @5pm "Summary of the Wired Kingdom Media Lab Event" with Behavior Group This meeting was an overview of the Wired Kingdom Symposium that was presented on April 17th, 2000. Various behavior group folks discussed ideas that emerged and proposed future directions to pursue.

Some highlights from the workshop included Diana Reiss' work with dolphins that acquire a simple vocabulary to describe objects they play with. Several stages of acquisition were identified: initially learning the acoustic word's ending, then its beginning, then the spectrogram of the word and finally mimicking the word itself. The dolphins start mimicking the sounds and then use the sounds when playing with the appropriate objects. Words are also concatenated when the dolphins play with 2 objects simultaneously.

Peter Narins discussed the golden mole in the Namibian desert, which uses seismic sensing to travel from one mound to another and avoid flat desert (the mounds contain most of the biomass and food for the mole).

Daniel Zatz showed his automatic remote camera systems for filming wildlife non-intrusively.

Possible collaborations with ethologists were then discussed. For instance, building robotic versions of the animals, or electronic interfaces for them, as well as building sensors and sensor controls to observe them.

The workshop itself is at
04/11/00 @5pm "Report on Workshop on Development and Learning (WDL'2000)" with Tony Jebara Tony reviewed the Workshop on Development and Learning that was recently held at Michigan State University. It united various prominent members of several fields: learning, robotics, developmental psychology, and neuroscience. The emphasis was to describe learning as a developmental process where a system (like a robot or child) would self-program and learn while it interacts with the world. Various issues were raised, including arguments on innateness vs. experience, modularity, perception, sensing, embodiment and grounding.

Developmental psychologists showed results on children's motor skill acquisition, dynamic system modeling and scalloping phenomena in learning. Networks and learning specialists discussed some principles including temporal proximity, audio-visual learning, regularization theory, Darwinian model selection and reentrant maps. Neuroscientists presented work on plasticity in rewiring cortex (e.g. directing retina to auditory cortex), on hippocampus short term memory formation and on Hebbian learning. Roboticists discussed control and reinforcement hybrids, limiting degrees of freedom, linking perception to action and social interaction in robots.

Powerpoint slides are available at .

The workshop itself is at
04/04/00 @5:00pm "Song Learning in Birds" with Don Kroodsma In his talk Don discussed song learning in different varieties of wrens, in particular the eastern and western varieties of marsh wren. Based on his observations, these two varieties, which are virtually impossible to tell apart by their appearance, differ in their songs and behavior. Typically, eastern wrens show lower plasticity in the song learning process, learning about a third as many songs in their lifetime as their western counterparts. Experimental data shows that
a) The variety of the song repertoire seems related to mating choice. Females tend to choose the male with the largest song repertoire.
b) When mating outside their social group, a female usually chooses a male of a higher order in the hierarchy.

Don showed that the sequence of a wren's songs has a high degree of regularity.
a) In some birds it is almost deterministic.
b) There seems to be some sort of concerted parametric variation in the sequence. If a song A is followed by a song B, and then the bird sings a slower version of A, it will be followed by a slower variation of B.

Don showed differences in song learning between two types of wrens: sedge wrens and marsh wrens. Marsh wrens seem to be good improvisers. Presented with 10 songs they have never heard before, they learn up to 60 new songs by adding their improvisations to the collection. They also have local dialects, whereas sedge wrens show neither of these qualities.

Dialect learning is very prominent in marsh wrens. Moving into a new community they often reject what they have learned so far completely and learn the whole new set of local songs.

In the discussion people hypothesised that information is conveyed not only by each song separately, but also by the sequence as a whole.

Bruce asked why people don't talk as much about vocalization learning in mammals.

03/28/00 @1:30pm "Making Reinforcement Learning Work with Real Robots" with Leslie Pack Kaelbling Leslie Kaelbling discussed various issues with real robotic Q-learning. Some shortcomings that were pointed out were:

1) Function approximation. Many techniques have to approximate the Q-function or value function due to intractably large state spaces. However, function approximation tools are slow and fragile.
a) Neural networks and other function approximators assume that IID samples are given but in Q-learning, this is not the case due to the temporal nature of reinforcement learning.
b) Often one would like a one-shot learner that does not require many data points and iterations.
c) The square error metric on the neural net is not tied into the reinforcement signal.
d) The function being learned may not necessarily be stationary.
Leslie described a k-nearest-neighbour approach where a linear fit is estimated from k-neighbours as long as the current query is within the convex hull of the k-nearest-neighbours. Otherwise, a prior is used, thus preventing extrapolation blow-up.
Sandy Pentland pointed out the possibility of doing dimensionality reduction instead of traditional function approximation.

2) Errors propagate in function approximation into the Q-learning stage and then back again into function approximation yielding a compounding of instability and error.

3) Random walks are bad in Q-learning since they don't necessarily explore the space. Leslie proposes having one conservative exploratory Q-learner actively choosing the policy while another learns passively until it is able to assume control.

4) Q-learning is slow to propagate reward due to the Markov assumption. If reward is forced to propagate back more than one state, it may lead to superstitious behavior.
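
The one-step backup behind point 4 can be sketched with tabular Q-learning on a toy chain of states, where reward is given only at the far end. The environment, the parameter values and the function name are assumptions chosen for illustration and are unrelated to the robots discussed in the talk.

```python
import random


def q_learning_chain(n_states=5, episodes=200, alpha=0.5, gamma=0.9,
                     epsilon=0.2, seed=0):
    """Tabular Q-learning on a chain; reward 1 only on reaching the last state."""
    rng = random.Random(seed)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]   # q[state][action]; 0 = left, 1 = right
    for _ in range(episodes):
        s = 0
        while s != goal:
            if rng.random() < epsilon:
                a = rng.randrange(2)             # explore with a random action
            else:
                best = max(q[s])                 # greedy, breaking ties at random
                a = rng.choice([act for act in (0, 1) if q[s][act] == best])
            s_next = max(0, s - 1) if a == 0 else s + 1
            reward = 1.0 if s_next == goal else 0.0
            # One-step backup: value information moves back one state per update.
            target = reward + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q


q = q_learning_chain()
for state, (left, right) in enumerate(q):
    print(state, round(left, 3), round(right, 3))
```

Because each update only looks one state ahead, the state next to the goal learns its value first and states further away only catch up over later visits, which is the slow propagation the notes mention.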

03/14/00 @4:15pm "Categorical Organization and Machine Perception of Oscillatory Motion Patterns" with Jim Davis Jim Davis gave his thesis defense describing a system for modeling and recognizing various oscillatory motions that arise in humans and animals. Several types of motions were shown (circular, U-shuttle, figure-8), along with a hierarchical categorical structure that relates them all and identifies their various degrees of complexity. The various motions can be spanned by parametrizing sinusoids in highly structured ways (i.e. amplitudes, frequencies and phase shifts with only a few possible parametric settings). A Fourier analysis is used to recover the parameters from video data for inference and classification. Results are shown on various types of motions, including face motion, human body motion, bird motions and other animal motions.
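
As a minimal sketch of recovering sinusoid parameters with a Fourier analysis, the code below takes one coordinate of a tracked trajectory and reads the amplitude, frequency and phase off its strongest discrete Fourier component. The sampling rate, the synthetic signal and the function name are assumptions; the actual system in the thesis may differ.

```python
import cmath
import math


def dominant_sinusoid(samples, sample_rate):
    """Return (amplitude, frequency in Hz, phase) of the strongest sinusoid.

    The phase is that of a cosine, i.e. x(t) ~ amplitude * cos(2*pi*f*t + phase).
    """
    n = len(samples)
    mean = sum(samples) / n
    best_k, best_coeff = 0, 0j
    for k in range(1, n // 2):
        # Plain discrete Fourier transform coefficient for frequency bin k.
        coeff = sum((x - mean) * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(samples))
        if abs(coeff) > abs(best_coeff):
            best_k, best_coeff = k, coeff
    amplitude = 2.0 * abs(best_coeff) / n
    frequency = best_k * sample_rate / n
    phase = cmath.phase(best_coeff)
    return amplitude, frequency, phase


sample_rate = 32.0  # assumed frames per second
samples = [1.5 * math.cos(2 * math.pi * 4.0 * (i / sample_rate) + 0.5)
           for i in range(64)]
amplitude, frequency, phase = dominant_sinusoid(samples, sample_rate)
print(amplitude, frequency, phase)
```

Applying the same function to the x and y coordinates of a point and comparing the recovered phases is one way to tell motions apart; for example, circular motion would show equal frequencies with a quarter-cycle phase shift between the two coordinates.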

03/7/00 @5:00pm "Structures and Hierarchies in Bird Memory" with Brian Clarkson Brian Clarkson presented a paper by Dietmar Todt and Henrike Hultsch called "How songbirds deal with large amounts of serial information: retrieval rules suggest a hierarchical song memory". The paper appears in Biological Cybernetics, issue 79 pages 487-500 (1998). Brian started with some background and motivation for the work. His interests are in context modeling via audio-visual features on a wearable computer. The sensor data is modeled as a hierarchical hidden Markov model which learns to automatically segment events and scenes from data (i.e. Brian walking around Cambridge for a day). This hierarchical HMM is closely related to the model that the authors claim is being used by Nightingales in their song learning.

Several stages in the bird's life are outlined, from birth to early song acquisition until the bird goes through one full year (one full migratory cycle). One important phenomenon is that nightingales are best at learning strings that contain 20-60 songs. Longer string lengths require more repetitions before they can be acquired. In addition, only 75% of the song-types can be imitated if they are heard 15 times, so acquisition is not a perfect process. Song structure was also pointed out, being split into alpha, beta, gamma and omega components which play different roles and are manipulated differently by the nightingales. In addition, the birds learn from a Master sequence of 60 songs (which imposes an ordering on the songs) and then reinforce each song through repetitions (not necessarily in the string order of the Master sequence). However, the way the birds subsequently generate the songs reveals chunking, where temporally proximal songs are grouped into chunks of 3-6 (packages). Depending on the context (i.e. feeding, adversarial, etc.), structures in the transition matrix between songs can be seen and different packagings exist. The packaging type reveals an intentional communication by the birds that depends on context. In addition, the chunking in the transition matrix reveals a hierarchical stage of super-states that transition between packages. Curiously, this is very similar to the hierarchical HMM Brian has been using.
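
The chunking visible in a transition matrix can be illustrated with a small sketch that estimates song-to-song transition probabilities from a sung sequence. The song labels and the toy sequence are invented for this example and are not data from the paper.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequence):
    """Estimate first-order transition probabilities between consecutive songs."""
    counts = defaultdict(Counter)
    for current, following in zip(sequence, sequence[1:]):
        counts[current][following] += 1
    return {song: {nxt: n / sum(row.values()) for nxt, n in row.items()}
            for song, row in counts.items()}


# Toy sequence built from two packages, (A, B, C) and (D, E, F).
sung = list("ABCABCDEFDEFABC")
probs = transition_probabilities(sung)
print(probs["A"])   # within a package the next song is predictable
print(probs["C"])   # at a package boundary the next song varies
```

Transitions inside a package are close to deterministic, while transitions at package boundaries are spread over several successors. That block structure is what suggests a higher level of super-states switching between packages.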

Brian's slides are available at:
HTML form:
Powerpoint form:

02/29/00 @5:00pm "Various Views on Classical and Operant Conditioning" with Yuri Ivanov Yuri Ivanov presented an overview of classical and operant conditioning and different perspectives and computational approaches. A number of phenomena were outlined, including blocking, and Selectionist views were described, as well as various models of conditioning and selection. These included the Rescorla-Wagner model, the Sutton-Barto model, Temporal Differencing, statistical models, neural network models and other reinforcement learning models. Issues of temporal representation, discretization, the ability to obtain insight into what the model was doing, etc. were brought up. To see the slides online, visit:.

Yuri, Bruce, Irene and others discussed how to represent time, and whether one should simply ignore intervals as in the Rescorla-Wagner model. Bruce Blumberg suggested that various timing issues must be considered to reflect how behavior is learned and unlearned in real animals. As opposed to purely associative learning frameworks, there is a necessity for temporal modeling (i.e. dependence on temporal intervals as well as rates of event occurrence). He pointed out a relevant paper that is forthcoming in Psychological Review this April called "Time, Rate and Conditioning" by C. R. Gallistel and John Gibbon.
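
For reference, the standard Rescorla-Wagner update can be sketched in a few lines, which also shows the blocking phenomenon and the fact that only trial order, not timing, enters the model. The cue names, learning rate and trial counts are assumptions for illustration.

```python
def rescorla_wagner(trials, learning_rate=0.2, lam=1.0):
    """Run the Rescorla-Wagner update over a list of (cues, reinforced) trials."""
    strengths = {}
    for cues, reinforced in trials:
        # The prediction is the summed strength of all cues present on the trial.
        prediction = sum(strengths.get(c, 0.0) for c in cues)
        target = lam if reinforced else 0.0
        for c in cues:
            strengths[c] = strengths.get(c, 0.0) + learning_rate * (target - prediction)
    return strengths


# Blocking: the light is trained alone first, then light and tone together.
blocked = rescorla_wagner([(("light",), True)] * 50 +
                          [(("light", "tone"), True)] * 50)
# Control: light and tone are trained together from the start.
control = rescorla_wagner([(("light", "tone"), True)] * 50)
print(blocked["tone"], control["tone"])
```

In the blocking condition the tone gains almost no associative strength because the light already predicts the reward. The trials carry no durations or inter-trial intervals, which is the limitation raised in the discussion.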

02/22/00 @5:00pm "Affective Synthetic Characters" with Song-Yee Yoon Song-Yee Yoon gave a preview of her PhD defense talk that will be held on February 23rd. Her PhD is a joint one with the MIT Media Lab (with Bruce Blumberg) and Brain and Cognitive Sciences. Song-Yee presented a creature kernel for generating synthetic behavior as well as learning with reinforcement. Some ideas included hierarchical feature selection and hierarchical probabilistic behavior systems.

The applications of her work were in (void *), a cast of characters. Here, multiple characters respond to dancing control from tracking in a bread-fork type of joystick controller. The behaviors of 3 different types of characters were portrayed, which had different biases for their choice of actions and their responses to the user. A study showed that users could discern the intended personalities of the characters from their visual interaction and behavior style. Another application was K9.0, a synthetic dog character which is trained using clicker training in a virtual world. The dog is given auditory commands (using speech recognition) and then reinforced with a clicker sound and food. This interface to its virtual world allows it to update its behavior probabilities and adapt to the auditory commands.

02/15/00 @5:00pm "Plasticity of Rewired Neocortical Circuits" with Tony Jebara A paper was presented at this meeting, namely: "Rewiring Cortex: The Role of Patterned Activity in Development and Plasticity of Neocortical Circuits", M. Sur, A. Angelucci and J. Sharma, Journal of Neurobiology, 41: 33-43, October 1999. The experiments involved deflecting the optic pathways in ferrets to the A1 (audio) region of the cortex instead of their usual V1 (visual) region. Subsequently, the structures (stripes, hypercolumns, orientation sensitive cells, ocular dominance patterns, etc.) begin to develop in the A1 region, which adapts to the optic types of signals. Thus, the innate audio structures are changed and develop as visual centers.

This paper has implications as far as computational learning is concerned. How do we achieve such flexibility in our learning systems? Most speech recognizers are very different in spirit and implementation from computer vision techniques. How can the brain reuse the machinery it has to make audio units process video? Evidently, instead of being a task-oriented engineered system the brain must have some higher order self-organization principles that are responsible for this adaptive power. What conclusions can be drawn about the way we implement and engineer learning systems?

02/08/00 @5:00pm "Behavior Course and Some General Ideas" with Behavior Workgroup In this meeting, Bruce Blumberg and Irene Pepperberg discussed the new course they are offering this term at the Media Lab (MAS-965 which meets Thursdays at 1pm in E15-335). An overview of the course itinerary and some ideas were discussed.

We then had a general talk about the developmental workshop and the issues that were up for discussion there. Imitation learning in machines and animals was mentioned, along with some results from ethology (including imitation in parrots and apes). Some of the computational implementations included the action-reaction learning paradigm. In addition, we noted the recent neuroscience results on imitation learning, with fMRI measurements confirming the activity in human beings (see: "Cortical Mechanisms of Human Imitation" by Marco Iacoboni, Roger P. Woods, Marcel Brass, Harold Bekkering, John C. Mazziotta, Giacomo Rizzolatti. In Science, Vol 286, December 1999.).

01/20/00 @4:00pm "Situational Awareness and the Facilitator Room" with Pentlandians Folks In this meeting, a few agenda items were discussed. First, Kevin Davis from facilities was present to formalize changes to the picture tel room to make it into an augmented "Facilitator Room" with full teleconferencing, sensors and output projectors. Layout, furniture, architecture, sensors, carpeting etc. as well as mounting issues were discussed (i.e. AD20 ceiling mounts for easy installation of equipment). In addition, sound proofing and other such issues were brought up. The room will have 3 couches surrounding a small table which will be projected upon with an overhead projector. In addition, smart white boards, cameras and microphones will be placed to track the activities of the users.

Subsequently, 3 papers were presented that are related to the above effort and that were to be submitted to the CHI (Conference on Computer-Human Interaction) workshop entitled: "Situated Interaction in Ubiquitous Computing". The 3 projects that were described were:

i) "Memory Glasses: Wearable Audio-Visual Event Tagging" with Brian Clarkson and Richard DeVaul

ii) "The Facilitator: An Experiment in Computer-Based Mediation"
with Sumit Basu

iii) "Conversational Context Learning for Machine Augmentation of Human Discourse" with Tony Jebara, Yuri Ivanov and Ali Rahimi

Essentially, all projects deal with recovering audio-visual information. For instance, i) deals with ambient audio and video, ii) deals with speaker identification and iii) deals with word/topic modeling and face detection. These sensing modalities allow the systems to track the current context and give feedback to the user(s) in real-time. For instance, they can serve as remembrance aids, moderate how much a person talks in a meeting, or encourage discussion in meetings by giving conversational cues.

12/16/99 @1:30pm "Synthetic Characters" with Bruce Blumberg Bruce presented his current work with the synthetic characters group. Special focus was on dogs and ethology. They only have lemon-sized brains yet we love them and they can interact with us and the world quite skillfully. Current projects such as Void *, Squish, Sydney K9.0, etc. were showcased.

In Void *, puppets learn and display attitudes and personalities which change dynamically. Sydney K9.0 has an integrated model of emotion, motivation and motor control. In addition, clicker training was discussed, where a fast event before a reward (i.e. food) is used to reinforce desired behavior and also temporally marks the end of the correct behavior. In real animals a vocalisation such as "too bad" can be used for non-reward. The systems demonstrated also utilize a "lure", which is a training stick that the virtual character tracks to shape the behavior and encourage rolling over or sitting so that reinforcement learning is accelerated.

As Birks points out, dogs listen and sense things quite differently. Olfaction is critical and skin flakes are more easily detected than nearby immobile objects (especially in certain types of dogs, terriers or labradors). Finally, Bruce demonstrated the Sony Aibo dog, a robotic toy with various actuators and simple sensors, and discussed the importance of toys that learn.

11/24/99 @2pm "Discriminative and Conditional Learning" with Tony Jebara Tony presented current work with Tommi Jaakkola and Marina Meila on using more effective discriminative criteria (i.e. maximum margin) in machine learning while maintaining the richness of the models that are currently used in the Bayesian / ML community.

The maximum entropy formalism is extended to allow for discriminative criteria and consequently permits maximum margin estimation of exponential family distributions. The constraints are satisfied while permitting the use of priors on the parameters, prior margin distribution and priors on missing labels. The formalism also permits anomaly detection and the estimation of Bayes net structures instead of just parameters.

Some related work on the CEM algorithm which is a discriminative version of the EM algorithm was also discussed.

11/11/99 @4pm "Visual Models of Interaction" with David Hogg David presented his latest work on recovering and synthesizing interactive visual behavior. The modalities span pedestrian walking behavior, face animation and body contour animation. Applications include: Anomaly detection, Path prediction, Occlusion reasoning, and Virtual Interaction.

The techniques explored initially were leaky neural networks, but these were replaced by a more elegant condensation-based method (using the work of Isard and Blake). This allows the method to cope with tracking uncertainties. The method learns the probability p(delta into future | past) using a 100-Gaussian mixture model with EM. A video corpus of 5 hours of pedestrian behavior was used.

Animations can be simulated by sampling p(delta into future | past) using a 10 second history typically. A VRML animated avatar was shown walking along the paths on the street. In addition, animated chess pieces walk through David's living room model following the path of people walking in the house.

For face tracking, the University of Manchester face tracker is used. It uses eigenfaces and control points (80 parameters total). David is working on feeding in the behavior predictions into the tracker.

Chris Wren noted that condensation requires sampling many points to avoid the limitations of standard Kalman filtering. Others (i.e. INRIA) propagate mixtures and do multiple hypothesis testing. There could be a continuum between the techniques.

Future work is using Mixtures in a Markov chain.
11/4/99 @3:45pm "Parrot Behavior" by Irene Pepperberg For a summary of the talk, please see the "Talks" link.

Irene discussed the interactive parrot toys project. Parrots will display unhealthy behavior (plucking their feathers, throwing fits, etc.) if they do not get interaction from their owners.

Possible modalities:
      Computerized teaching of the parrot
      Touch screens
      RF tags on objects (colored keys, toys, etc.)
      Audio recognition
      Parrots don't pay attention to screens, possibly due to the 60Hz refresh rate.

Reinforcement is difficult with food since parrots need to be starved for food to be a reward and this leads to anorexic behavior and does not work as well as with other birds.
10/28/99 @4pm "Driving Behavior" by Betty Lou Mcclanahan The following new people joined the meeting:
*Manned Vehicle Lab Folks:
Chuck Oman (TCASS)
      Air traffic control
      Training several sub-tasks vs. holistic -> trainee develops bad habits
Andy Liu (space station)
Scott Rasmussen (modeling & predicting helicopter flight)
      higher order goals manoeuvers (roll, acrobatics, barrel)
      training pilots, using HMMs
*Media Lab Folks:
Betty Lou Mcclanahan (theory & racing)
      Display in car is not helpful for training & performance
      Display in rear view mirror for focus at infinity
David Hogg (tracking, interaction)

Betty Lou Mcclanahan presented her work on driving, with emphasis on vehicle control (VC).
Sandy Pentland discussed
      Interactive systems,
      Intentionality by Grice (philosopher) and 0,1,2nd order Intentionality
      0-take your cup
      1-take your cup, know you get mad
      2-lying, take your cup, trick you to drink
      equivalently, modeling a driver with different order models:
      0-car sees driver as steering wheel
      1-car sees driver as state machine
      2-car sees state machine+ideal driver
            e.g. Reinforcement to make you look where you drive
            e.g. Reinforcement case where lecturer goes to right after audience feedback
09/30/99 @4pm "Developmental Machines" by Juyang Weng For a summary of the talk, please see the "Abstracts" link.

Lee Campbell noted that some aspects of the unsupervised learning were actually supervised, in that a teacher needs to be present to show the robot (i.e. SAIL) what actions are performed in response to the sensed input. However, the emphasis is that this task is not programmed into the machine a priori by a supervising programmer but rather that the supervision comes in only from natural interaction with a real-world teacher.