7.2 Biggest Lessons

The most important lesson I learned from this project is that humans, whether trained or untrained, skilled or unskilled, are able to internalize a great deal of expressive information without being consciously aware of it. For example, when I watched the videotape of one of my conductor subjects uncritically, I would not know how to describe the information in the gestures. However, when I slowed the video down to the level of the individual frame and compared it with the high-resolution physiological data, the structure would often become clear. The moment of a beat, not always obvious from the videotape, is immediately obvious from the right biceps EMG signal. The amount of emphasis in the beat, not always proportional to its velocity, is nonetheless clear in the muscle tension signal. Some people have questioned whether it is possible to "see" the physiological effects that I have described in this thesis, particularly muscle tension. I agree that it remains an open question whether the musicians in an orchestra are able to perceive and respond to physiological changes in the conductor. However, based on the findings of this study, I propose that people are, indeed, naturally sensitive to small changes in muscle tension. The tensing of a muscle and its resultant effects on things such as the force and momentum of an arm are visually perceivable. These aspects are nonetheless very difficult to quantify or express in language, because they involve complex changes that happen continually over very small time intervals. The physiological sensors are able to capture this information and express it as a signal; scrutinizing these data out of real-time gives us insight into the structure in a way that purely visual observation could never do.

Secondly, of the six types of signals collected from our conductor subjects (EMG, Respiration, GSR, Temperature, Heart Rate, Position), the most significant results appeared to come from the volitional signals. That is, the signals that are under purposeful control (and of which the subject is naturally aware) tend to be the ones with the greatest information content. Physiological signals that are not volitional, such as GSR, temperature, and heart rate, did not consistently correlate with the music. The respiration signals have an extremely interesting and complex relationship to the music, and for that reason remain challenging to write filters for. The features in the EMG signals tended to be much more meaningful, and therefore real-time systems in the near future will be able to make the greatest use of the EMG sensors.

Finally, I’ve learned that the intuitiveness and naturalness of a mapping has mostly to do with the audience’s perception, and not with the performer’s ease of use. That is, a performer can quickly train on a system, even a difficult one, to the point where the mapping and gestures feel normal and natural. But it is the audience that ultimately decides whether a mapping ‘works’. If the audience is confused about the relationship between the gesture and the music, then the mapping does not work. If people feel a cognitive dissonance between their expectations and the response, then the mapping does not work. If they suspect that the performer is merely miming to the sound of the music, then the mapping does not work. However, the answer is not always to ‘prove’ that the performer is indeed controlling the sound. For example, in my Gesture Construction software, I purposefully made the mapping ‘direct-drive’ on the beat level, so that when I stopped beating, the music stopped playing. This, I felt, would convince people of the connection between my gesture and the sound. However, it is profoundly unmusical to stop a piece in the middle just to show that you can. I never quite found a way to prove that I was controlling the beats in a musical way; the problem was compounded by the fact that my beat-tracker was too sensitive and would frequently double-trigger. The whole effect can quickly deteriorate into a comedic battle with an invisible enemy, alternating between losing and asserting control over a system that at times is perfectly synchronized with the gestures and at other times is completely off the mark.
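The double-triggering problem mentioned above can be illustrated with a small sketch. This is not the beat-tracker used in the Gesture Construction software; it assumes, purely for illustration, that beats are detected as upward threshold crossings of a smoothed EMG envelope, and that a refractory window is one possible way to suppress a second trigger. The function name, the threshold, the sample rate, and the synthetic envelope are all hypothetical.

```python
def detect_beats(envelope, sample_rate_hz, threshold, refractory_s):
    """Return sample indices of upward threshold crossings.

    Crossings that occur less than `refractory_s` seconds after the
    last accepted beat are ignored (hypothetical debouncing scheme).
    """
    refractory_samples = round(refractory_s * sample_rate_hz)
    beats = []
    last_beat = None
    above = False
    for i, value in enumerate(envelope):
        rising = value >= threshold and not above
        above = value >= threshold
        if rising and (last_beat is None or i - last_beat >= refractory_samples):
            beats.append(i)
            last_beat = i
    return beats


# Synthetic envelope (made-up values): two intended beats, the first of
# which dips briefly below the threshold and crosses it a second time.
envelope = [0.0, 0.1, 0.8, 0.4, 0.9, 0.2, 0.0, 0.0,
            0.0, 0.0, 0.1, 0.9, 0.7, 0.1, 0.0]
SAMPLE_RATE_HZ = 10  # assumed value for the example only

# With no refractory window, the dip produces a spurious extra beat.
sensitive = detect_beats(envelope, SAMPLE_RATE_HZ, threshold=0.5, refractory_s=0.0)

# With a 0.3 s refractory window, the second crossing is suppressed.
debounced = detect_beats(envelope, SAMPLE_RATE_HZ, threshold=0.5, refractory_s=0.3)

print(sensitive)
print(debounced)
```

The trade-off in such a scheme is that a longer refractory window suppresses more false triggers but also limits the fastest tempo the detector can follow, so the window would have to be tuned against the music being performed.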
