TR#492: Towards music understanding without separation: Segmenting music with correlogram comodulation

Eric D. Scheirer

Submitted to the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, Mohonk, NY

The application of a new technique for sound-scene analysis to the segmentation of complex musical signals is presented. This technique operates by discovering common modulation behavior among groups of frequency subbands in the autocorrelogram domain. The algorithm can be demonstrated to locate perceptual events in time and frequency when it is executed on ecological music examples taken directly from compact disc recordings. It operates within a strict probabilistic framework, which makes it convenient to incorporate into a larger signal-understanding test-bed. Only within-channel dynamic signal behavior is used to locate events; therefore, the model stands as a theoretical alternative to methods that use pitch as their primary grouping cue. This segmentation algorithm is one processing element to be included in the construction of music perception systems that understand sound without attempting to separate it into components.
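
As a minimal illustration of the representation named above, the following sketch computes one frame of an autocorrelogram in its generic textbook form: a short-time autocorrelation of each frequency subband, evaluated over a range of lags. It is not the implementation described in this report. The function names `autocorrelogram_frame` and `dominant_lag`, the half-wave rectification step, and the synthetic sinusoidal subbands are all assumptions made for illustration. The cochlear filterbank that would normally produce the subbands is omitted, as is the comodulation analysis that groups channels.

```python
import math


def autocorrelogram_frame(subbands, max_lag):
    """Return one correlogram frame: one row per channel, one column per lag."""
    frame = []
    for channel in subbands:
        # Half-wave rectification: an assumed, crude stand-in for
        # hair-cell transduction in an auditory front end.
        x = [max(s, 0.0) for s in channel]
        n = len(x)
        # Unnormalized autocorrelation for lags 0..max_lag.
        row = [
            sum(x[t] * x[t + lag] for t in range(n - lag))
            for lag in range(max_lag + 1)
        ]
        frame.append(row)
    return frame


def dominant_lag(row, min_lag=1):
    """Lag (>= min_lag) with the largest autocorrelation value in one channel."""
    return max(range(min_lag, len(row)), key=lambda lag: row[lag])


# Hypothetical subband signals: sinusoids with periods of 20 and 25 samples.
subbands = [
    [math.sin(2 * math.pi * t / 20) for t in range(400)],
    [math.sin(2 * math.pi * t / 25) for t in range(400)],
]

frame = autocorrelogram_frame(subbands, max_lag=30)
periods = [dominant_lag(row, min_lag=5) for row in frame]
print(periods)  # [20, 25]
```

Each row of `frame` peaks at the lag matching the periodicity of its channel. Tracking how such peaks move and change in strength from one frame to the next gives the kind of within-channel dynamic behavior that the abstract describes as the grouping cue.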