MUMT 621: Music Information Acquisition, Preservation, and Retrieval [course page]
28 February 2012 :: Slide-based Presentation II
Hannah Robertson [home page]
Gaussian Mixture Models
- Heittola, Klapuri, and Virtanen. 2009. "Musical instrument recognition in polyphonic audio using source-filter model for sound separation." Proceedings of the 10th International Conference on Music Information Retrieval.
In this paper, GMMs are used to model the densities of MFCC features extracted from polyphonic audio samples. These modeled densities are then used to predict which combination of instruments produced the sound. The instruments used were accordion, bassoon, clarinet, contrabass, electric bass, electric guitar, electric piano, flute, guitar, harmonica, horn, oboe, piano, piccolo, recorder, saxophone, trombone, trumpet, and tuba. The system was trained on separated monophonic instrument samples and tested under several different polyphonic signal conditions.
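The general idea of classifying with per-class GMM densities can be sketched as follows. This is a toy illustration only, not the authors' system: it uses a single one-dimensional feature in place of multi-dimensional MFCC vectors, skips the source-filter separation step entirely, and the model parameters, the `MODELS` dictionary, and the function names are all hypothetical values made up for the example.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gmm_log_likelihood(frames, gmm):
    """Sum of per-frame log densities under a 1-D GMM.

    `gmm` is a list of (weight, mean, variance) tuples.
    """
    total = 0.0
    for x in frames:
        density = sum(w * gaussian_pdf(x, m, v) for w, m, v in gmm)
        total += math.log(max(density, 1e-300))  # floor avoids log(0)
    return total


def classify(frames, models):
    """Return the label whose GMM gives the frames the highest log-likelihood."""
    return max(models, key=lambda name: gmm_log_likelihood(frames, models[name]))


# Hypothetical, hand-picked models; in practice these would be fit to
# training features with EM rather than written down by hand.
MODELS = {
    "flute": [(0.6, 2.0, 0.5), (0.4, 4.0, 1.0)],
    "tuba": [(0.5, -3.0, 0.8), (0.5, -1.0, 0.6)],
}

frames = [1.8, 2.4, 3.9, 4.2, 2.1]
print(classify(frames, MODELS))
```

Each class gets its own mixture, and a test excerpt is assigned to the class whose mixture explains its frames best.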
- Jensen, Ellis, Christensen, and Jensen. 2007. "Evaluation of distance measures between Gaussian mixture models of MFCCs." Proceedings of the 8th International Conference on Music Information Retrieval.
This paper compares various ways of measuring the distance between Gaussian mixture models of MFCCs, in the context of genre classification. It is most useful for readers who already understand GMMs and their application in MIR to genre/MFCC clustering.
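As one illustration of what a distance between two GMMs can look like, the sketch below estimates a symmetrized Kullback-Leibler divergence by sampling, since the KL divergence between mixtures has no closed form. This is an assumption chosen for the example rather than a summary of the measures the paper evaluates; the one-dimensional models and all names here are hypothetical.

```python
import math
import random


def gmm_pdf(x, gmm):
    """Density of a 1-D GMM given as a list of (weight, mean, variance) tuples."""
    return sum(
        w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2.0 * math.pi * v)
        for w, m, v in gmm
    )


def gmm_sample(gmm, rng):
    """Draw one sample: pick a component by weight, then sample its Gaussian."""
    weights = [w for w, _, _ in gmm]
    _, mean, var = rng.choices(gmm, weights=weights, k=1)[0]
    return rng.gauss(mean, math.sqrt(var))


def kl_monte_carlo(p, q, n_samples=5000, seed=0):
    """Monte Carlo estimate of KL(p || q) using samples drawn from p."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = gmm_sample(p, rng)
        total += math.log(max(gmm_pdf(x, p), 1e-300))
        total -= math.log(max(gmm_pdf(x, q), 1e-300))
    return total / n_samples


def symmetric_kl(p, q, n_samples=5000, seed=0):
    """Symmetrized KL: KL(p || q) + KL(q || p)."""
    return kl_monte_carlo(p, q, n_samples, seed) + kl_monte_carlo(q, p, n_samples, seed)


# Hypothetical song models: b is a slight shift of a, c is far from both.
song_a = [(0.5, 0.0, 1.0), (0.5, 3.0, 1.0)]
song_b = [(0.5, 0.5, 1.0), (0.5, 3.5, 1.0)]
song_c = [(0.5, 6.0, 1.0), (0.5, 9.0, 1.0)]

print(symmetric_kl(song_a, song_b) < symmetric_kl(song_a, song_c))
```

A smaller value means the two models describe more similar feature distributions, which is what a nearest-neighbour genre classifier would rely on.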
- Marolt. 2004. "Gaussian mixture models for extraction of melodic lines from audio recordings." Proceedings of the 5th International Conference on Music Information Retrieval.
In this paper, Gaussian mixture models are applied twice, first to find and then to group melodic lines in audio recordings. First, EM is used to find all melodic fragments in a recording by looking for regions with strong and stable pitch. The dominance, pitch, loudness, pitch stability, and onset steepness of these fragments are then used to cluster them by source: vocal, backup vocal, bass, noise, etc. The results showed success at grouping the lead melodic line, but less success with the secondary lines (including noise).
- Marques and Moreno. 1999. "A study of musical instrument classification using Gaussian mixture models and support vector machines." Cambridge Research Laboratory.
In this paper, GMMs are used to classify 0.2-second solo sound segments by instrument source. The instruments classified were bagpipes, clarinet, flute, harpsichord, organ, piano, trombone, and violin. The classifiers had a 70% success rate. The ultimate goal of this work was to automatically annotate and search files that include sound (audio and video).
- Moore, Andrew. 2012. "Gaussian Mixture Models (Tutorial Slides)." Accessed February 2012. http://www.autonlab.org/tutorials/gmm.html.
This set of slides by Andrew Moore provides a tutorial introduction to the concepts of clustering, clustering using Gaussian mixtures, and the expectation-maximization (EM) optimization method. Although the slides contain a fair number of equations, several of the images help clarify how clustering works.
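For readers who prefer code to equations, a minimal sketch of EM for a one-dimensional Gaussian mixture is given below. It is not taken from the slides: the function name `em_gmm_1d`, the synthetic data, the starting means, and the fixed iteration count are all assumptions made for illustration.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, means, n_iter=50):
    """Fit a 1-D GMM with EM, starting from the given initial means."""
    k = len(means)
    means = list(means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            joint = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(joint) or 1e-300  # guard against division by zero
            resp.append([p / total for p in joint])
        # M-step: re-estimate weights, means, and variances from responsibilities.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor keeps a component from collapsing
    return weights, means, variances


# Synthetic data from two well-separated clusters (assumed for the example).
rng = random.Random(0)
data = [rng.gauss(-2.0, 0.5) for _ in range(300)] + [rng.gauss(3.0, 1.0) for _ in range(300)]

weights, means, variances = em_gmm_1d(data, means=[-1.0, 1.0])
print(weights, means, variances)
```

The E-step softly assigns each point to the components, and the M-step refits each component to the points it is responsible for; alternating the two is what lets the mixture settle onto the clusters.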