Use of neural nets to perform singer identification. Mainly interesting because it showed quantitatively that distinguishing between vocal and non-vocal parts before studying singer similarity improves classification performance. |
Very clear paper. Interesting mainly on two points. First, a pre-processing stage is added before the vocal/non-vocal detection (done with a GMM and a hypothesis test): the harmonic components present in the sound are extracted, based on Goto's PreFEst algorithm for bass line extraction. This harmonic content is then resynthesized by additive synthesis. Second, different features are evaluated in order to determine the best feature choice. The classification of singers is done using GMMs. |
The main interest lies in the attempt to use a measure of the harmonicity of the signal to separate vocal and non-vocal portions of the sound. Nevertheless, the results obtained for singer classification (using GMM and SVM) are very low, possibly because of the poor performance of the vocal/non-vocal segmentation mechanism. |
An interesting work based on phoneme recognition using k-NNs. This approach is quite different from that of the other papers: no features such as pitch or harmonicity are extracted for singer classification and then fed into a GMM. It performed quite well: 80% recognition. It illustrates the potential of k-NNs. |
Not directly concerned with the identification process but rather with the feature extraction part. In particular, it explains why and how MFCCs have been used in speaker recognition and singer recognition. |
This article presents a classification method based on a model of the solo voice. It assumes that the instrumental portions of a song and the accompaniment of the singer are very similar. Thus, after discriminating between vocal and non-vocal segments of a song (using GMMs), a model of the voice (GMM) is derived from an a priori model of the accompaniment (GMM). Singer tracking (inside a song) is also considered. |
Very clear text on which I based my work for this presentation. It gives an overview of past research on singer identification and sets out clear directions for the field. It presents the same method as Tsai and Huang (2004). I would recommend this article as a starting point for someone interested in singer identification. |
Earlier work by Tsai et al. that first introduces the idea of deducing a model of the voice from that of the background. See Tsai and Huang (2006) for a more complete presentation. |
Another example of the use of GMMs for singer classification. The main difference lies in the way the vocal/non-vocal distinction is made: the start of the singer's voice is detected using different features (ZCR, spectral flux, harmonicity measure, etc.) and then a fixed length of the song is extracted and studied. In other words, only the first words of the song are studied to perform recognition (using GMMs). |
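
Most of the entries above share one classification scheme: one statistical model is trained per singer on frame-level features taken from vocal segments, and an unknown excerpt is assigned to the singer whose model gives it the highest likelihood. The sketch below illustrates that decision rule only. It is not taken from any of the papers: it assumes a single diagonal-covariance Gaussian per singer (the simplest, one-component case of a GMM), and the names `fit_diag_gaussian`, `log_likelihood`, `identify_singer` and the toy two-dimensional "features" are hypothetical.

```python
import math


def fit_diag_gaussian(frames):
    """Fit a per-dimension mean and variance to a list of feature vectors.

    Assumption: a single diagonal Gaussian stands in for a full GMM.
    """
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    # Floor the variance so that a constant feature cannot cause a division by zero.
    variances = [
        max(sum((f[d] - means[d]) ** 2 for f in frames) / n, 1e-6)
        for d in range(dim)
    ]
    return means, variances


def log_likelihood(frames, model):
    """Sum of per-frame log-densities, treating frames as independent."""
    means, variances = model
    total = 0.0
    for frame in frames:
        for x, m, v in zip(frame, means, variances):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total


def identify_singer(frames, models):
    """Return the name of the singer whose model best explains the frames."""
    return max(models, key=lambda name: log_likelihood(frames, models[name]))


# Toy, made-up training data: two features per frame, two singers.
training = {
    "singer_a": [[0.0, 1.0], [0.2, 0.8], [-0.1, 1.1], [0.1, 0.9]],
    "singer_b": [[3.0, -1.0], [2.8, -0.9], [3.1, -1.2], [3.2, -1.1]],
}
models = {name: fit_diag_gaussian(frames) for name, frames in training.items()}

# Frames from an unknown excerpt, assumed already restricted to vocal segments.
predicted = identify_singer([[2.9, -1.0], [3.0, -1.1]], models)
print(predicted)
```

In a realistic setting the frames would be features such as MFCCs, each singer model would have several mixture components fitted with EM, and the frames would first pass through the vocal/non-vocal detection stage that the annotations single out as critical.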