Audio Segmentation

6:10 PM 3/1/2007, by Shi Yong

Annotated Bibliography


  • Aarts, R. M., and R. T. Dekkers. 1999. A real-time speech-music discriminator. J. Audio Eng. Soc., 47 (9):720-5.
  • Alexandre, E., M. Rosa, L. Cuadra, and R. Gil-Pita. 2006. Application of Fisher Linear Discriminant Analysis to Speech/Music Classification. Paper read at the 120th Convention of the Audio Engineering Society, at Paris, France.
  • Brown, J. C. 1999. Computer identification of musical instruments using pattern recognition with cepstral coefficients as features. J. Acoust. Soc. Am.
  • Chen, S.S., and P.S. Gopalakrishnan. 1998. Speaker, Environment and Channel Change Detection and Clustering via the Bayesian Information Criterion: IBM T.J. Watson Research Center.
  • Huang, R., and J.H.L. Hansen. 2004. Unsupervised Audio Segmentation and Classification for Robust Spoken Document Retrieval. Paper read at IEEE ICASSP-2004: International Conference on Acoustics, Speech, and Signal Processing.
  • Kemp, T., M. Schmidt, M. Westphal, and A. Waibel. 2000. Strategies for automatic segmentation of audio data. Paper read at IEEE International Conference on Acoustics, Speech, and Signal Processing.
  • Omar, A.H. 2005. Audio Segmentation and Classification, Technical University of Denmark.
  • Saunders, J. 1996. Real-time discrimination of broadcast speech/music. Paper read at IEEE International Conference on Acoustics, Speech, and Signal Processing, at Atlanta, GA, USA.
  • Tritschler, A., and R. Gopinath. 1999. Improved Speaker Segmentation and Segments Clustering Using the Bayesian Information Criterion: IBM T.J. Watson Research Center.
  • Tzanetakis, G., and P. Cook. 1999. Multifeature Audio Segmentation for Browsing and Annotation. Paper read at IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, Oct. 17-20, 1999, at New Paltz, New York.
  • Tzanetakis, G., and P. Cook. 2002. Musical Genre Classification of Audio Signals. IEEE Transactions on Speech and Audio Processing, July 2002.
  • Wang, W.Q., W. Gao, and D.W. Ying. 2003. A fast and robust speech/music discrimination approach. Paper read at the 2003 Joint Conference of the Fourth International Conference on Information, Communications and Signal Processing and the Fourth Pacific Rim Conference on Multimedia.

On-line Resources


  • K-Means Clustering Algorithm [accessed 2007 March 1]
  • A Tutorial on Clustering Algorithms [accessed 2007 March 1]
  • Statistical Data Mining Tutorial [accessed 2007 March 1]