Annotated Bibliography

An overview of monophonic pitch detection algorithms


Music related papers


de Cheveigné A. and Kawahara H., "YIN, a fundamental frequency estimator for speech and music", The Journal of the Acoustical Society of America, Volume 111, Issue 4, pp. 1917-30, 2002.

This paper presents an algorithm for the estimation of the fundamental frequency of speech or musical sounds. It is based on the well-known autocorrelation method with a number of modifications that combine to prevent errors.
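As a rough illustration of the autocorrelation family that this entry refers to, the sketch below estimates a period from a squared-difference function with cumulative mean normalisation and an absolute threshold, in the spirit of the modifications the paper describes. It is not the authors' implementation: the function name `estimate_f0`, the default search range and the threshold of 0.1 are assumptions chosen for the example, and refinements such as parabolic interpolation are left out.

```python
import numpy as np


def estimate_f0(frame, sample_rate, f_min=50.0, f_max=1000.0, threshold=0.1):
    """Return an f0 estimate in Hz for one frame, or None if nothing is found.

    Hypothetical helper for illustration; parameter defaults are assumptions.
    """
    frame = np.asarray(frame, dtype=float)
    max_lag = int(sample_rate / f_min)
    min_lag = max(1, int(sample_rate / f_max))
    window = len(frame) - max_lag
    if window <= 0:
        raise ValueError("frame is too short for the requested f_min")

    # Difference function: d(tau) = sum_j (x[j] - x[j + tau])^2
    diff = np.zeros(max_lag + 1)
    for tau in range(1, max_lag + 1):
        delta = frame[:window] - frame[tau:tau + window]
        diff[tau] = np.dot(delta, delta)

    running_sum = np.cumsum(diff[1:])
    if running_sum[-1] == 0.0:
        return None  # silent or constant frame: no periodicity to measure

    # Cumulative mean normalisation: d'(tau) = d(tau) * tau / sum_{k<=tau} d(k)
    cmnd = np.ones(max_lag + 1)
    lags = np.arange(1, max_lag + 1)
    cmnd[1:] = diff[1:] * lags / np.maximum(running_sum, 1e-12)

    # Take the first lag under the threshold, then walk down to the local minimum
    for tau in range(min_lag, max_lag):
        if cmnd[tau] < threshold:
            while tau + 1 <= max_lag and cmnd[tau + 1] < cmnd[tau]:
                tau += 1
            return sample_rate / tau
    return None
```

For example, calling `estimate_f0(np.sin(2 * np.pi * 200 * np.arange(1024) / 8000), 8000)` should return a value close to 200 Hz. Picking the first dip below the threshold, rather than the global minimum, is what steers the estimate away from multiples of the true period.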

Cuadra P., Master A. and Sapp C., "Efficient Pitch Detection Techniques for Interactive Music", Proceedings of the International Computer Music Conference, 2001.

Several pitch detection algorithms are examined for use in interactive computer-music performance. The authors define criteria to evaluate four techniques: Harmonic Product Spectrum, Cepstrum-Biased HPS, Maximum Likelihood and the Weighted Autocorrelation Function.
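To make the first of these techniques concrete, here is a minimal sketch of a Harmonic Product Spectrum estimator: the magnitude spectrum is multiplied by copies of itself downsampled by 2, 3, ..., so that the harmonics line up at the fundamental. The function name, the Hann window, the number of harmonics and the `f_min` cut-off are assumptions made for the example and are not taken from the paper.

```python
import numpy as np


def hps_f0(frame, sample_rate, num_harmonics=4, f_min=50.0):
    """Estimate f0 in Hz with a basic Harmonic Product Spectrum (illustrative)."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)

    # Magnitude spectrum of the Hann-windowed frame
    spectrum = np.abs(np.fft.rfft(frame * np.hanning(n)))

    # Only bins whose highest considered harmonic still lies inside the spectrum
    usable = len(spectrum) // num_harmonics
    hps = spectrum[:usable].copy()
    for h in range(2, num_harmonics + 1):
        # Downsampling by h brings the h-th harmonic onto the fundamental's bin
        hps *= spectrum[::h][:usable]

    # Ignore bins below f_min, then pick the strongest remaining bin
    min_bin = int(np.ceil(f_min * n / sample_rate))
    peak_bin = min_bin + int(np.argmax(hps[min_bin:]))
    return peak_bin * sample_rate / n
```

The resolution of this sketch is limited to one FFT bin (`sample_rate / len(frame)` Hz), so a longer frame or zero-padding would be needed for finer estimates.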

Fitch J. and Shabana W., "A wavelet-based pitch detector for musical signals", Proceedings of the Workshop on Digital Audio Effects (DAFx), 1999.

This paper investigates an algorithm based on the Dyadic Wavelet Transform for pitch detection of musical signals.

Tadokoro Y., Matsumoto W. and Yamaguchi M., "Pitch detection of musical sounds using adaptive comb filters controlled by time delay", Proceedings of the International Conference on Multimedia and Expo, pp. 109-12, 2002.

This paper proposes a new method for pitch detection using adaptive comb filters whose number of delay elements is controlled. Using three adaptive comb filters, the pitches of three simultaneous tones can be detected.

Doval B. and Rodet X., "Fundamental frequency estimation and tracking using maximum likelihood harmonic matching and HMMs", Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, 1993.

In this paper, a new approach is presented for the estimation and tracking of the fundamental frequency of pseudoperiodic signals. It is based on a probabilistic model of pseudoperiodic signals that makes it possible to take prior knowledge into account and to include constraints on the evolution of the signal.

Dziubinski M. and Kostek B., "High accuracy and octave error immune pitch detection algorithm", Archives of Acoustics, 2004.

The aim of this paper is to present a method that improves pitch estimation accuracy, showing high performance for both synthetic harmonic signals and musical instrument sounds. This method employs an Artificial Neural Network of a feed-forward type.

Klapuri A., "Pitch Estimation Using Multiple Independent Time-Frequency Windows", Proceedings of the Workshop on Applications of Signal Processing to Audio and Acoustics, pp. 115-8, 1999.

This paper proposes a system built upon a pitch model that calculates independent pitch estimates in separate time-frequency windows and then combines them to yield a single estimate of the pitch.

Godsill S. and Davy M., "Bayesian harmonic models for musical pitch estimation and analysis", Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, 2002.

Developments to an earlier Bayesian model, which describes each component signal in terms of fundamental frequency, partials and amplitude, are proposed in this paper. This basic model is modified for greater realism to include a non-white residual spectrum, time-varying amplitudes and partials ‘detuned’ from the natural linear relationship.

Marchand S., "An efficient pitch-tracking algorithm using a combination of Fourier transforms", Proceedings of the Conference on Digital Audio Effects (DAFx), 2001.

In this paper, a technique for detecting the pitch of a sound using a series of two forward Fourier transforms is presented. We use an enhanced version of the Fourier transform for better accuracy, as well as a tracking strategy among pitch candidates for increased robustness.

Marchand S., "Musical pitch tracking using internal model control based frequency cancellation", Proceedings of the Conference on Decision and Control, 2003.

A new method for pitch estimation of musical sound signals is presented in this paper. This method is based on the behavior of a notch filter in an error feedback system, and was first developed for the identification of periodic signals with uncertain frequency.

Chou W. and Gu L., "Robust singing detection in speech/music discriminator design", IEEE International Conference on Acoustics, Speech, and Signal Processing, 2001.

An approach for robust singing signal detection in speech/music discrimination is proposed and applied to audio indexing applications.



Other interesting papers, mostly related to speech


Chisaki Y., Usagawa T. and Ebata M., "Improvement of pitch estimation using harmonic wavelet transform", Proceedings of the IEEE Region 10 Conference, pp. 601-5, 1999.

A pitch detection method for harmonic signals based on the wavelet transform is proposed in this paper. The paper also examines how frequency and phase errors of the analyzing wavelet, relative to the observed signal, affect the accuracy of pitch estimation.

Pollastri E., "Melody-Retrieval based on Pitch-Tracking and String-Matching Methods", Proceedings of the XIIth Colloquium on Musical Informatics, 1999.

In this paper, a pitch-tracking system based on RMS-power segmentation and harmonic gathering is developed. This software can measure pitches from monophonic sources in the range of 50 Hz to 20 kHz.

Li X., Malkin J. and Bilmes J., "Graphical model approach to pitch tracking", Proceedings of the International Conference on Spoken Language Processing, 2004.

This work presents a graphical model framework to automatically optimize pitch tracking parameters in the maximum likelihood sense. Therein, probabilistic dependencies between pitch, pitch transition and acoustical observations are expressed using the language of graphical models, and probabilistic inference is accomplished using the Graphical Model Toolkit (GMTK).

Slaney M. and Lyon F., "A Perceptual Pitch Detector", Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, 1990.

This perceptual pitch detector combines a cochlear model with a bank of autocorrelators. By performing an independent autocorrelation for each channel, the pitch detector is relatively insensitive to phase changes across channels.

Quast H., Schreiner O. and Schroeder M., "Robust pitch tracking in the car environment", Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, 2002.

This paper compares four different pitch tracking algorithms: autocorrelation, cepstrum, harmonic product spectrum, and a new method based on the modulation spectrum.

Picone J., Doddington G. and Secrest B., "Robust pitch in a noisy telephone environment", Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, 1987.

In this paper, three pitch detection algorithms are evaluated over a database consisting of speech. The three algorithms evaluated are: an improved version of the Integrated Correlation pitch tracker, the Gold-Rabiner parallel processing algorithm, and the NSA LPC-10 DYPTRACK version 43 algorithm.

Li B., Li Y., Wang C., Tang C. and Zhang E., "A New Efficient Pitch-Tracking Algorithm", Proceedings of the International Conference on Robotics, Intelligent Systems and Signal Processing, 2003.

Pitch Harmonical Autocorrelation (PHA) is used for the initial pitch detection. Then we use the autocorrelation of the speech frequency to refine the pitch.

Deriche N., "A novel pitch estimation technique using the Teager energy function", Proceedings of the Fifth International Symposium on Signal Processing and its Applications, 1999.

This paper proposes a new method to estimate pitch based on the Teager Energy Function (TEF). The method is based on the peak detection of the TEF for each frame.
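For readers unfamiliar with the quantity involved, the sketch below computes the standard discrete Teager energy operator, Ψ[x(n)] = x(n)² − x(n−1)·x(n+1). It shows only the operator itself; the per-frame peak detection used in the paper is not reproduced, and the function name `teager_energy` is an assumption for the example.

```python
import numpy as np


def teager_energy(x):
    """Discrete Teager energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns an array two samples shorter than the input, because the
    operator is undefined at the first and last sample.
    """
    x = np.asarray(x, dtype=float)
    if len(x) < 3:
        raise ValueError("at least three samples are required")
    return x[1:-1] ** 2 - x[:-2] * x[2:]
```

For a pure sinusoid A·cos(Ωn + φ) the operator gives the constant A²·sin²(Ω), so its output depends on both the amplitude and the frequency of the signal.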