Rhythmic Similarity
12:44 AM 3/15/2007, by Shi Yong
Annotated Bibliography
Foote, J. and S. Uchihashi. 2001. The Beat Spectrum: A New Approach to Rhythm Analysis. In Proceedings of the International Conference on Multimedia and Expo.
This paper presents two measures, the beat spectrum and the beat spectrogram, which characterize the rhythm and tempo of music. Feature vectors are extracted from overlapping windows, and a 2-D similarity matrix is constructed from all pairwise comparisons of the feature vectors. The beat spectrum is derived by computing diagonal sums or the autocorrelation of the similarity matrix. The beat spectrogram visualizes the variation of the beat spectrum over time. Rhythmic similarity can then be measured by comparing beat spectra.
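As a rough sketch (not the authors' exact implementation), the diagonal-sum version of the beat spectrum might be computed as follows; the choice of cosine similarity between feature vectors is an assumption here:

```python
import numpy as np

def beat_spectrum(features, max_lag=None):
    """Beat spectrum via diagonal sums of a cosine self-similarity matrix.

    features: (n_frames, n_dims) array of per-window feature vectors.
    Returns an array B where B[l] sums similarity along diagonal lag l.
    """
    # Normalize rows so the dot product between frames is cosine similarity.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    normed = features / np.maximum(norms, 1e-12)
    S = normed @ normed.T          # 2-D self-similarity matrix
    n = S.shape[0]
    if max_lag is None:
        max_lag = n
    # Diagonal sum: B[l] = sum over k of S[k, k + l].
    return np.array([np.trace(S, offset=l) for l in range(max_lag)])
```

Peaks of B at nonzero lags indicate periodicities (beats) in the feature sequence; B[0] is simply the number of frames after normalization.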
Foote, J., M. Cooper and U. Nam. 2002. Audio Retrieval by Rhythmic Similarity. In Proceedings of the 3rd International Symposium on Musical Information Retrieval.
This paper builds on Foote and Uchihashi (2001). It presents ways to measure rhythmic similarity quantitatively. The beat spectra, as 1-dimensional functions of lag time, are truncated to L-dimensional vectors, so that a distance can be computed between any two beat spectra. Three distance functions are evaluated: Euclidean distance, cosine distance, and a distance based on Fourier beat spectral coefficients. The results show that the latter two outperform simple Euclidean distance.
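The three distance functions could be sketched as below; the Fourier variant here (comparing magnitudes of the first few coefficients, which discards alignment/phase differences between spectra) is an assumption about the paper's approach, and the truncation length L and coefficient count are illustrative:

```python
import numpy as np

def euclidean_dist(b1, b2, L=64):
    # Plain Euclidean distance between truncated beat spectra.
    return np.linalg.norm(b1[:L] - b2[:L])

def cosine_dist(b1, b2, L=64):
    # One minus the cosine of the angle between the two vectors.
    v1, v2 = b1[:L], b2[:L]
    return 1.0 - np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))

def fourier_coeff_dist(b1, b2, L=64, n_coeffs=16):
    # Compare magnitudes of low-order Fourier coefficients; magnitudes are
    # invariant to circular shifts, so misaligned but similar rhythms match.
    f1 = np.abs(np.fft.rfft(b1[:L]))[:n_coeffs]
    f2 = np.abs(np.fft.rfft(b2[:L]))[:n_coeffs]
    return np.linalg.norm(f1 - f2)
```

Note how a circular shift of a periodic beat spectrum leaves the Fourier-magnitude distance near zero while the Euclidean distance is large, which illustrates why the paper found the simpler Euclidean measure weaker.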
Paulus, J., and A. Klapuri. 2002. Measuring the Similarity of Rhythmic Patterns. In Proceedings of the 3rd International Symposium on Musical Information Retrieval.
This article presents an automatic system for rhythmic pattern segmentation and rhythmic similarity measurement. Pattern boundaries are estimated at the tatum, tactus, and musical measure levels. For each pattern, acoustic features are extracted from a series of consecutive time frames; three features were tested, namely loudness, spectral centroid, and MFCCs. Dynamic time warping is used to align feature sequences of different lengths, and a similarity measure is obtained by comparing the feature sequences of two patterns. The results show that the spectral centroid weighted by loudness is the best-performing feature. Correct rates of 67% on 365 pieces for tactus periods and 77% on 141 pieces for musical measure lengths were reported.
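The dynamic time warping step can be sketched with the textbook algorithm (the paper's exact step pattern and local cost may differ; Euclidean local cost is assumed here):

```python
import numpy as np

def dtw_distance(x, y):
    """Classic dynamic time warping between two feature sequences.

    x: (n, d) array, y: (m, d) array. Returns the cumulative cost of the
    optimal alignment path under a Euclidean local cost.
    """
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(x[i - 1] - y[j - 1])
            # Standard step pattern: match, insertion, deletion.
            D[i, j] = cost + min(D[i - 1, j - 1], D[i - 1, j], D[i, j - 1])
    return D[n, m]
```

Because the warping path can repeat frames, a pattern and a time-stretched copy of it align with zero cost, which is exactly what makes DTW suitable for comparing patterns of different lengths.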
Dixon, S., F. Gouyon, and G. Widmer. 2004. Towards Characterisation of Music via Rhythmic Patterns. In Proceedings of the International Conference on Music Information Retrieval.
In this article, Dixon et al. present a feature based on temporal energy, the Typical Bar-length Rhythmic Pattern, and demonstrate its usefulness in a genre classification task on ballroom dance music. A 50% classification rate (against a 16% baseline) was achieved using the rhythmic pattern feature alone, 84% by combining it with other automatically computed features (derived from rhythmic patterns or audio data), and 96% when measured tempo was also used.