Pachet and Aucouturier’s Timbre Similarity Work
Annotated Bibliography with Hyperlinks
Rebecca Fiebrink
Created 5 March 2005
Aucouturier, J., and F. Pachet. 2002a. Finding songs that sound the same. Proceedings of the First IEEE Benelux Workshop on Model Based Processing and Coding of Audio, 91–98.
Both 2002 papers outline the
initial timbre similarity measure and its implementation in the context
of CUIDADO. The authors make a case for global timbre similarity as a
useful measure and distinguish this work from other research. The basic
algorithm is explained: a set of Mel Frequency Cepstral Coefficients (MFCCs)
is computed for each frame of the music and modeled using Gaussian
Mixture Models. Similarity between the GMMs for two songs is computed
using sampling. These two papers evaluate the results positively: the
system rates songs by the same artist and within the same genre as
generally timbrally similar. Other matches made by the system, such as
Beethoven and The Beatles, are classified as “interesting” in that they reveal aspects of
similarity that are not available from metadata. The authors demonstrate the
system’s usefulness in “nearest-neighbor” song searching and in playlist
generation. This paper is a recommended read for anyone
starting to learn about timbre similarity.
Available online (accessed 5 March 2005): http://www.csl.sony.fr/downloads/papers/uploads/aucouturier-02c.pdf
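The sampling step described in this entry can be illustrated with a minimal sketch. It is not the authors' implementation: the diagonal-covariance GMM representation, the function names, the sample count, and the symmetrised log-likelihood form used here are all assumptions chosen for illustration, and fitting the GMMs to MFCC frames is left out entirely.

```python
import math
import random

# Assumed representation: a GMM is a list of (weight, means, variances)
# tuples with diagonal covariance. Real systems would fit these to MFCC frames.

def log_density(gmm, x):
    """Log-density of point x under the GMM, computed with log-sum-exp."""
    terms = []
    for weight, means, variances in gmm:
        log_p = math.log(weight)
        for xi, mu, var in zip(x, means, variances):
            log_p += -0.5 * (math.log(2 * math.pi * var) + (xi - mu) ** 2 / var)
        terms.append(log_p)
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

def sample(gmm, n, rng):
    """Draw n points: pick a component by weight, then sample each dimension."""
    weights = [w for w, _, _ in gmm]
    points = []
    for _ in range(n):
        _, means, variances = rng.choices(gmm, weights=weights)[0]
        points.append([rng.gauss(mu, math.sqrt(var))
                       for mu, var in zip(means, variances)])
    return points

def mean_log_likelihood(gmm, points):
    return sum(log_density(gmm, p) for p in points) / len(points)

def sampled_distance(gmm_a, gmm_b, n=2000, seed=0):
    """Monte Carlo estimate of a symmetrised distance between two GMMs.

    Samples from each model are scored under both models; the cross terms
    are subtracted from the self terms, so identical models give about 0.
    """
    rng = random.Random(seed)
    samples_a = sample(gmm_a, n, rng)
    samples_b = sample(gmm_b, n, rng)
    return (mean_log_likelihood(gmm_a, samples_a)
            + mean_log_likelihood(gmm_b, samples_b)
            - mean_log_likelihood(gmm_b, samples_a)
            - mean_log_likelihood(gmm_a, samples_b))

# Hypothetical two-dimensional "songs" (real MFCC vectors have more dimensions).
song_a = [(0.6, [0.0, 0.0], [1.0, 1.0]), (0.4, [3.0, 1.0], [0.5, 0.5])]
song_b = [(0.5, [0.2, 0.1], [1.0, 1.0]), (0.5, [3.0, 1.2], [0.5, 0.5])]
song_c = [(1.0, [10.0, -5.0], [1.0, 1.0])]

print("a vs b:", sampled_distance(song_a, song_b))
print("a vs c:", sampled_distance(song_a, song_c))
```

Smaller values mean more similar models. Because the estimate depends on random draws, the number of samples trades accuracy against computation time.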
———. 2002b. Music similarity measures: What’s the use? Proceedings of the International Conference on Music Information Retrieval.
This paper describes the same
initial timbre similarity implementation as 2002a. More details are
provided regarding objective system evaluation. Furthermore, the concept
of “interestingness” is better defined, with stress on the
importance of implementing a measure that can lead users to an
experience of “aha!” I found the authors’ quantification of “aha!” a bit odd, but the discussion
regarding giving users control over the degree of experimentation they
desire from a system seemed to be quite relevant.
Available online (accessed 4 March 2005): http://www.csl.sony.fr/downloads/papers/uploads/pachet-02g.pdf
———. 2004a. Improving timbre similarity: How high’s the sky? Journal of Negative Results in Speech and Audio Sciences 1 (1).
This paper offers an in-depth
examination of many attempts to improve on the 2002 system. The effects
of changing parameters such as the audio sample rate, number of MFCCs,
number of GMM components, distance sample rate, and window size are
discussed, and optimal parameter values are chosen. Earth Mover’s
Distance is examined as an alternative to sampling. A variety of
front-end processing techniques are explored. Some improvement was
made, but the results suggest a “ceiling” on possible accuracy of timbre
similarity measures using the given framework.
This paper presents a very good discussion of the difficulties inherent in testing similarity systems, particularly in the absence of ground truth. I felt that it offered a more honest appraisal of the system’s capabilities than the 2002 papers. The authors provide meticulous discussion of the details of their work. This is a useful paper as a follow-up to 2002a and a must-read for anyone hoping to implement improved timbre similarity measures.
Available online (accessed 4 March 2005): http://www.csl.sony.fr/downloads/papers/uploads/aucouturier-04b.pdf
———. 2004b. Tools and architecture for the evaluation of similarity measures: Case study of timbre similarity. Proceedings of the International Conference on Music Information Retrieval.
This paper presents the same
work as 2004a, with somewhat more focus on using the Music Browser
system as a tool. This paper is recommended over 2004a for anyone
interested in building music browsing systems, while 2004a is
recommended for those more interested in technical details of timbre
similarity measurement.
Available online (accessed 4 March 2005): http://www.csl.sony.fr/downloads/papers/2004/aucouturier-04c.pdf
Pachet, F., A. La Burthe, A. Zils, and J. Aucouturier. 2004. Popular music access: The Sony music browser. Journal of the American Society for Information Science 55 (12): 1037–44.
This paper discusses the Sony
Music Browser, a project developed within CUIDADO. It discusses
electronic music distribution in general and the importance of
integrating information about high-level musical percepts with existing
metadata and collaborative filtering techniques. This is a very good
read for anyone interested in online music access, as it hits on issues
relating to both technology and human factors.
Available online (accessed 4 March 2005): http://www.csl.sony.fr/downloads/papers/uploads/pachet-02a.pdf
Pampalk, E., S. Dixon, and G. Widmer. 2003. On the evaluation of perceptual similarity measures for music. Proceedings of the 6th International Conference on Digital Audio Effects, 6–12.
This paper reviews five music
similarity measures (not necessarily timbre-specific), of which
Aucouturier and Pachet’s 2002 system is one. The authors note
that this system performs relatively slowly though otherwise relatively
well. Interestingly, the authors point out that the 2002 system
incorporates dynamic level into similarity judgments through its use of
the first MFCC. This paper doesn’t offer any significant new insights
into Aucouturier and Pachet’s work, but it is helpful to situate
their work among similar studies.
Available online (accessed 4 March 2005):
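The point about dynamic level can be checked with a small worked example. This is a simplified sketch, not the pipeline from either paper: it assumes the "first MFCC" is the zeroth coefficient of an unnormalised DCT-II taken over log band energies, and the band energies below are made-up numbers standing in for the output of a mel filterbank.

```python
import math

def cepstrum(band_energies, n_coeffs):
    """Unnormalised DCT-II of the log band energies.

    Coefficient 0 is simply the sum of the log energies, so it tracks
    the overall level of the frame.
    """
    logs = [math.log(e) for e in band_energies]
    n = len(logs)
    return [sum(value * math.cos(math.pi * k * (i + 0.5) / n)
                for i, value in enumerate(logs))
            for k in range(n_coeffs)]

def apply_gain(band_energies, gain):
    """Scaling the signal amplitude by `gain` scales each band energy by gain**2."""
    return [e * gain ** 2 for e in band_energies]

# Hypothetical band energies for one frame (illustrative values only).
energies = [1.0, 4.0, 2.5, 0.7, 0.3, 0.1]

quiet = cepstrum(energies, 4)
loud = cepstrum(apply_gain(energies, 2.0), 4)

for k, (q, l) in enumerate(zip(quiet, loud)):
    print(f"c{k}: change after doubling amplitude = {l - q:.6f}")
```

A gain adds the same constant to every log band energy. Only coefficient 0 responds to a constant offset, since the cosine basis functions for the higher coefficients sum to zero across the bands. A model that keeps coefficient 0 is therefore sensitive to playback level, while one that drops it is not.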
This paper offers an overview of
the European CUIDADO project, which stands for Content-based Unified
Interfaces and Descriptors for Audio/music Databases Available Online.
The Music Browser used by Aucouturier and Pachet is one component of
CUIDADO. This is good background reading for anyone wanting to learn
more about the Music Browser or the Sound Palette.
Available online (accessed 4 March 2005): http://www.csl.sony.fr/downloads/papers/2002/pachet02h.pdf