Annotated Bibliography

MPEG-4 | Voice Separation | Fingerprinting | Watermarking

:. MPEG-4 .:

References

Brandenburg, K. 1999. MP3 and AAC explained. Proceedings of the Audio Engineering Society Conference on High Quality Audio Coding.

This paper provides an overview of the MP3 and AAC audio compression/decompression algorithms, as well as a brief discussion of audio compression quality issues.

Pereira, F., and T. Ebrahimi. 2002. The MPEG-4 book. Upper Saddle River, NJ: Prentice-Hall.

A comprehensive guide to the MPEG-4 standard. Written by two leaders of the MPEG-4 community.

International Organisation for Standardisation. 2002. Overview of the MPEG-4 standard, V.21 (Jeju Version).

Overview of the MPEG-4 standard. Self-explanatory.

Available online at:
http://www.chiariglione.org/mpeg/standards/mpeg-4/mpeg-4.htm

International Organisation for Standardisation. 1999. MPEG-4 applications.

Provides a list of applications for which MPEG-4 is suitable. Describes each application in terms of its requirements and illustrates how MPEG-4 can be used in that context.

Available online at:
http://www.chiariglione.org/mpeg/working_documents/mpeg-04/requirements/applications.zip

International Organisation for Standardisation. 2003. MPEG-4 requirements, V.18 (Trondheim revision).

Requirements document of the MPEG-4 standard. Mostly useful to somebody interested in implementing the standard.

Available online at:
http://www.chiariglione.org/mpeg/working_documents/mpeg-04/requirements/requirements.zip

MPEG-4 Industry Forum. 2002. MPEG-4: The media standard.

This document presents an overview of the MPEG-4 technology and its benefits. It is essentially a marketing document.

Available online at:
http://www.m4if.org/public/documents/vault/m4-out-20027.pdf

Links

The MPEG Home Page - http://www.chiariglione.org/mpeg/

The official MPEG website. Contains lots of information and standards documentation. A great place to start learning about MPEG-4.

The MPEG Industry Forum - http://www.m4if.org/

As stated on the site itself, the goal is: "To further the adoption of MPEG Standards, by establishing them as well accepted and widely used standards among creators of content, developers, manufacturers, providers of services, and end users." Thus, this site is about promoting the MPEG standards. It contains news about the MPEG community, many documents and other resources. Great place to start looking for information about MPEG-4.

MPEG Pointers and Resources - http://www.mpeg.org/

Provides several links to websites with information and/or resources related to MPEG.

:. Voice Separation .:

References

Kirlin, P., and P. Utgoff. 2005. VoiSe: Learning to segregate voices in explicit and implicit polyphony. Proceedings of the International Conference on Music Information Retrieval. 552–7.

This paper presents a system to extract voices in a symbolic representation of a music score using a same-voice predicate implemented as a learned decision tree and a hard-coded voice numbering algorithm. The system works on explicit and implicit polyphony.

Chew, E., and X. Wu. 2004. Separating voices in polyphonic music: A contig mapping approach. Proceedings of the International Symposium on Computer Music Modeling and Retrieval. 1–20.

This paper presents the contig (defined as a set of overlapping fragments of successive notes) mapping approach to voice separation that was used in the Java-based VoSA system. The algorithm is based on perceptual principles.

Kilian, J., and H. Hoos. 2002. Voice separation: A local optimisation approach. Proceedings of the International Conference on Music Information Retrieval. 39–46.

This paper presents a heuristic algorithm to find the best way to separate voices. The approach is not meant to find the correct answer for a given problem but rather to find a reasonable solution in various contexts.

Temperley, D. 2001. The cognition of basic musical structures. Cambridge, MA: MIT Press.

Describes a preference rule approach to voice separation that was used in the Melisma Music Analyzer. Lots of information is also available on the software website.

Cambouropoulos, E. 2000. From MIDI to traditional musical notation. Proceedings of the AAAI Workshop on Artificial Intelligence and Music.

A rule-based approach is briefly described in which the input is segmented into beats. Non-crossing streams are then created within these segments using a shortest path algorithm. Other work by Cambouropoulos is also closely related to the subject and is definitely worth looking at.
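To make the idea of non-crossing streams concrete, here is a minimal sketch. It is not Cambouropoulos's algorithm: the cost function (absolute pitch distance), the function name assign_non_crossing, and the restriction to one set of notes per segment are all assumptions made for illustration. It assigns the notes of a segment to existing streams so that pitch order is preserved (no crossings) and the total pitch distance is minimal. The dynamic program is equivalent to a shortest path through a grid-shaped directed acyclic graph.

```python
def assign_non_crossing(stream_pitches, note_pitches):
    """Assign each note to a distinct stream without crossings.

    Hypothetical helper for illustration only. Both lists hold MIDI
    pitches sorted in ascending order, and there must be at least as
    many streams as notes. Returns (total_cost, assignment), where
    assignment[i] is the index of the stream that receives note i.
    """
    n, k = len(stream_pitches), len(note_pitches)
    if k > n:
        raise ValueError("more notes than streams")

    inf = float("inf")
    # cost[i][j]: cheapest placement of the first i notes
    # using only the first j streams.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    used = [[False] * (n + 1) for _ in range(k + 1)]
    for j in range(n + 1):
        cost[0][j] = 0  # placing zero notes costs nothing

    for i in range(1, k + 1):
        for j in range(i, n + 1):
            skip = cost[i][j - 1]  # stream j-1 stays silent
            take = cost[i - 1][j - 1] + abs(
                note_pitches[i - 1] - stream_pitches[j - 1]
            )  # note i-1 goes to stream j-1
            if take <= skip:
                cost[i][j], used[i][j] = take, True
            else:
                cost[i][j] = skip

    # Walk back through the grid to recover the chosen path.
    assignment = [None] * k
    i, j = k, n
    while i > 0:
        if used[i][j]:
            assignment[i - 1] = j - 1
            i -= 1
        j -= 1
    return cost[k][n], assignment


if __name__ == "__main__":
    # Three streams last heard at C3, C4, C5; two new notes B3 and D5.
    print(assign_non_crossing([48, 60, 72], [59, 74]))
```

Because both lists are sorted and the assignment only ever moves forward through the streams, a higher note can never land in a lower stream than a lower note, which is what "non-crossing" means here.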

Links

VoiSe - http://www-lrn.cs.umass.edu/

The Machine Learning Laboratory at the University of Massachusetts Amherst: home of the VoiSe system. There is no dedicated project website, but the publication(s) can be downloaded from the lab website, which also links to the website of Philip Kirlin, the main person behind VoiSe.

VoSA - http://imsc.usc.edu/research/project/vosa/

The VoSA system webpage. VoSA is one of the projects of the Integrated Media Systems Center at the University of Southern California. Some posters with screenshots can be downloaded here, and publications can be downloaded from the parent website, but nothing more.

Melisma Music Analyzer - http://www.link.cs.cmu.edu/music-analysis/

Home of the Melisma Music Analyzer at Carnegie Mellon University. All the information you need about the system is available from this website, as well as a link to the FTP site from which the software itself can be downloaded.

:. Fingerprinting .:

References

Cano, P., E. Batlle, T. Kalker, and J. Haitsma. 2005. A review of audio fingerprinting. The Journal of VLSI Signal Processing 41: 271–84.

This paper provides an in-depth overview of audio fingerprinting and presents the different techniques involved in a unified framework. It is an excellent summary of the field. A must-read!

Haitsma, J., and T. Kalker. 2002. A highly robust audio fingerprinting system. Proceedings of the International Symposium on Music Information Retrieval. 107–15.

The authors of this paper describe the audio fingerprinting system that they developed at Philips. This is without any doubt one of the main references in the field. Note that Gracenote now owns the audio fingerprinting technology developed by Philips.

Kalker, T., D. Epema, P. Hartel, R. Lagendijk, and M. Van Steen. 2004. Music2Share: Copyright-compliant music sharing in P2P systems. Proceedings of the IEEE 92 (6): 961–70.

This paper describes a peer-to-peer architecture (Music2Share) that is meant for sharing music in a controlled and secure environment. The system makes use of both fingerprinting and watermarking, which opens doors to some very interesting possibilities. Worth a read!

Allamanche, E., J. Herre, O. Hellmuth, B. Froba, and M. Cremer. 2001. AudioID: Towards content-based identification of audio material. Presented at the 110th Audio Engineering Society Convention, Amsterdam, The Netherlands.

This paper presents the audio fingerprinting technology by Fraunhofer.

Doets, P., and R. Lagendijk. 2005. Extracting quality parameters for compressed audio from fingerprints. Proceedings of the International Symposium on Music Information Retrieval. 498–503.

This paper discusses issues and observations regarding audio fingerprints that are degraded by compression or other signal processing operations.

Doets, P., and R. Lagendijk. 2004. Stochastic model of a robust audio fingerprinting system. Proceedings of the International Symposium on Music Information Retrieval. 349–52.

This paper presents yet another method for audio fingerprinting based on a stochastic model.

Burges, C., J. Platt, and S. Jana. 2003. Distortion discriminant analysis for audio fingerprinting. IEEE Transactions on Speech and Audio Processing 11 (3): 165–74.

This paper presents the algorithms used in the RARE audio fingerprinting system by Microsoft.

Links

Shazam - http://www.shazam.com/

Shazam Entertainment is a company that provides a large music database (over 3.2 million) and fast, robust music recognition technology based on fingerprinting. The company's services are aimed at mobile devices and are now offered in many countries worldwide.

Relatable - http://www.relatable.com/

Relatable provides a fingerprinting technology that is notably used in MusicBrainz.

Audible Magic - http://www.audiblemagic.com/

This company provides content protection using fingerprinting technology.

Gracenote - http://www.gracenote.com/

A very well-known company in the media field. It acquired Philips's audio fingerprinting technology in 2005.

:. Watermarking .:

References

Gomes, L., P. Cano, E. Gomez, M. Bonnet, and E. Batlle. 2003. Audio watermarking and fingerprinting: For which applications? Journal of New Music Research 32 (1): 65–81.

This paper presents applications of audio watermarking and fingerprinting beyond copyright protection. It also provides a good-enough overview of watermarking if you are not interested in learning all the cryptic details of the underlying algorithms.

Craver, S., M. Wu, and B. Liu. 2001. What can we reasonably expect from watermarks? IEEE Workshop on the Applications of Signal Processing to Audio and Acoustics. 223–6.

Quite frankly, most of the content of this paper is probably summarized in the reference above, but I decided to include this paper here since it discusses the applications in an interesting way. Have a look at the "simple lock" idea...

Kim, H., Y. Choi, J. Seok, and J. Hong. 2004. Audio watermarking techniques. In Intelligent watermarking techniques, edited by J. Pan, H. Huang, and L. Jain, 185–219. River Edge, N.J.: World Scientific.

Good coverage of the different techniques for audio watermarking. There are several other papers out there, each presenting a single technique that differs slightly from a previous one, but honestly, this chapter presents the main (read: broad) ones. Thus, if you want to get started implementing watermarking systems, this is the one to read.