Assignment 2 - Audio Compression Techniques
Bibliography
This paper describes the psychoacoustic phenomenon of auditory masking and its use in audio coders, where it allows quantisation noise to be allocated across the frequency subbands according to a masking function. The paper also reviews the MPEG-1 international standard for audio compression and includes a short description of the psychoacoustic models that it uses.
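As a rough illustration of the masking-driven noise allocation mentioned in this entry, the sketch below hands out bits greedily to whichever subband has quantisation noise furthest above its masking threshold. It is not the algorithm from the paper or from the MPEG-1 standard: the function name `allocate_bits`, the per-band levels in dB, and the rule of thumb of about 6.02 dB of noise reduction per bit are all assumptions made for the example.

```python
DB_PER_BIT = 6.02  # assumed rule of thumb: one extra bit lowers quantisation noise by ~6 dB


def allocate_bits(signal_db, mask_db, bit_pool, max_bits=15):
    """Greedy bit allocation (hypothetical helper, for illustration only).

    signal_db: signal level per subband, in dB
    mask_db:   masking threshold per subband, in dB
    bit_pool:  total number of bits that may be handed out
    Returns the number of bits assigned to each subband.
    """
    # Signal-to-mask ratio: how far the signal sits above the masking threshold.
    smr = [s - m for s, m in zip(signal_db, mask_db)]
    bits = [0] * len(smr)
    for _ in range(bit_pool):
        # Noise-to-mask ratio for the current allocation; bands that already
        # hold max_bits are excluded from further allocation.
        nmr = [
            smr[i] - DB_PER_BIT * bits[i] if bits[i] < max_bits else float("-inf")
            for i in range(len(smr))
        ]
        worst = max(range(len(nmr)), key=nmr.__getitem__)
        if nmr[worst] <= 0:
            break  # all remaining noise is already below the masking threshold
        bits[worst] += 1
    return bits


if __name__ == "__main__":
    # Three subbands: only the first two have signal above the mask.
    print(allocate_bits([60, 40, 20], [30, 35, 25], bit_pool=10))
```

In this example the third subband receives no bits at all, because its signal already lies below the masking threshold and any quantisation noise there is assumed to be inaudible.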
This paper, written by a researcher from the Fraunhofer Institute, explains several MPEG standards. In particular, it describes the structure, principles and features of MPEG-1 Layer III and AAC in considerable detail, and it also provides general information about the MPEG-4 and MPEG-7 formats. The paper discusses the factors that determine the quality of compressed audio, as well as some techniques that can be misused in MPEG encoding and decoding, with a negative effect on audio quality.
This paper, written by a researcher from the Fraunhofer Institute, is a general overview of the MPEG-1 standard. It provides a technical description of the standard as a whole, as well as more specific information on the compression techniques used in its different layers.
This article describes efficient audio compression techniques relevant to the transmission of high-quality audio signals over the Internet. In particular, it describes the MPEG-2 Layer 3 audio compression standard, the different implementations of its encoders and decoders, and the use of those implementations for network transmission.
This article describes some of the advances in common speech and audio compression techniques, based on emerging developments in digital technology and their implementation in diverse commercial applications. The paper describes the LPC (Linear Predictive Coding) techniques and algorithms used in speech and wideband audio compression. It also describes common subband and transform coding methods which, combined with perceptual coding techniques, achieve a reconstruction that is indistinguishable from the original at bit rates of 128 kbps per channel.
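To make the idea of linear predictive coding in this entry more concrete, here is a minimal sketch that estimates predictor coefficients with the autocorrelation method and the Levinson-Durbin recursion. This is one textbook way of computing LPC coefficients and is not claimed to be the method used in the article; the function names `autocorr` and `lpc` are hypothetical.

```python
def autocorr(x, max_lag):
    """Autocorrelation of x for lags 0..max_lag (no normalisation)."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag)) for lag in range(max_lag + 1)]


def lpc(x, order):
    """Estimate LPC coefficients with the Levinson-Durbin recursion.

    Returns (a, err), where the prediction of sample n is
        x_hat[n] = a[0]*x[n-1] + a[1]*x[n-2] + ... + a[order-1]*x[n-order]
    and err is the remaining prediction error energy.
    """
    r = autocorr(x, order)
    a = [0.0] * order
    err = r[0]
    for i in range(order):
        if err <= 0:
            break  # signal is all zeros or already perfectly predicted
        # Reflection coefficient for predictor order i + 1.
        acc = r[i + 1] - sum(a[k] * r[i - k] for k in range(i))
        k_i = acc / err
        # Update the lower-order coefficients, then append the new one.
        new_a = a[:]
        for k in range(i):
            new_a[k] = a[k] - k_i * a[i - 1 - k]
        new_a[i] = k_i
        a = new_a
        err *= 1.0 - k_i * k_i
    return a, err


if __name__ == "__main__":
    # A decaying exponential is predicted almost perfectly from one past sample.
    signal = [0.9 ** n for n in range(200)]
    coeffs, residual_energy = lpc(signal, order=2)
    print(coeffs, residual_energy)
```

A coder built on this idea would transmit the coefficients and the (small) prediction residual instead of the raw samples.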
This article deals with lossless audio compression, which is implemented in several audio file formats used for music distribution over the Internet, DVD audio, digital audio archiving, and mixing. The paper presents a survey and classification of lossless audio compression algorithms. One of its important conclusions is that lossless audio coders have reached a limit in what can be achieved for lossless compression of audio. The paper also describes AudioPak, a lossless audio coder with low algorithmic complexity and high performance compared to other lossless audio coders.
This is a theoretical paper that describes the notion of perceptual coding and reviews the progress made in audio compression up to 1993 as a result of advances in classical coding theory, modeling of human perception, and digital signal processing. The perceptual coding techniques mentioned in the article are the ones used in the MPEG-1 encoding standard.
This article is interesting from a historical point of view, as it predates some of the popular audio compression standards (such as MPEG-1 Layer III). The paper describes an audio encoder that is designed using a psychoacoustically derived noise-masking threshold and tested with a set of mono audio sounds sampled at 32 kHz. The work suggests that indistinguishable reconstruction is achieved for those sounds at 96 kbps.
This paper describes some of the work done on audio compression using wavelets, as an alternative to polyphase filter banks for frequency band separation. It also provides a short overview of the MPEG audio coding standard used by the described wavelet encoder, and it discusses wavelets and their use in audio compression from a more general point of view.
The paper provides a detailed description of the main technologies and features of MPEG-1 and MPEG-2 audio coders, concentrating on the advances in MPEG-2 and on compatibility between the two standards. As part of the MPEG-2 overview, MPEG-2 Advanced Audio Coding (AAC) is presented. The article also presents the MPEG-4 standard and discusses some typical applications of MPEG audio compression.
This is an extensive overview of the use of perceptual techniques in digital audio. The paper starts with a description of psychoacoustic principles, concentrating on a detailed overview of the MPEG psychoacoustic signal analysis model. It also discusses filter bank designs and describes the modified discrete cosine transform (MDCT), a filter bank that has become extremely popular in perceptual audio coding. Next, additional lossless audio encoding techniques are described in detail. The paper also provides extensive overviews of the ISO/IEC MPEG family (-1, -2, -4), the Lucent Technologies PAC/EPAC/MPAC, the Dolby AC-2/AC-3, and the Sony ATRAC/SDDS algorithms. Subjective evaluation methodologies for audio quality are also mentioned.
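The modified discrete cosine transform named in this entry can be sketched in a few lines. The example below is a direct, unoptimised textbook formulation with a sine window and 50% overlap-add, under which the time-domain aliasing of neighbouring blocks cancels; it is not taken from the paper, and the function names (`sine_window`, `mdct`, `imdct`, `analysis_synthesis`) and the 2/N scaling convention are assumptions of this sketch.

```python
import math


def sine_window(N):
    """Sine window of length 2N; satisfies w[n]^2 + w[n+N]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]


def mdct(block, window):
    """Windowed MDCT: 2N input samples -> N coefficients."""
    N = len(block) // 2
    return [
        sum(
            window[n] * block[n]
            * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
            for n in range(2 * N)
        )
        for k in range(N)
    ]


def imdct(coeffs, window):
    """Windowed inverse MDCT: N coefficients -> 2N aliased samples."""
    N = len(coeffs)
    return [
        2.0 / N * window[n]
        * sum(
            coeffs[k] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
            for k in range(N)
        )
        for n in range(2 * N)
    ]


def analysis_synthesis(signal, N):
    """Transform and reconstruct a signal whose length is a multiple of N."""
    if len(signal) % N != 0:
        raise ValueError("signal length must be a multiple of N")
    w = sine_window(N)
    # Pad N zeros on each side so every real sample is covered by two blocks.
    padded = [0.0] * N + list(signal) + [0.0] * N
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * N + 1, N):
        coeffs = mdct(padded[start:start + 2 * N], w)
        # Overlap-add: aliasing from adjacent blocks cancels.
        for n, y in enumerate(imdct(coeffs, w)):
            out[start + n] += y
    return out[N:N + len(signal)]


if __name__ == "__main__":
    x = [math.sin(0.3 * n) + 0.05 * n for n in range(16)]
    y = analysis_synthesis(x, N=4)
    print(max(abs(a - b) for a, b in zip(x, y)))
```

Each block of 2N samples yields only N coefficients, so the transform is critically sampled even though consecutive blocks overlap by half. A real coder would quantise the coefficients between `mdct` and `imdct`; here they are passed through unchanged to show the reconstruction property.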
This tutorial describes the theoretical principles behind MPEG audio compression, concentrating on how a lossy algorithm can achieve indistinguishable reconstruction of the signal by exploiting the perceptual properties of the human auditory system. The article also describes the generic principles of psychoacoustic modeling and additional audio compression techniques.
This paper provides a general description of the basic audio signal compression process and discusses some of the most popular audio compression techniques and algorithms. It also provides a good basic description of the capabilities of, and theoretical principles behind, the different layers of the MPEG-1 audio compression standard.
This article discusses a distortion that is not present in the original audio signal but is produced by the human ear (through inter-modulation of a spectral complex) as a result of the non-linearity of human hearing. The paper suggests that when psychoacoustic codecs remove masked components from an audio signal, they also remove the in-ear-generated distortion, and so the listening experience is modified. The paper proposes a method for quantifying, predicting and preserving the in-ear distortion in the audio signal.
This paper describes Dolby AC-3, a flexible audio data compression technology that allows a range of audio channel formats (from monophonic to 5.1) to be encoded into a low-rate bit stream. A complete overview of AC-3 technology and compression techniques is presented. The techniques and features of AC-3 described in the paper include transmission of a variable frequency resolution spectral envelope and hybrid backward/forward adaptive bit allocation.