MP3



         


MP3 (or, more precisely, MPEG-1/2 Audio Layer 3) is an audio compression algorithm capable of greatly reducing the amount of data required to reproduce audio, while sounding like a faithful reproduction of the original uncompressed audio to the listener.


History

MPEG-1/2 Layer 2 encoding started life as the Digital Audio Broadcast (DAB) project initiated by Fraunhofer IIS-A. This project was financed by the European Union as a part of the EUREKA research program where it was commonly known as EU-147.

EU-147 ran from 1987 to 1994. In 1991 there were two proposals available: Musicam (known as Layer II) and ASPEC (Adaptive Spectral Perceptual Entropy Coding) (with similarities to MP3). Musicam was chosen due to its simplicity and error resistance.

A working group around Karlheinz Brandenburg and Jürgen Herre took ideas from Musicam and ASPEC, added some of their own ideas, and created MP3, which was designed to achieve the same quality at 128 kbit/s as MP2 at 192 kbit/s.

Both algorithms were finalized in 1992 as part of MPEG-1, the first phase of work by MPEG, which resulted in the international standard ISO/IEC 11172-3, published in 1993. Further work on MPEG Audio was finalized in 1994 as part of the second phase, MPEG-2, which resulted in the international standard ISO/IEC 13818-3, originally published in 1995.

The compression efficiency of lossy encoders is typically stated as a bitrate, because the compression ratio depends on the bit depth and sampling rate of the input signal. Nevertheless, compression ratios are often published using the CD parameters as the reference (44.1 kHz, 2x16 bit). Sometimes the DAT SP parameters are used instead (48 kHz, 2x16 bit). The compression ratio against this reference is higher for the same encoded bitrate, which shows why the term compression ratio is problematic for lossy encoders.
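The dependence of the ratio on the reference format can be checked with a little arithmetic. The following is a minimal sketch; the function names are hypothetical and 128 kbit/s is used only as an example encoded bitrate.

```python
def pcm_bitrate(sample_rate_hz, channels, bits_per_sample):
    """Bitrate of uncompressed PCM audio in bits per second."""
    return sample_rate_hz * channels * bits_per_sample


def compression_ratio(pcm_bps, encoded_bps):
    """Ratio of the uncompressed bitrate to the encoded bitrate."""
    return pcm_bps / encoded_bps


cd_bps = pcm_bitrate(44_100, 2, 16)    # CD reference: 1,411,200 bit/s
dat_bps = pcm_bitrate(48_000, 2, 16)   # DAT SP reference: 1,536,000 bit/s

# The same 128 kbit/s stream yields two different "compression ratios".
print(f"{compression_ratio(cd_bps, 128_000):.1f}:1 against CD")    # 11.0:1
print(f"{compression_ratio(dat_bps, 128_000):.1f}:1 against DAT")  # 12.0:1
```

The encoded stream is identical in both cases; only the reference changes, which is why a bitrate is the less ambiguous figure.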

Karlheinz Brandenburg used a CD recording of Suzanne Vega's song Tom's Diner as his test material while tuning the MP3 compression algorithm. The recording was chosen for its softness and simplicity, which make imperfections introduced by the compression easier to hear during playback.


FhG publishes the following compression ratios and data rates for MPEG-1 Layer 1, 2 and 3 on its official webpage, intended for comparison:

These values are probably overly optimistic, likely for public relations reasons (that is, to promote Layer 3), because quality depends not only on the encoding format but also on the quality of the psychoacoustic model used by the encoder. Typical Layer 1 encoders use very simple psychoacoustics, so they need a higher bitrate for transparent encoding.

In other words, the quoted bitrates are not equivalent in quality, nor are the qualities necessarily optimal (it is generally agreed that Layer 3 at 112 to 128 kbit/s does not give excellent sound), so the comparison is probably not reliable as an objective source.


More realistic bitrates are:

A new file format is typically compared by pitting a highly tuned encoder for the new format against a medium-quality encoder for the old one.

At its heart, the MP3 format uses a hybrid transform to convert a time-domain signal into a frequency-domain signal:

According to the MPEG specifications, AAC from MPEG-2 is intended to be the successor of the MP3 format. In practice, however, numerous patent and licensing issues with various parts of the MPEG specifications have prompted a significant movement to create and popularise audio formats and algorithms free of those problems. The most popular of these is probably Ogg Vorbis, which seems the most likely successor to MP3 as the popular format for audio interchange. Nevertheless, any such succession is unlikely to happen for a significant amount of time: MP3 enjoys extremely wide popularity and support, not just from end users and software but also from hardware such as DVD players. (Note: Ogg Vorbis is also the format used for sounds in the BambooWeb.)



MP2 and MP3 and the Internet

In October 1993, MP2 (MPEG-1 Audio Layer 2) files appeared on the Internet. They were often played back using the Xing MPEG Audio Player and, later, MAPlay, a program for UNIX by Tobias Bading first released on February 22, 1994. (MAPlay was also ported to Microsoft Windows.) Initially the only encoder available for MP2 production was the Xing Encoder, accompanied by the program CDDA2WAV, a CD ripper that copied CD audio to hard disks.

In the first half of 1995, MP3 files (file representations of MPEG-1 Audio Layer 3 data) began flourishing on the Internet. Their popularity was largely due to, and intertwined with, the success of companies and software packages such as Nullsoft's Winamp, mpg123 and the now Roxio-owned Napster.

Controversies regarding peer-to-peer file sharing have flourished in recent years. Most of the files shared are MP3s, because the high compression makes it practical to share audio that would otherwise be too large and cumbersome to transfer.



Quality of MP3 audio

Many listeners accept an MP3 bitrate of 128 kilobits per second (kbit/s) as near enough to CD quality; this gives a compression ratio of approximately 11:1. Listening tests show, however, that with a bit of practice many listeners can reliably distinguish 128 kbit/s MP3s from CD originals, and some listeners find 128 kbit/s unacceptably low in quality. Even where differences are perceptible, the result may be acceptable in some listening environments, such as a noisy car or train.

A few possible encoders:

Some early encoders are not widely used any more: ISO dist10 reference code, Xing, BladeEnc, and ACM Producer Pro.

The quality of MP3 audio depends on the quality of the encoder and the difficulty of the signal being encoded. Good encoders produce acceptable quality at 128 to 160 kbit/s and near transparency at 160 to 192 kbit/s. Low-quality encoders may never reach near transparency, not even at 320 kbit/s. It is therefore pointless to speak of 128 kbit/s or 192 kbit/s quality without naming the encoder: a 128 kbit/s MP3 produced by a good encoder might sound better than a 192 kbit/s MP3 produced by a bad one.

Additionally, it is important to note that quality is quite subjective. For some listeners a given bitrate is adequate; others may need more. The numbers given above are rough guidelines that work for many people, but in the field of lossy audio compression the only true measure of the quality of a compression process is to listen to the results.

An important feature of MP3 is that it is lossy, meaning that it removes information from the input in order to save space. As with most modern lossy encoders, MP3 encoders work hard to ensure that the sounds they remove cannot be detected by human listeners, by modeling characteristics of human hearing such as noise masking. This yields huge savings in storage space with reasonable and acceptable (although detectable) losses in fidelity.

A few experienced listeners can tell the difference from the original at 192 kbit/s, and even at 256 kbit/s with some of the less powerful (and obsolete) encoders. If your aim is to archive sound files with no loss of quality, lossless audio compression such as Monkey's Audio, FLAC, SHN, or LPAC may be of more interest. These generally compress a 16-bit PCM audio stream to approximately 51-80% of the original size, depending on the characteristics of the audio itself, with no loss of quality.


Bit Rate

The bit rate, i.e. the number of binary digits streamed per second, is variable for MP3 files. As a general rule, the higher the bitrate, the more information from the original sound file is retained, and thus the higher the quality of the played-back audio. In the early days of MP3 encoding, a fixed bit rate was used for the entire file.
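For a fixed bit rate, the size of the audio data follows directly from the bitrate and the playing time. The sketch below illustrates the arithmetic only; the function name and the four-minute duration are assumptions chosen for the example, and tag data is ignored.

```python
def cbr_size_bytes(bitrate_kbps, duration_s):
    """Approximate audio payload size of a constant-bitrate stream, in bytes."""
    return bitrate_kbps * 1000 * duration_s // 8


# A hypothetical four-minute track at three common bitrates.
for kbps in (128, 192, 320):
    megabytes = cbr_size_bytes(kbps, 240) / 1_000_000
    print(f"{kbps} kbit/s for 240 s: {megabytes:.2f} MB")
```

Doubling the bitrate doubles the file size, which is the trade-off against quality described above.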

Bit rates available in MPEG-1 Layer 3 are 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256 and 320 kbit/s (1 kbit/s = 10^3 bits per second), and the available sample frequencies are 32, 44.1 and 48 kHz. 44.1 kHz is almost always used, as this is the audio CD frequency, and 128 kbit/s is a de facto "good enough" standard. MPEG-2 and the unofficial MPEG-2.5 add more bitrates: 8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 144 and 160 kbit/s.
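The permitted combinations can be written down as small tables. The sketch below also computes the length of a single frame; the figure of 1152 samples per frame (hence the factor 144 = 1152 / 8) is not stated in this article and is taken here as an assumption about MPEG-1 Layer 3, and the function name is hypothetical.

```python
MPEG1_L3_BITRATES_KBPS = (32, 40, 48, 56, 64, 80, 96, 112,
                          128, 160, 192, 224, 256, 320)
MPEG1_SAMPLE_RATES_HZ = (32_000, 44_100, 48_000)
SAMPLES_PER_FRAME = 1152  # assumption: MPEG-1 Layer 3 frame size in samples


def frame_length_bytes(bitrate_kbps, sample_rate_hz, padding=False):
    """Length of one frame in bytes for a valid bitrate / sample-rate pair."""
    if bitrate_kbps not in MPEG1_L3_BITRATES_KBPS:
        raise ValueError(f"unsupported bitrate: {bitrate_kbps} kbit/s")
    if sample_rate_hz not in MPEG1_SAMPLE_RATES_HZ:
        raise ValueError(f"unsupported sample rate: {sample_rate_hz} Hz")
    length = (SAMPLES_PER_FRAME // 8) * bitrate_kbps * 1000 // sample_rate_hz
    # An optional padding byte lets the average rate match the nominal bitrate.
    return length + (1 if padding else 0)


print(frame_length_bytes(128, 44_100))        # 417
print(frame_length_bytes(128, 44_100, True))  # 418
print(frame_length_bytes(128, 48_000))        # 384
```

Rejecting values outside the tables mirrors the fact that only the listed bitrates and sample frequencies are available.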

Variable bitrates (VBR) are also possible. Audio in an MP3 file is divided into frames, each of which carries a bitrate marker, so the bitrate can change dynamically as the file is played. This was not originally done, but VBR is in extensive use today. The technique makes it possible to use more bits for parts of the sound with high dynamics (much "sound movement") and fewer bits for parts with low dynamics, further increasing quality and decreasing storage space. It is comparable to a sound-activated tape recorder, which saves tape during silence so that more is available when there is sound to record. Some encoders use this technique to a great extent.
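The per-frame bitrate marker can be illustrated by walking a stream frame by frame. This is a minimal sketch, not a robust parser: the header bit layout and the frame-length formula are assumptions about MPEG-1 Layer 3 that are not given in this article, all function names are hypothetical, and the input is a synthetic stream built for the example rather than real audio.

```python
# Bitrate and sample-rate tables indexed by header fields (MPEG-1 Layer 3).
# Index 0 ("free") and index 15 (invalid) are treated as unsupported here.
BITRATES_KBPS = (None, 32, 40, 48, 56, 64, 80, 96, 112,
                 128, 160, 192, 224, 256, 320, None)
SAMPLE_RATES_HZ = (44_100, 48_000, 32_000, None)


def parse_header(header):
    """Return (bitrate_kbps, sample_rate_hz, padding), or None if not a frame."""
    if len(header) < 4:
        return None
    b1, b2 = header[1], header[2]
    if header[0] != 0xFF or (b1 & 0xE0) != 0xE0:  # 11-bit frame sync
        return None
    version = (b1 >> 3) & 0x03  # 3 means MPEG-1
    layer = (b1 >> 1) & 0x03    # 1 means Layer 3
    if version != 3 or layer != 1:
        return None
    bitrate = BITRATES_KBPS[b2 >> 4]
    sample_rate = SAMPLE_RATES_HZ[(b2 >> 2) & 0x03]
    if bitrate is None or sample_rate is None:
        return None
    return bitrate, sample_rate, bool((b2 >> 1) & 0x01)


def frame_length(bitrate_kbps, sample_rate_hz, padding):
    """Frame length in bytes, including the 4-byte header."""
    return 144 * bitrate_kbps * 1000 // sample_rate_hz + (1 if padding else 0)


def frame_bitrates(data):
    """Collect the bitrate marker of every frame found in data."""
    bitrates, pos = [], 0
    while pos + 4 <= len(data):
        parsed = parse_header(data[pos:pos + 4])
        if parsed is None:
            pos += 1  # no header here; resynchronise byte by byte
            continue
        bitrates.append(parsed[0])
        pos += frame_length(*parsed)
    return bitrates


def fake_frame(bitrate_index):
    """Synthetic frame: a valid 44.1 kHz header followed by zero bytes."""
    header = bytes([0xFF, 0xFB, bitrate_index << 4, 0x00])
    length = frame_length(BITRATES_KBPS[bitrate_index], 44_100, False)
    return header + bytes(length - 4)


stream = fake_frame(9) + fake_frame(5) + fake_frame(11)
rates = frame_bitrates(stream)
print(rates)                    # [128, 64, 192]
print(sum(rates) / len(rates))  # 128.0
```

Because every frame at a given sample rate covers the same amount of playing time, the mean of the per-frame markers gives the average bitrate of a VBR stream. A real file may also contain tags or other data that happen to resemble a frame sync, which this sketch does not guard against.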


Design limitations of MP3

There are several flaws in the MP3 file format which cannot be fixed by a good encoder. These flaws are inherent properties of the MP3 file format. Newer codecs such as Vorbis, AAC, Musepack or WMA v9 have fixed these flaws.


Encoding of MP3 audio

The MPEG-1 standard does not include a precise specification for an MP3 encoder. The decoding algorithm and file format, by contrast, are well defined. Implementors of the standard were expected to devise their own algorithms for removing parts of the information in the raw audio (or rather in its MDCT representation in the frequency domain). This is the domain of psychoacoustics, which aims to understand how human acoustical perception works (both in our ears and in our brains).

As a result, there are many different MP3 encoders available, each producing files of differing quality. Comparisons are widely available, so it is easy for a prospective user to research the best choice. Keep in mind that an encoder that performs well at higher bitrates (such as LAME, which is in widespread use for that purpose) is not necessarily as good at lower bitrates.
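To give a feel for the MDCT representation mentioned above, the following sketch implements a plain, unwindowed MDCT and its inverse directly from the textbook definition. It is an illustration only: the block size and test signal are arbitrary assumptions, the function names are hypothetical, and a real MP3 encoder applies windowing and further processing that are omitted here.

```python
import math


def mdct(block):
    """Forward MDCT: 2N time samples in, N frequency coefficients out."""
    n = len(block) // 2
    return [
        sum(x * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
            for t, x in enumerate(block))
        for k in range(n)
    ]


def imdct(coeffs):
    """Inverse MDCT: N coefficients in, 2N aliased time samples out."""
    n = len(coeffs)
    return [
        sum(c * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
            for k, c in enumerate(coeffs)) / n
        for t in range(2 * n)
    ]


N = 8  # arbitrary block size for the demonstration
signal = [math.sin(0.3 * i) + 0.5 * math.cos(1.1 * i) for i in range(3 * N)]

# Two blocks of 2N samples that overlap by N samples.
first = imdct(mdct(signal[0:2 * N]))
second = imdct(mdct(signal[N:3 * N]))

# Overlap-add: the aliasing in the two halves cancels, recovering the input.
middle = [a + b for a, b in zip(first[N:], second[:N])]
error = max(abs(m - s) for m, s in zip(middle, signal[N:2 * N]))
print(error < 1e-9)  # True
```

Each block of 2N samples produces only N coefficients, yet overlapping blocks reconstruct the signal exactly. A lossy encoder works on such coefficients, discarding or coarsening those that a psychoacoustic model judges inaudible.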


Decoding of MP3 audio

Decoding, on the other hand, is carefully defined in the standard. Most decoders are "bitstream compliant", meaning that they each produce exactly the same uncompressed output from a given MP3 file. Decoders are therefore compared almost exclusively on efficiency, that is, on how much memory or CPU time they use in the decoding process.


ID3

See main article ID3

ID3 is a tagging format that allows information such as the title, artist, album, or track number of the MP3 to be added to the file.
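As an illustration, the simplest variant of the format can be read with a few lines of code. The sketch below assumes the ID3v1 layout (a fixed 128-byte block at the end of the file beginning with "TAG"), which is not described in this article; the function name and the example field values are hypothetical.

```python
def parse_id3v1(data):
    """Return a dict of ID3v1 fields from the last 128 bytes, or None if absent."""
    if len(data) < 128:
        return None
    tag = data[-128:]
    if tag[:3] != b"TAG":
        return None

    def text(raw):
        # Fields are fixed-width and padded, usually with zero bytes.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    info = {
        "title": text(tag[3:33]),
        "artist": text(tag[33:63]),
        "album": text(tag[63:93]),
        "year": text(tag[93:97]),
    }
    # ID3v1.1 convention: a zero byte followed by the track number.
    if tag[125] == 0 and tag[126] != 0:
        info["track"] = tag[126]
    return info


# Build a synthetic tag with placeholder values and append it to fake audio.
example_tag = (
    b"TAG"
    + b"Example Title".ljust(30, b"\x00")
    + b"Example Artist".ljust(30, b"\x00")
    + b"Example Album".ljust(30, b"\x00")
    + b"2004"
    + bytes(28)          # empty comment
    + b"\x00" + bytes([7])  # ID3v1.1 track number
    + bytes([255])       # genre byte
)
print(parse_id3v1(b"\x00" * 500 + example_tag))
```

Later versions of ID3 use a different, more flexible layout that this sketch does not handle.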


Competing tag formats and replay gain

A few standards for encoding the replay gain of an MP3 file have been proposed. The idea is to normalize the volume of files played one after another, so that the volume does not fluctuate between tracks and harm the listening experience. None of the proposals have caught on.

The currently most widespread "standard", as implemented in Foobar2000, is to append an APEv2 tag, stored at the end of the file much like an ID3 tag. This standard is known simply as "Replaygain". APEv2 is a competing format to ID3v1/v2, originally developed for the MPC file format, but it can coexist with ID3 tags in the same file.
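The idea of normalizing volume between tracks can be sketched as follows. This is a deliberately simplified stand-in that matches plain RMS levels; it is not the algorithm used by "Replaygain" or any other proposal, and the target level, function names and test tones are all assumptions made for the example.

```python
import math


def rms_db(samples):
    """RMS level in dB relative to full scale, for samples in [-1.0, 1.0]."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square) if mean_square > 0 else float("-inf")


def gain_to_target_db(samples, target_db=-20.0):
    """Gain in dB that would bring the track to the target RMS level."""
    level = rms_db(samples)
    return 0.0 if level == float("-inf") else target_db - level


def apply_gain(samples, gain_db):
    """Scale samples by a gain in dB, clipping to the valid range."""
    scale = 10 ** (gain_db / 20)
    return [max(-1.0, min(1.0, s * scale)) for s in samples]


def tone(amplitude, length=4410):
    """Test tone with a whole number of periods at the given amplitude."""
    return [amplitude * math.sin(2 * math.pi * 10 * i / length)
            for i in range(length)]


quiet, loud = tone(0.05), tone(0.8)
for name, track in (("quiet", quiet), ("loud", loud)):
    gain = gain_to_target_db(track)
    levelled = apply_gain(track, gain)
    print(f"{name}: gain {gain:+.1f} dB, new level {rms_db(levelled):.1f} dB")
```

A player following this approach would store only the computed gain value in a tag and apply it at playback time, leaving the audio data itself unchanged.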


Similar/interchangeable technologies

Many other lossy audio codecs exist, including:

mp3PRO, MP3, AAC, and MP2 are all members of the same technological family and depend on roughly similar psychoacoustic models. The Fraunhofer Gesellschaft owns many of the basic patents underlying these codecs, with Dolby Labs, Sony, Thomson Consumer Electronics, and AT&T holding other key patents.

There are also some non-lossy (lossless) audio compression methods used on the internet. While they are not similar to MP3, they are good examples of other compression schemes available. These include:

MP3, which was designed and tuned for use alongside MPEG-1/2 Video, generally performs poorly on monaural data at less than 48 kbit/s or in stereo at less than 80 kbit/s.

Though proponents of newer codecs such as WMA and RealAudio have asserted that their respective algorithms can achieve CD quality at 64 kbit/s, listening tests have shown otherwise; however, the quality of these codecs at 64 kbit/s is definitely superior to MP3 at the same bandwidth.

Thomson claims that its mp3PRO codec achieves CD quality at 64 kbit/s, but listeners have reported that a 64 kbit/s mp3PRO file compares in quality to a 112 kbit/s MP3 file and does not come reasonably close to CD quality until about 80 kbit/s.

The developers of the Ogg Vorbis algorithm claim that their algorithm surpasses MP3 and WMA Standard sound quality, and provide listening tests to attempt to prove that claim.


Licensing and patent issues

Thomson Consumer Electronics controls licensing of the MPEG-1/2 Layer 3 patents in countries such as the United States of America and Japan that recognize software patents. Thomson has decided to attempt to collect royalties for the patents.

In September 1998, the Fraunhofer Institute sent a letter to several developers of MP3 software stating that a license was required to "distribute and/or sell decoders and/or encoders". The letter claimed that unlicensed products "infringe the patent rights of Fraunhofer and THOMSON. To make, sell and/or distribute products using the [MPEG Layer-3] standard and thus our patents, you need to obtain a license under these patents from us."

These patent issues significantly slowed the development of unlicensed MP3 software and led to increased focus on creating and popularising alternatives such as WMA and Ogg Vorbis. Microsoft, the maker of the Windows operating system, chose to move away from MP3 to its own proprietary Windows Media formats to avoid the licensing issues associated with the patents. Until the key patents expire, Open Source Software / Free Software encoders and players appear to be illegal in countries that recognize software patents. Note that the BambooWeb does not accept MP3 formats, and probably will not until the patents expire, since doing so would require users to use a patent-encumbered format.


In spite of the patent restrictions, the MP3 format remains in wide use; the reasons for this appear to be the network effects caused by:


Online music resources

Tools such as iRate try to make it easier to find music that matches the listener's tastes. There are several online music stores. Apple's iTunes store is presently the most popular commercial online music offering. A controversial MP3 portal is the Russian site AllOfMP3.com, which offers downloads of thousands of albums and video clips by mainstream artists, priced at $10 per gigabyte. There are also several online columnists who edit news sites focused on digital music and the grassroots community it spawned. They include Richard Menta's , an early MP3 news site started in 1998, Jon Newton's , and Thomas Mennecke's





This article is from Wikipedia. All text is available under the terms of the GNU Free Documentation License.