Why Audio Compression Is Bad


NOSValves

Good read

http://www.popsci.com/science/article/2013-02/why-audio-compression-sounds-so-bad

Why Your Music Files Sound Like Crap

All of the compression algorithms are based on an outdated understanding of how the human ear works.
By Martha Harbison | Posted 02.28.2013 at 12:02 PM

[Image: Audio data compression. Credit: Moehre1992]

Those music files -- be they MP3, AAC or WMA -- that you listen to on your portable music players are pretty crap when it comes to accurately reproducing the original recording. But just how crap they really are wasn't known until now.

Audio data compression, at its heart, is pretty simple. A piece of software compresses digital audio data by chopping out redundancy and approximating the audio signal over discrete windows of time. The fewer bits the encoder spends on each window, the cruder the approximation. This is why an MP3 encoded at a high bit rate is of higher quality than an MP3 encoded at a low bit rate.
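To make that concrete -- this is not MP3's actual pipeline (which uses a polyphase filterbank, an MDCT, and a psychoacoustic model), just a toy sketch of the core idea of approximating each time window and discarding small details:

```python
import numpy as np

def toy_compress(signal, window=1024, keep=0.05):
    """Keep only the strongest `keep` fraction of FFT coefficients per window.
    A toy illustration of transform-domain lossy coding, not real MP3."""
    out = np.array(signal, dtype=float)
    n_full = (len(signal) // window) * window
    for start in range(0, n_full, window):
        spectrum = np.fft.rfft(out[start:start + window])
        n_keep = max(1, int(keep * len(spectrum)))
        threshold = np.sort(np.abs(spectrum))[-n_keep]
        spectrum[np.abs(spectrum) < threshold] = 0  # discard the small details
        out[start:start + window] = np.fft.irfft(spectrum, n=window)
    return out  # any tail shorter than one window passes through untouched

# Two close tones; shrink `keep` and the reconstruction error grows.
t = np.arange(44_100) / 44_100
x = np.sin(2 * np.pi * 440 * t) + 0.3 * np.sin(2 * np.pi * 443 * t)
print(f"max reconstruction error: {np.max(np.abs(toy_compress(x) - x)):.4f}")
```

Shrinking `keep` is the quality-for-size trade in miniature: fewer retained coefficients, smaller file, cruder approximation.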

To test whether the human ear can beat certain theoretical limits assumed by audio compression algorithms, physicists Jacob N. Oppenheim and Marcelo O. Magnasco at Rockefeller University in New York City played tones to test subjects. The researchers wanted to see if the subjects could differentiate the timing of the tones and any frequency differences between them. The fundamental premise of the research is that almost all audio compression algorithms, including the MP3 codec, analyze the signal with linear time-frequency models, which were developed long before scientists understood the finer details of how the human auditory system works. A linear model holds that the timing of a sound and the frequency of that sound have specific cut-off limits: at some point, two tones are so close together in frequency or in time that a person should not be able to hear a difference. Further, time and frequency are related such that higher precision in one axis (say, time) means a corresponding decrease in precision in the other -- the acoustic analogue of the Fourier uncertainty principle. If human hearing followed linear rules, we shouldn't hear a degradation in quality (given a high enough bit rate -- we're not talking some horrible 192 kbps rip) between a high-quality file and the original recording.
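For reference, the linear limit the researchers tested against is usually written as the Gabor (Fourier) uncertainty relation, with Δt and Δf taken as the standard deviations of a sound's energy in time and in frequency:

```latex
\Delta t \,\Delta f \;\ge\; \frac{1}{4\pi}
```

Any linear time-frequency analysis is bound by this product; the subjects described below produced timing-frequency discriminations well inside it.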

[Figure: The Final Task. The researchers played three notes simultaneously. Red indicates the reference note; green, a note that varies in frequency from the reference; blue, a note that occurs at a different time than the reference. Credit: Jacob N. Oppenheim and Marcelo O. Magnasco]

The experiment was broken up into five tasks in which subjects listened to a reference tone coupled with a tone that varied from the reference (a sketch of how such tone pairs can be generated follows the list). The tasks tested the following:

1) frequency differences only
2) timing differences only
3) frequency differences with a distracting note
4) timing differences with a distracting note
5) simultaneously determining both frequency and timing differences
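A minimal sketch of generating such paired test tones -- the sample rate, frequencies, offsets, and envelope widths here are hypothetical placeholders, not the paper's actual stimulus parameters:

```python
import numpy as np

FS = 44_100  # sample rate in Hz (an assumption, not the paper's setup)

def gabor_tone(f0, t0, sigma=0.05, dur=0.5):
    """Gaussian-windowed sinusoid: the classic minimum-uncertainty 'note'.
    f0: frequency in Hz, t0: center time in s, sigma: envelope width in s."""
    t = np.arange(int(dur * FS)) / FS
    return np.exp(-((t - t0) ** 2) / (2 * sigma ** 2)) * np.sin(2 * np.pi * f0 * t)

reference = gabor_tone(440.0, 0.25)      # the fixed reference note
freq_shifted = gabor_tone(441.0, 0.25)   # Task-1-style pair: tiny frequency offset
time_shifted = gabor_tone(440.0, 0.253)  # Task-2-style pair: a 3 ms timing offset
```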

I don't think it will come as a surprise to a lot of audiophiles, but human hearing most certainly does not have a linear response curve. In fact, during Task 5 -- considered the most complex of the tasks -- many of the test subjects could hear differences between tones with up to 13 times the acuity the linear model predicts. Those most skilled at differentiating time and frequency differences between tones were musicians. One, an electronic musician, could differentiate between tones played about three milliseconds apart -- remarkable because a single period of the tone lasts only 2.27 milliseconds (one cycle of a 440 Hz tone). The same subject didn't perform as well as others in frequency differentiation. Another, a professional musician, was exceptional at frequency differentiation and good at temporal differentiation of the tones.

[Figure: Audio Overachievers. Results from multiple test subjects completing Task 5; two subjects who stood out to the researchers are circled in pink. The circle at the left shows a subject with extreme temporal acuity, who could tell the difference between two notes just three milliseconds apart. The circle at the bottom shows a subject with extreme frequency acuity (and respectable temporal acuity as well), up to 13 times more accurate than a linear model predicts a human ear could be. Credit: Jacob N. Oppenheim and Marcelo O. Magnasco]

Even more interesting, the researchers found that composers and conductors had the best overall performance on Task 5, which they attribute to the need to discern the frequency and timing of many simultaneous notes across an entire symphony orchestra. Finally, the researchers found that temporal acuity -- discerning time differences between notes -- was much better developed than frequency acuity in most of the test subjects.

So, what does this all mean? The authors plainly state that audio engineers should rethink how they approach audio compression -- and possibly jettison the linear models they use to achieve that compression altogether. They also suggest that revisiting audio-processing algorithms will improve speech-recognition software and could have applications in sonar research or radio astronomy. That's awesome and all, but I can't say I look forward to re-ripping my entire music collection once those codecs become available.

That's a very interesting article. Thanks for posting it. Hopefully, in time, the compression issues will go away thanks to more of this type of research and testing.

And the NBS preamp with VRDs will be ready for all the great uncompressed music that will come. ;)

I'll certainly take uncompressed audio any minute of any day, but when it's time to upload to YouTube and I'm forced to compress the file, I've had excellent results (for compressed audio) using 2-pass VBR and selecting the highest instantaneous bit rate the compressor allows (typically 256 kbps).

Starting with 48 kHz, 16-bit source material helps too.

It doubles the render time, but the end result is audibly better than what AAC will deliver IME -- granted, nowhere near the quality of the original file, especially after watermarking.
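The poster doesn't name the exact toolchain, so as one hedged illustration only: a comparable high-quality VBR audio encode using ffmpeg's LAME wrapper (hypothetical filenames; `-q:a 0` is LAME's best VBR setting, averaging roughly 220-260 kbps):

```python
import subprocess

# Hypothetical filenames; assumes ffmpeg built with libmp3lame is installed.
# "-q:a 0" selects LAME's highest-quality VBR mode, which lets the encoder
# spend up to its maximum instantaneous bit rate where the music needs it.
subprocess.run([
    "ffmpeg", "-i", "master_48k_16bit.wav",
    "-c:a", "libmp3lame", "-q:a", "0",
    "upload_ready.mp3",
], check=True)
```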

Very, very interesting, Craig.

Not only speech-recognition software, but telephone conversations themselves. Since the early days of digital, carriers have been compressing voice to squeeze it onto data lines. I remember it used to sound like people on the other end were in a can.

Changing the codecs to non-linear models might increase the data demands on the telephone network, which would discourage adoption unless consumers demanded it. Perhaps if Verizon or AT&T starts, others will follow... A new project for Elon Musk?
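Worth noting, as an aside: classic telephone digitization (G.711) already applies one simple nonlinearity, mu-law companding, though it addresses amplitude quantization noise rather than the time-frequency model the paper critiques. A minimal sketch:

```python
import numpy as np

def mu_law_encode(x, mu=255):
    """G.711-style mu-law companding: a nonlinear amplitude curve that
    spends more resolution on quiet samples. Expects x in [-1, 1]."""
    return np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)

def mu_law_decode(y, mu=255):
    """Inverse companding (expansion)."""
    return np.sign(y) * ((1 + mu) ** np.abs(y) - 1) / mu

x = np.array([-0.5, -0.01, 0.0, 0.01, 0.5])
roundtrip = mu_law_decode(mu_law_encode(x))
print(np.max(np.abs(roundtrip - x)))  # ~0: the curve itself is invertible;
# the loss in real telephony comes from quantizing the companded values.
```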

This test is why YOU CAN'T TAKE US ANYWHERE. Every show I see, I note something 'wrong' with the performance or sound. I suppose after a few years of practicing sound-system engineering, I've become acutely aware of timing, phase, and tonal variations. So much so that trucks driving by the house, aircraft, even the minute noises the house makes drive me totally insane. I would live in an anechoic chamber if I could. Now... time to go outside and enjoy the snow-draped scenery. I love snow; it quiets the world.

Btw - compression isn't all bad. We use it all the time in live sound. Sometimes, sadly, to make it 'sound like the record,' but most often to squeeze some dynamics out of individual instruments so the sounds meld together better. Today's digital consoles have a gate and a compressor on every single channel, and most pro engineers use them all to some degree. Generally, in live applications, it's just a slight ratio at a very slight threshold. If I squeeze an instrument's peaks down 3-5 dB, that's enough for me. It's like having extra hands on the console. But satellite radio, Pandora, that stuff - I can't take it. The better your sound system, the more you will hear the nasties caused by compression algorithms. I keep a stock of AICC tunes on my iPod to test systems, but the majority of my 'break' music is high-bit-rate MP3.
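For anyone curious what that gentle squeeze looks like numerically, here is a minimal static compressor curve -- a sketch only, since real console compressors add attack/release time constants and makeup gain:

```python
import numpy as np

def compress_db(level_db, threshold_db=-14.0, ratio=2.0):
    """Static compressor curve: above the threshold, the output rises only
    1 dB for every `ratio` dB of input -- the gentle low-ratio, modest-
    threshold setting described in the post above."""
    over = np.maximum(level_db - threshold_db, 0.0)
    return level_db - over * (1.0 - 1.0 / ratio)

peaks = np.array([-30.0, -20.0, -14.0, -6.0])  # input peak levels, dBFS
print(compress_db(peaks))  # [-30. -20. -14. -10.]: the -6 dB peak drops 4 dB,
# right in the 3-5 dB "squeeze" range mentioned above.
```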

Craig, if you'd just left the "Why" out of the title we wouldn't need to know anything else.

I suppose compression is required for those who don't care and want to mush all the freeze-dried tunes they can onto their phone or whatever, but they probably can't tell the difference anyway.

For the rest of us, it's simply unnecessary anymore. Storage is basically free, and bandwidth has long since passed the point of readily handling uncompressed audio.
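To put numbers on that claim (my arithmetic, not part of the original post):

```python
# Back-of-the-envelope: the bandwidth of raw CD-quality audio.
sample_rate = 44_100          # samples per second
bit_depth = 16                # bits per sample
channels = 2                  # stereo
bps = sample_rate * bit_depth * channels
print(f"{bps / 1000:.0f} kbps, ~{bps * 3600 / 8 / 1e6:.0f} MB per hour")
# -> 1411 kbps, ~635 MB per hour: trivial for modern storage and networks.
```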

Just say "NO" to compression...

Dave

Quote: "Btw - compression isn't all bad. We use it all the time in live sound. [...] The better your sound system, the more you will hear the nasties caused by compression algorithms."

Data compression as applied to MP3 and other codecs is different from dynamics compression: it permanently removes information that the encoder's psychoacoustic model deems inaudible. (MP3 was standardized by the Moving Picture Experts Group.)

http://en.wikipedia.org/wiki/MP3

One big difference is that with MP3 the damage cannot be undone; the discarded information is gone forever. With dynamics compression, a properly adjusted expander has a chance of recovering the dynamic range of the recording in some cases.

Dynamics compressors can indeed help with multitrack recordings and live sound by matching the individual instruments' dynamics, creating a better-sounding mix. If a vocalist has poor mic technique, a compressor can keep their level more constant and make them sound better through the PA. Compressors are a valuable tool in the sound engineer's arsenal.
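Following the compressor sketch earlier in the thread, an expander is the mirror-image curve; a minimal sketch showing why matched expansion can undo dynamics compression in principle (while nothing can restore what an MP3 encoder discarded):

```python
import numpy as np

def expand_db(level_db, threshold_db=-14.0, ratio=2.0):
    """Static upward-expansion curve: the exact inverse of the static
    compressor sketched earlier (same threshold and ratio). Applying it
    to compressed levels restores the originals -- dynamics compression
    is reversible in principle, unlike MP3's discarded data."""
    over = np.maximum(level_db - threshold_db, 0.0)
    return level_db + over * (ratio - 1.0)

compressed = np.array([-30.0, -20.0, -14.0, -10.0])  # compressor's output
print(expand_db(compressed))  # [-30. -20. -14.  -6.]: dynamics recovered
```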

Thanks for sharing this Craig.

I remember the bad ole days when computer resources were so scarce that many of us thought digital sound for the consumer was a long way off. It's interesting that the enabling technology got the hearing model wrong at first.

I did a hearing test when I was 18 and was able to fool the test by listening for the attack to discern higher or lower frequencies at very low volume, so it's very believable that the ear and brain can be trained. I am really excited that technology could be developed that works with the way we actually hear to improve the quality of digitally reproduced music.

I have a 17-year-old musician son who related to the article well, as he fights the iPod trash people want to play when he DJs for his high school.

It dawns on me that these observations would be very useful in hearing-aid technology.

I hooked my iPod up to my stereo and found, conservatively, the sound quality to be 5-10% below that of the same CD. I guess for some folks the convenience factor must be the driving force. For now, I will stick with my CDs.

Quote: "Btw - compression isn't all bad. We use it all the time in live sound. [...] The better your sound system, the more you will hear the nasties caused by compression algorithms."

I agree, but some less knowledgeable readers may confuse the two types of compression. Compression in recording, as I view it, is compressing the dynamic range of 'hot' voices or instruments, whether the compression happens in the digital or the analog domain.

Data compression is taking musical information out to fit much more 'music time' into a smaller amount of digital storage.
