Concluding the Great MP3 Bitrate Experiment

codinghorror · June 27, 2012, 12:00am

And now for the dramatic conclusion to The Great MP3 Bitrate Experiment you've all been waiting for! The actual bitrates of each audio sample are revealed below, along with how many times each was clicked per the goo.gl URL shortener stats between Thursday, June 21st and Tuesday, June 26th.

This is a companion discussion topic for the original blog entry at: http://www.codinghorror.com/blog/2012/06/concluding-the-great-mp3-bitrate-experiment.html

Jpotisch · June 27, 2012, 12:00am

Very interesting. Good experiment. For your next one, you can eliminate the advantage of the first and second links by randomizing the order each time you serve the page.
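A minimal sketch of that per-request shuffle, assuming the page is rendered server-side. The function name and the last two labels are placeholders for illustration, not anything from the original experiment's code:

```python
import random

# Labels for the five clips. The first three cheese names appear in this thread;
# the last two are placeholders for illustration.
SAMPLES = ["Brie", "Cheddar", "Limburger", "Sample 4", "Sample 5"]


def shuffled_order(samples, rng=random):
    """Return a new list with the samples in random order; the input is left untouched."""
    return rng.sample(samples, k=len(samples))


# Call this once per page view so no clip always sits in the first or second slot.
order = shuffled_order(SAMPLES)
print(order)
```

Logging the order each visitor saw alongside their ratings would also let you check afterwards whether position still influenced the results.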

GianC · June 27, 2012, 12:00am

If you account for the fact that you did 5 comparisons instead of 1, you may find that the anomalous results suddenly become statistically insignificant, à la http://xkcd.com/882/. (It’s possible your spreadsheet accounted for that; I didn’t check.)
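To put rough numbers on the multiple-comparisons point, here is a small sketch. It assumes the 5 comparisons are independent and that each was tested at the conventional 0.05 level; neither assumption is confirmed by the blog post. It uses a Bonferroni correction, which is only one of several possible corrections:

```python
def familywise_error_rate(alpha, comparisons):
    """Chance of at least one false positive across independent tests run at level alpha."""
    return 1 - (1 - alpha) ** comparisons


def bonferroni_alpha(alpha, comparisons):
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    return alpha / comparisons


ALPHA = 0.05       # assumed significance level
COMPARISONS = 5    # number of comparisons mentioned above

fwer = familywise_error_rate(ALPHA, COMPARISONS)
per_test = bonferroni_alpha(ALPHA, COMPARISONS)
print(f"family-wise error rate: {fwer:.3f}")          # 0.226
print(f"Bonferroni per-test threshold: {per_test:.3f}")  # 0.010
```

So under these assumptions there is roughly a 23% chance of at least one spurious "significant" result, and a single comparison would need p < 0.01 to count.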

If you’re interested in this kind of thing, check out HydrogenAudio and the Listening Tests forum there. They run these kinds of public blind listening tests regularly, and the result you got certainly isn’t a surprise. (There are also some tools like ABC/HR that make running them much easier.)

Rossgrady · June 27, 2012, 12:00am

I stopped listening partway through the first sample, because I realized that I was hearing harsh digital artifacts & crappy-sounding digital reverb tails that were almost certainly present in the original “CD-quality” master. That album was recorded in the mid-80s, a particularly bad time to be making a big-budget album, because all the studios were buying early digital gear.

Might I suggest for round two that you pick a record that actually sounds really good on the CD master?

JohnM · June 27, 2012, 12:00am

You need to do ANOVA and then multiple-comparisons corrections using Tukey HSD.

R code looks roughly like this:

myaov <- aov(rating ~ quality, data = mydata)  # One-way ANOVA: rating as a function of quality.
summary(myaov)   # See if anything’s significant overall.
TukeyHSD(myaov)  # See which pairs drive the significance, with correction for multiple comparisons.

David_Hayes · June 27, 2012, 12:00am

You also haven’t accounted for what equipment people were listening on. There is quite a difference in what you’ll hear between el cheapo Apple earbuds (probably what most people used) and a decent pair of headphones/speakers. Bass response is usually the first to go. I have 3 pairs of phones on my desk (two Sennheiser, one Sony) and they all sound very different.

CoreyH · June 27, 2012, 12:00am

It isn’t insane to think that a compressed version sounds better than the CD. Isn’t this the gist of why recording artists prefer analog tape over digital recordings? It smooths the sound out, eliminating certain harsh transitions.

Quantumboy · June 27, 2012, 12:00am

I agree with Rossgrady. You should do another round with a better quality song. And randomize the order of the samples.

About the first sample being highest ranked:
If the source material isn’t so good there is a chance that the compression actually makes it better by filtering out bad stuff. Have you considered this?

Mike_King · June 27, 2012, 12:00am

Not that I disagree with your conclusion, but I do have to take issue with the statement, “nobody can hear the difference between a 320kbps CBR audio file and the CD.” All this experiment proves is that on average people can’t tell the difference. To prove your original statement you’d need to show that, given a large sample set of people, there are none who can reliably select CD quality audio from 320kbps CBR audio given a selection of samples in different formats. A small number of people who have “dog ears” could easily be lost in the static. For what it’s worth, I think you are likely correct.

mdsharpe · June 27, 2012, 12:00am

I’d like to add another vote to the “better quality round” suggestion, as this is interesting subject matter but the choice of song kinda ruins the experiment up front.

Perhaps some precision-crafted electronic music would better reveal the subtle artifacts introduced by compression than some ancient and, by your own admission, awful cheesy pop?

Seth_Heeren · June 27, 2012, 12:00am

What @Mike said, precisely. Only averages are shown here. Exceptionally good ears are completely disregarded, as are exceptionally bad ears.

“even dogs won’t be able to hear the difference between 192kbps VBR MP3 tracks and the original CD”

Well, I did. I admit it would have taken me more than the 15 seconds of listening to distinguish the 320kbps from the FLAC, but 192kbps was significantly unpleasant to listen to. But I had the two top-quality clips right within mere seconds of listening. Without a shred of doubt.

I’m absolutely convinced that if you did an actual test to see how much more accurate some people would classify this than others, you’d find that there are a lot of better-than-dogs persons.

Regardless, this was very enlightening, because I would have guessed everybody would spot the quality differences without a hitch.

Wow.

Skyborne · June 27, 2012, 12:00am

I was almost right: you didn’t sneak in anything lower than 128 Kbps, but you used CBR and it’s VBR that I did my test with.

Other than that, I guess the other four are transparent, since I rated Brie top and my notes say I couldn’t figure a difference between Cheddar and Limburger. =)

Stannius · June 27, 2012, 12:00am

Hellooooo confirmation bias!

Keith_Grant · June 27, 2012, 12:00am

You’ll notice the number of downloads is almost directly proportional to the length of the cheese name given to each sample. Since longer words mean longer (and thus larger) links on the page, I think you’ve definitively proven we’re all just monkeys randomly clicking the screen, giving us a greater chance of hitting the bigger targets.

Mdsf · June 27, 2012, 12:00am

I’d like to see whether there’s a set of people who did rank these in the correct order, and to see whether they could do it again, and again, on some more blind tests.

If they failed in accordance with pure chance on subsequent tests, then I’d believe the claimed result.

For now, I take it as a provisional likelihood.
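A quick sketch of why repeat tests matter. It assumes listeners produce a strict ranking of the 5 samples with no ties, and that someone who cannot hear a difference picks among all orderings with equal probability. Both are simplifying assumptions, not details of how the blog's ratings were actually collected:

```python
import math


def chance_of_perfect_ranking(n_samples):
    """Probability of hitting the one correct ordering of n_samples by pure guessing."""
    return 1 / math.factorial(n_samples)


def chance_of_repeating(n_samples, rounds):
    """Probability of guessing the correct ordering in every one of several independent rounds."""
    return chance_of_perfect_ranking(n_samples) ** rounds


p_once = chance_of_perfect_ranking(5)   # 1 in 120
p_three = chance_of_repeating(5, 3)     # 1 in 1,728,000
print(f"one round by luck: {p_once:.4f}")
print(f"three rounds in a row by luck: {p_three:.2e}")
```

With enough listeners, a few perfect rankings on a single round are expected from luck alone, but repeating the feat across several rounds quickly becomes implausible for a guesser.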

MilanJ · June 27, 2012, 12:00am

Interesting experiment, but unsurprising results. I have just two things to say. First: the choice of song was bad. You really can’t judge audio quality from a rock song (many of the instruments use distortion, are intentionally badly recorded, etc.). You should choose something with high dynamic range, either something classical or electronic (and I don’t mean club dance music by that).
Second: most of the listeners won’t have speakers/monitors/headphones with sufficient sound quality to distinguish between bitrates over 192kbps. I personally think that 192kbps is perfectly good for listening if you play it at standard volume. Only at high volumes and on good equipment are you able to tell the difference.
So why struggle to get better quality? Because people are emotional: it is technically almost equally hard to produce low or high bitrate files (or lossless, for that matter) once you have a clean master, so you naturally want to have the best thing you can. And if you are a musician, then you might want to sample it someday. P.S.: Sorry for my English.

Aaron_Em · June 27, 2012, 12:00am

“precision-crafted electronic music”

Die horribly.

No, what you want for this is an all-digital master of one of the best modern performances of one of the world’s greatest works of art – http://www.amazon.com/Beethoven-Symphony-No-9-Choral/dp/B000001G6W – or, for those unwilling to follow links, the Berlin Philharmonic, under von Karajan, performing Beethoven’s Ninth. (If I’m reading this right, it’s the 1977 recording, digitally remastered in 1990. Pace whoever, with probably good reason, talked about early digital recordings – I’ve got a halfway decent ear, I guess, for someone not rich enough to get away with ‘audiophile’, and I can’t pick out any artifacts in this one. I bet you can’t either.)

Ever wonder why CDs hold as many minutes of music as they do? Because Sony wanted to make sure this performance would fit on one disc. If they thought it was so worthwhile that it was worth designing the entire format around, who are you to say otherwise? No one, that’s who. Put that “Skrillex” garbage down for an hour and treat your abused ears to one of the most transcendently beautiful experiences this planet has to offer, and get you some desperately needed culture besides.

Sharad · June 27, 2012, 12:00am

STEREO SEPARATION and Dolby Pro Logic encoding. The higher bitrates have a greater degree of stereo separation and retain more Dolby Pro Logic encoding.

This goes to personal preference. Do you want the sound to be more ambient or do you want similar sounds to seem like they are more centered? So, with less stereo separation, a lower bitrate can have slightly better vocals of the lead singer because more of those vocals are produced in mono at both the left and right speaker.

PrashantP · June 27, 2012, 12:00am

Everyone who claims that they need more than 192kbps will happily admit that they are part of a small minority and also that it may not apply to all types of music.

Therefore, the question is not really whether the population at large can hear the difference, but whether these individuals can hear the difference.

The current experiment design can not answer this question. In the spirit of Popper and falsification you must try your hardest to prove your own theory wrong by giving these individuals the best possible chance of showing that they can hear a difference.

So a better experiment would mean:

The listener should get to choose the songs.
The songs must be somewhat representative of the listener’s music collection (not a collection of sounds artificially designed to reveal encoding artefacts).
Each listener should be considered on their own, not in aggregate, since we are measuring a human ability. This means that statistical significance has to come from the same individual performing the test several times for several songs.

At the end of this experiment you may find that a small proportion of people can hear a difference with statistical significance. These people would then be within their rights to argue against encoding at 192kbps.
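As a sketch of what per-listener significance could look like, suppose each trial asks the listener to pick the higher-bitrate clip out of two, so a pure guesser is right half the time. The trial format, the trial count, and the score below are illustrative assumptions, not data from the experiment:

```python
import math


def p_value_at_least(correct, trials, p_guess=0.5):
    """Probability of getting `correct` or more right out of `trials` by guessing alone."""
    return sum(
        math.comb(trials, k) * p_guess**k * (1 - p_guess) ** (trials - k)
        for k in range(correct, trials + 1)
    )


# Hypothetical listener: 14 correct identifications out of 16 trials.
p = p_value_at_least(14, 16)
print(f"chance a guesser scores 14/16 or better: {p:.4f}")  # 0.0021
```

A result like that from one person, repeated across several songs, would be hard to attribute to luck, whereas the same person's ratings averaged into a large crowd would barely move the aggregate.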

I don’t blame you for doing a simple experiment. I wouldn’t have thought of the above points the first time either, and there may be some more to consider. But you should accept that this experiment doesn’t conclusively answer the relevant question. The best experiment would be much more difficult and expensive to do. Unfortunately, it’s quite common in some areas of science for the cheap and convenient experiment to be deemed conclusive when it isn’t.

(Personally, I probably can’t tell the difference beyond 192kbps VBR but I wouldn’t be surprised if some people could.)

MilanJ · June 27, 2012, 12:00am

One more thing: some MP3 players (almost all Cowons, some SanDisks, some Sonys) have reconstructive algorithms which try to improve the sound quality of MP3s, so it is possible that MP3s enhanced by these algorithms might sound nicer to a human listener than dry CD-quality audio (which is not enhanced by the algorithms because its bitrate is already sufficient). It has nothing to do with your experiment, though.