
Double Blind testing is like fishing without knowing what kind of fish you're after!


russ69


Suppose you get 100 subjects into a DBT and see if they can detect the difference between 435 Hz and 445 Hz. We all agree this can easily be measured, and thus we all agree they are different tones, right? Suppose 98 out of the 100 can't hear the difference. Now we tell the piano tuner he can be off by 2% tuning each key of a grand piano because "no one can tell the difference." He does, and someone sits down to play Clair De Lune. Can the 98 people hear the difference now?

I don't know. But what's at issue as I see it isn't what a theoretical "98 out of 100" hear but what do you and I hear? And can it be demonstrated?

Speaking of which, the scenario you paint hasn't been demonstrated, but the difficulty in telling differences between certain types of gear has.
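For reference, the size of the 435 Hz versus 445 Hz gap in the piano-tuner example can be worked out directly. This is only arithmetic on the two frequencies quoted above; the function name is made up for illustration, and the result says nothing about whether any particular listener would hear the difference.

```python
import math

def interval_between(f_low_hz: float, f_high_hz: float) -> tuple[float, float]:
    """Return (percent difference, interval in cents) between two frequencies."""
    ratio = f_high_hz / f_low_hz
    percent = (ratio - 1.0) * 100.0
    # 1200 cents per octave, so 100 cents per equal-tempered semitone.
    cents = 1200.0 * math.log2(ratio)
    return percent, cents

percent, cents = interval_between(435.0, 445.0)
print(f"{percent:.2f}% apart, {cents:.1f} cents")
```

The gap comes out at roughly 2.3 percent, a bit under 40 cents, where a semitone is 100 cents.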


>Would a DBT between a small bookshelf speaker and a Klipschorn show that most would be able to find a difference?

OK, so I lied and there isn't much doing. Couldn't help but pick on this one.

An interesting thought. I have bookshelf speakers, and I won't mention which, that I think would show no difference from my Klipschorns on a string quartet at a realistic (same as performance) volume.

Dave


It's good to see that TBrennan is back. Hope you're well Tom.

For the record here.

During the first pilgrimage to Indy the good people there had an ABX on speaker wire. IIRC this was zip cord versus home made woven CAT-5. Also IIRC this was originally planned to be Monster Wire versus something else. But this was not done.

There was a selection of CDs. I chose pink noise. I might have brought that CD with me.

My thought was and is that ABX testing is far too tough in the usual setting with music. You are switching successively between different times in the recording. Say: time span 1 versus time span 2, then 3 versus 4, 5 versus 6, etc. In music notation, those would be different bars. In view of the constantly shifting target, I don't find it odd that so many people / systems flunk the ABX test.

That could be solved by using the same 3 second sample restarted with every button push. That is easier to do these days through the magic of computers, but I have not read of it being done.

The alternative is just pink noise. That way the program is the same.
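A rough sketch of the "same 3 second sample on every button push" idea, assuming a helper prepares the trial list ahead of time. It does not play any audio; it only fixes the segment and hides which source X is. All names here (`Trial`, `make_abx_plan`, `score`) are hypothetical.

```python
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class Trial:
    number: int
    x_is: str        # hidden from the listener: "A" or "B"
    start_s: float   # segment start, identical on every trial
    length_s: float  # segment length, identical on every trial

def make_abx_plan(n_trials: int, start_s: float,
                  length_s: float = 3.0, seed=None) -> list[Trial]:
    """Randomize X per trial while keeping the program segment fixed."""
    rng = random.Random(seed)
    return [Trial(i + 1, rng.choice("AB"), start_s, length_s)
            for i in range(n_trials)]

def score(plan: list[Trial], answers: list[str]) -> int:
    """Count trials where the listener's answer matches the hidden X."""
    return sum(1 for trial, answer in zip(plan, answers) if trial.x_is == answer)

plan = make_abx_plan(16, start_s=42.0, seed=1)
```

Because every trial replays the same segment, the only thing that changes between button pushes is the gear, not the bar of music.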

One interesting report is here http://philsaudio.com/abx.htm regarding caps and noise.

BTW, I could not detect a difference in the pink noise with the two different types of wire. It wasn't worth even writing it down. In my recall, no one else in that testing tried pink noise.

We'll have to read deeper about Dr. Toole's testing.

Wm McD


We have two camps. In the one are technicians paying obeisance to the DBT and seeking to show the "sameness" of all things audio. In the other camp is the audiophile paying attention to the intuition of the senses and seeking to show the uniqueness of all things audio. The technician seeks to commodify, genericize and reduce to common denominators - "all amps sound the same". The intuitive seeks to expand, experiment, explore and use subjective art-related approaches - this heavy rock on my CDP improves focus. The technician sees subject separated from object, the intuitive sees subject and object as inseparable parts of a whole.

They end up in the same room (here) because they use the same tools, software and to some degree language, even though their objectives for the hobby are a world apart. And, when you go and listen to each of their systems in their homes, this whole difference in the two radically different audio cultures becomes immediately obvious. But here, over mere words and images, it's not that obvious.

Only two camps? This reminds me of the old joke "there are two kinds of people in the world, those that divide people into two different kinds and those who don't". There are people who discard some audiophile notions because their minds are open not closed. Personally I don't belong to either faith--that nothing makes a difference and everything makes a difference both seem overly dogmatic and based on faith and personal expectations for the hobby.

You imply that the adherents of those two faiths will have different-sounding hi-fis. I don't think so, and I don't see why they would; my experience is that both technical and non-technical people can own good-sounding hi-fis.


My problem with double blind testing of any component is that it's more of a test of human memory, which we know is very unreliable.

If you can't remember the difference for a few seconds, then what does it matter a few weeks later?


I wanted to illustrate blind testing with an example. Let's say we have a photograph of a very well hidden tiger in the brush. If we ask people what they see in the photograph, many might not see the tiger. On the other hand, if we ask, "Can you see the tiger in the photograph?", most people will know what to look for and find the tiger. Listening is the same way: if you help people locate what they are listening for, they can focus on the task and identify changes more accurately.

Now take a photograph with no tiger present and then ask people if they can see the tiger.....how do you account for the people claiming they see something that doesn't exist?


"You are switching successively between different times in the recording...the constantly shifting target..."

A real problem. I wonder if single tones might not work better. John Atkinson claims massed choral works do a good job of revealing differences.

Mark, you brought up the problem related to our short auditory memory, but the DBT process is supposed to help offset that. I think the reason it doesn't work is because listening to music has more to do with long term memory.


My problem with double blind testing of any component is that it's more of a test of human memory, which we know is very unreliable.

If you can't remember the difference for a few seconds, then what does it matter a few weeks later?

With that reasoning, we'd all be listening to a Bose Wave radio.

The following is from another forum.

This is always a swamp. It seems to me that double blind testing is often useful - but its conclusions are often overstated, and possibly even outright misunderstood, by its proponents.

For example, the situation Sean talks about is an obvious correct application of double blind testing. Sighted bias is very real - that's how people come to conclude that putting a device the size of a hockey puck on their wall has a major impact on the acoustics of the room. DBT eliminates this sighted bias in tests of preference amongst dissimilar loudspeakers - very good. But what do the results of that test prove? That A is better than B? Well, only for the conditions under test. A might sound better than B in this room, over this span of time, but it doesn't mean you'll necessarily like A after a few months in your own room, just as you might not like that TV with 9K color temperature as much as you did in the store. It's important not to overrepresent the significance of any test.

Where things get more controversial, for me, is when DBT is used to assert the audibility, or lack of same, of small physical differences. The benefit of DBT is that it's supposed to maximize the effect of short-term auditory memory, on the assumption that short-term memory is best for measuring the capacity of the low-level auditory mechanisms. And there's plenty of science to back up that assumption, using tests like, for instance, comparing two pitches, or the relative levels of two signals.

But listening to music has another, important dimension - long term memory, and long term learning. Music involves not just detecting sound, but objectifying sound: this kind of instrument, playing in that kind of space, in that kind of relation to other instruments, etc. The more experience we have in listening to particular kinds of music, particular sounds, in different kinds of contexts, the better we get at objectifying similarities and differences. If you're listening to your audio system in your room, that you've heard many times, with a recording you've heard many times, you're more likely to detect small variations from the "usual sound" than a person who's entirely unfamiliar with your system and recording. And if you're familiar with critical listening, you're more likely to detect differences in unfamiliar situations. It seems to me that the design of most DBTs minimizes the role of cognition, experience, learning, whatever you want to call it.

To Sean's example, let's say speaker A and speaker B are the same basic design, with minor modifications to the crossover. A group of consumers brought in for a DBT finds no statistically significant difference. But an experienced speaker designer insists there is a difference. If you give that designer a week to do sighted comparisons between A and B, with the system and room under test, using all the recordings he normally uses for evaluation, would you not expect at the end of that time for the designer to have a better chance of differentiating A from B in a DBT than the consumer group? And yet, the results of DBT tests are rarely characterized as "under the time frame of this test, with listeners of the experience level chosen, the test did not produce statistically significant results". The role of experience and learning is usually discounted, and the results of the test used to "prove" what credulous morons audiophiles all are.
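To make "statistically significant" concrete, here is a minimal sketch of the usual way a forced-choice blind test is scored, assuming each trial is an independent guess with a 50% chance of being right under the null hypothesis. The function name is made up for illustration, and the trial counts in the usage line are invented numbers, not results from any test mentioned here.

```python
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """One-sided probability of scoring at least `correct` out of `trials` by pure guessing."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Sum the binomial tail P(X >= correct) for a fair coin.
    favorable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favorable / 2 ** trials

# Hypothetical example: 12 correct answers out of 16 trials.
print(f"p = {abx_p_value(12, 16):.4f}")
```

A large p-value only means this listener, in this session, did not beat guessing; it does not by itself show that no audible difference exists, which is the distinction the paragraph above is drawing.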

Bob Stuart of Meridian, who has degrees in both audio engineering and psychoacoustics and has been as influential as any single person in the field of digital audio, has some interesting observations about this in a recent interview/lecture he gave to the UK audio engineering society. The MP3 audio recording of that interview can be found on this page:

http://www.aes.org/sections/uk/meetings/a0812.html

Starting at about 1:24:00 in, he talks about A/B testing and its limitations. There's also a fascinating section about digital audio, digital formats, and the design of Meridian's version of minimum-phase digital filtering, starting about 44 minutes in. It's not quite the same without seeing his slides, but still very interesting.


Suppose you get 100 subjects into a DBT and see if they can detect the difference between 435 Hz and 445 Hz. We all agree this can easily be measured, and thus we all agree they are different tones, right? Suppose 98 out of the 100 can't hear the difference. Now we tell the piano tuner he can be off by 2% tuning each key of a grand piano because "no one can tell the difference." He does, and someone sits down to play Clair De Lune. Can the 98 people hear the difference now?

People would probably not be able to tell the difference because the difference is only 10 Hz and most people can't hear below 20 Hz. [;)]

"Wagner's music is better than it sounds." - Mark Twain


If you want to see something really depressing (for me), check out the beating I took on the AVS forum this week.

.... They wanted a lot of measurements and DBT under my belt to support the subjective improvement I was claiming -- and I couldn't provide either. I didn't do well. ...

I read most of that thread and thought you did well. It seemed like some of the posters on that thread just wanted to get you to jump through some hoops for their own entertainment and they had no real interest in Klipsch speakers or improvements to the XOs. It means more to me to have an honest testimonial from someone who can give a before and after evaluation of an upgrade. Converting a room full of skeptics during a ten minute listening session says a lot more than a stack of charts to me.


With that reasoning, we'd all be listening to a Bose Wave radio.

You really think so? I know I wouldn't be.

I thought it was obvious that I was using an extreme example, as I know you and I have walked this path before in other threads. But, let me explain. Because human memory is unreliable for the purposes of DBT, the subtle nuances that differentiate good CD players from great CD players, or one amp from another, or one cable from another are lost in a DBT. Such small differences need to be heard over a long period, in one's own system, and with familiar music.

Because human memory is unreliable for the purposes of DBT, the subtle nuances that differentiate good CD players from great CD players, or one amp from another, or one cable from another are lost in a DBT. Such small differences need to be heard over a long period, in one's own system, and with familiar music.

When memory doesn't give pleasing results you assert memory is unreliable. But even differences supposedly identified through long term listening depend on memory----if you didn't remember how things sounded before this long term listening how could you identify a change? Are you asserting that memory is reliable when identifying changes over long periods of time but not over short ones?

I think that the longer the period of time the less reliable memory is and the more time one has to rationalize and obfuscate.

It's interesting that many of the changes people are unable to identify in DBTs are claimed to be dramatic and not subtle.


Hi Tom, nice to see you back.

"When memory doesn't give pleasing results you assert memory is unreliable."

The DBT process is based on the premise that auditory memory is unreliable, and was designed to minimize or mitigate the effects.

"But even differences supposedly identified through long term listening depend on memory----if you didn't remember how things sounded before this long term listening how could you identify a change? Are you asserting that memory is reliable when identifying changes over long periods of time but not over short ones?"

DBT is based on the assumption that it's short term memory at work during the process of identifying subtle but often significant differences. What if that's not completely true? Short term and long term memory don't use the same parts of the brain, and if identifying differences also involves long term memory, then the DBT process has a problem. It would certainly explain why people can't identify differences between things they know exist based on previous exposure to the products. We are "familiar" with things because we spent time with them and learned about them, and this information is pulled back out from the long term memory center of our brain when we need it. If I'm not around something for a while, I need some time to refamiliarize myself with it. DBT might actually "short circuit" our ability to recall information, since no real amount of time is given to learn and process.

An interesting DBT experiment would be to include a "learning period". For example, give someone a full day with two amplifiers and the music of their choice, and then do the DBT at the end of day.


Guys/Gals,

Would you go fishing without knowing what you're fishing for? I wouldn't.

Thanx, Russ

Quite frankly, this is one of the dumbest arguments I've ever heard regarding double blind testing, testing in general, or fishing. You need to get yourself enrolled in Logic 101 at your local Community College fast.

Double blind testing has nothing to do with fishing or knowing what kind of fish you're looking for before you go fishing. Now, IF you were trying to find out the accuracy of one's palate, say, for tasting and then possibly knowing what kind of fish you are eating ~ then double blind testing is certainly of some use in determining whether or not you can tell the difference, and possibly discern what kind of fish it is.

Quite frankly, EVERYONE I've ever encountered who is "against" double blind testing is also against having a reality check which might prove they are not in control of determining whether they are right or wrong. Why, who could argue against them? It is what they hear and no one else can hear it exactly the same so therefore no one could possibly prove otherwise! They are right. End of story.

The fact of the matter is there is ONE and only ONE way to determine the accuracy of a playback/recording system. And for the many years I've been hanging here, there are hardly a handful of us who have done the required exercise. If you haven't done that, then your opinion is just like, well, an ***h*le ~ everyone has one, BFD. If all you care about is what "sounds good" to you (which is valid, for you), then it doesn't matter and neither should any "test" of any kind. Just sit back, enjoy, and forget about tests, accuracy and high fidelity or what anybody else thinks. [|-)]


With that reasoning, we'd all be listening to a Bose Wave radio.

You really think so? I know I wouldn't be.

I thought it was obvious that I was using an extreme example, as I know you and I have walked this path before in other threads. But, let me explain. Because human memory is unreliable for the purposes of DBT, the subtle nuances that differentiate good CD players from great CD players, or one amp from another, or one cable from another are lost in a DBT. Such small differences need to be heard over a long period, in one's own system, and with familiar music.

What you're suggesting (and it's something I agree with) is that not all of the differences between pieces of gear are always audible. You have to have source material that brings out that difference...and sometimes it may only be a few notes in a piece that will demonstrate the difference. Heck, sometimes you gotta be in the right listening mood to hear it too.....most audiophiles get so uptight when doing blind listening because they're too worried about their ego to enjoy the music.

With that in mind, there is no time limit on a DBT. If you think you would have more success flipping between A and B once every week, then by all means conduct the test that way. Another thing you might try is find that short segment of a piece that brings out the difference and then use that in your 'quick' AB. This is why I keep suggesting to have someone change your gear on you without you knowing. If you can identify without your eyes that something changed, even if it took a few weeks, then you have demonstrated a perceptible sonic difference. And heck, we're only looking for a difference....identifying which is better is a whole different issue. The only way acoustic memory matters is in being able to identify the model #'s of the different gear.

So all that said, I don't agree with your extreme example....even in the most conservative extrapolation of it. If your eyes change your hearing perception, then you are simply not a refined listener.


The DBT process is based on the premise that auditory memory is unreliable, and was designed to minimize or mitigate the effects.

How does DBT mitigate the effects of auditory memory? All it does is require you to not know what you're hearing. That has nothing to do with auditory memory.

DBT is based on the assumption that it's short term memory at work during the process of identifying subtle but often significant differences.

There is no time limit to DBT. You can take however long you want. Heck, spend 3 months listening to A and then switch to B for another 3 months. It's still a DBT. You don't even need fancy equipment if you're going to be switching every 3 months.....just have someone change stuff on you without you knowing.

