This is a continuation of the post below at:
http://www.audioasylum.com/forums/prophead/messages/2190.html
DBTs, ABX and the Meaning of Life? Part 2
This is as much a continuation of Part 1 as it is a separate Part 2.
I covered some of the problems encountered in the real world amateur DBTs that have been conducted, but what about a well done DBT? How much resolving power can a good DBT reach?
There are several issues here that need to be addressed.
Obviously, the playback system needs to be capable of revealing any sonic differences that are being investigated. This does not necessarily mean that one should use the most expensive, or the most "SOTA" system available. Often, a system the listener/s are intimately familiar with can be more useful than one of theoretically greater resolving power. But even the familiar system needs to be inherently capable of revealing the subtleties of what is being searched for.
In lieu of achieving positive listening test results for the audio component being investigated, one can use other means to determine what the listening system CAN actually detect.
How would this be done? Well, we can use known thresholds of detection for repeatable and controllable items such as THD or amplitude changes. The strong suit of an ABX style test is detecting amplitude changes, and as reported at the ABX website, they were able to detect down to a 0.3 dB overall level change. If I am remembering correctly, jj has posted that he believes that with a really good test, one could get down to detecting an overall level change of about 0.1 dB.
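For a sense of scale, here is a minimal Python sketch of what those dB figures mean as a change in signal amplitude. It is plain arithmetic on the dB definition and is not taken from the ABX site or from any particular test procedure; the function names are made up for illustration.

```python
import math

def db_to_amplitude_ratio(delta_db):
    """Convert a level change in dB to a linear amplitude (voltage) ratio."""
    return 10.0 ** (delta_db / 20.0)

def amplitude_ratio_to_db(ratio):
    """Convert a linear amplitude (voltage) ratio back to dB."""
    return 20.0 * math.log10(ratio)

# Level changes mentioned in the text, largest to smallest.
for delta_db in (0.5, 0.3, 0.1):
    ratio = db_to_amplitude_ratio(delta_db)
    percent = (ratio - 1.0) * 100.0
    print(f"{delta_db:.1f} dB -> amplitude ratio {ratio:.4f} ({percent:.2f}% change)")
```

A 0.3 dB change works out to roughly a 3.5% change in amplitude, and 0.1 dB to a bit over 1%, which gives some idea of how closely levels have to be matched for a test to resolve differences that small.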
Or, you could determine what the sensitivity of the listening test set-up and the listeners was, to say, simple harmonic distortion. For instance, it is not too difficult to detect 2% 3rd harmonic distortion of a 1 kHz tone. The threshold is around 0.5% to 0.3%, depending on whose data you believe. So if a particular test came up empty during the audio component testing, and was determined to be unable to detect 1% 3rd HD at 1 kHz, and could only detect 0.5 dB overall level change, then there is some indication of the level of sensitivity of that particular test, and maybe it just wasn't sensitive enough to detect the more subtle aspects of music reproduction. The null results for the audio components may not have had very much meaning.
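One way to build such a check signal is sketched below in Python: a 1 kHz tone with a chosen percentage of 3rd harmonic added. This is only an illustration under stated assumptions. The sample rate, duration, peak level, and the in-phase relationship of the harmonic are all arbitrary choices made here, and the function names are hypothetical. Writing the samples out to a file and level matching the distorted and clean tones are left out.

```python
import math

def hd_percent_to_db(hd_percent):
    """Harmonic level relative to the fundamental, in dB (1% is -40 dB)."""
    return 20.0 * math.log10(hd_percent / 100.0)

def tone_with_third_harmonic(hd3_percent, freq_hz=1000.0, sample_rate=48000,
                             duration_s=1.0, peak=0.5):
    """Sine at freq_hz plus a 3rd harmonic at hd3_percent of its amplitude.

    Assumption: the harmonic is added in phase with the fundamental.
    """
    h3 = hd3_percent / 100.0
    scale = peak / (1.0 + h3)  # guarantees no sample exceeds `peak`
    n_samples = int(sample_rate * duration_s)
    samples = []
    for n in range(n_samples):
        phase = 2.0 * math.pi * freq_hz * n / sample_rate
        samples.append(scale * (math.sin(phase) + h3 * math.sin(3.0 * phase)))
    return samples

def component_amplitude(samples, freq_hz, sample_rate=48000):
    """Amplitude of the zero-phase sine component at freq_hz.

    Accurate when the signal contains a whole number of cycles of freq_hz.
    """
    total = sum(s * math.sin(2.0 * math.pi * freq_hz * n / sample_rate)
                for n, s in enumerate(samples))
    return 2.0 * total / len(samples)

# Sanity check: build a 1% 3rd harmonic tone and measure what was generated.
test_tone = tone_with_third_harmonic(1.0)
measured = component_amplitude(test_tone, 3000.0) / component_amplitude(test_tone, 1000.0)
print(f"measured 3rd harmonic: {measured * 100.0:.3f}% ({hd_percent_to_db(measured * 100.0):.1f} dB)")
```

Stepping hd3_percent down through 2, 1, 0.5 and 0.3 gives a series of tones that can be compared against the clean tone to see where a given system and listener stop hearing the difference.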
Note that if a test has been well designed and executed, and has none of the potential flaws or traps one can get into, so as to generate trivially false positives, then consistently achieving statistically significant positives for the audio component test is in and of itself, a form of validation of the sensitivity of the test. False positives can result from something as innocent as a very slight residual hum present only on one of the DUTs, and so, when asked to ID the X choice, the presence or absence of the hum would be a dead giveaway. Obviously, these kinds of problems would have to be addressed fully in order to conduct a valid test.
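As a reminder of what "statistically significant" means for an ABX score, the usual yardstick is the chance of doing at least that well by pure guessing. The Python sketch below computes that one-sided binomial probability. The function name is made up for illustration, and the 16-trial scores are only examples, not results from any test discussed here.

```python
from math import comb

def abx_p_value(correct, trials):
    """Chance of getting at least `correct` right out of `trials` by guessing.

    Assumes each trial is an independent 50/50 guess (one-sided binomial test).
    """
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    favorable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favorable / 2 ** trials

# Example scores for a hypothetical 16-trial session.
for score in (9, 12, 14):
    print(f"{score}/16 correct: p = {abx_p_value(score, 16):.4f}")
```

For 16 trials, 12 correct gives p of about 0.038, which is under the commonly used 0.05 criterion. None of this protects against the giveaways described above: a hum on one DUT will produce a very small p value too.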
Failure to properly level match could also create such a problem, and there are other similar kinds of problems that could crop up, and need to be addressed.
All of the above concerns can be dealt with and the problems eliminated, but there is a more fundamental problem with many listening tests. A forced choice type of listening test also has a dichotomy between a casual listening mode and an analytical listening mode.
During an ABX style test, where the listeners were exposed to sighted listening of the components, and then were switched to blind listening, this would have involved a very casual listening atmosphere initially. So when they listen initially, they are listening casually, and without any special focus, perhaps believing they are indeed hearing the nominal differences between components, whether they actually are or not. The fact that they think they hear such differences, or that they claim to hear them, is often used as if it was some sort of major proof that they were listening attentively, when the exact opposite is probably true.
As I have conducted my own such tests, I am familiar with the foibles and problems of such tests. People will NOT listen attentively or focus on the sounds within the music unless you force them to, or direct their attention specifically to do so. Especially people without any experience or training in participating in such types of testing.
Now, when the listening session went from sighted to blind, and the listeners were asked to make a forced choice, the listening mode switched from casual listening to an analytical mode. (Identification of X is a forced choice, unless you allow for the option of the listener declaring that he could not reliably detect any differences, or that what he heard was identical. These are two different things BTW.) In order to decide and MAKE A DECISION that X is A, or X is B, or even that A and B are the same/can not tell, one must become analytical. Instead of listening casually and in what some would refer to as an emotional listening mode (right brain activity), the listener is now asked to listen in analytical mode (left brain activity). The situation has changed, the listening mode is no longer the same, and all the problems and pressures of forced choices rear their ugly heads.
Oh, you can say that the listener can take all the time he wants, or that there is no pressure, etc. These are all empty claims if you haven't been in the hot seat. I have been there myself, and interviewed folks who have undergone forced choice listening tests of various types, and there is definitely a certain amount of stress and strain involved. If one is deliberately trying to maintain an analytical mind set, then the simple act of doing this
If the listener does take all the time he wants to, then he may be back in the emotional listening mode, and listening quite casually, but as soon as he needs to finally decide what X is or isn't, he enters back into the analytical mode. Even though there may be a long term exposure to the components using the more normal mode of listening, the act of making a choice (unless one truly does not care, and is merely guessing) MUST invoke a different frame of mind and a different mode of thinking.
Most people, in my experience, can not make this transition readily, whether back and forth, or just a single long term situation with a switch in modes at the very end, and maintain accuracy and consistency.
That is why I propose a more focused and attentive method of listening in my AES paper. I am trying to place the listener into an analytical mode for as much of the listening as possible, and then keep them there till they are asked to make the forced choice. I have found this to provide much more consistent results.
Long term listening is critical to the more subtle ear/brain training processes, and allows us to learn to hear the more subtle aspects even once we enter the analytical listening mode. Without the casual listening practice and exposure, we would find it much more difficult to hear such things analytically; in most cases, it becomes too difficult. Training the listener with training aids such as filtered music vs. unfiltered music, EQ'd music vs. flat response music, etc., will help overcome some of this, but not all of it.
Training is critical to achieving good results from a listening test, and this is where most of the old amateur ABX/DBT tests fell down miserably. The initial sighted portion of the test was all the 'training' that most of the listeners ever got. Even if such listeners were exposed to the forced choice of which unit X represents, it is very likely that it was only for a short while, and no attempt was made to go beyond a very cursory and simple familiarization process.
So with the majority of exposure to the listening situation having been via the initial sighted portion, the listener is thrown into the less familiar and different mind-set of analytical evaluation, rather than a more relaxed and casual listening for enjoyment scenario.
Ultimately, even with all the details of the test set-up and situation tended to properly, we are still left with this very fundamental problem of having to switch mind set during the listening tests.
Folks have used many different ways of objecting to this aspect, calling the listening test situation "unnatural" or "artificial" compared to listening to music for pleasure and aesthetic enjoyment.
Another aspect of controlled listening tests is the amount of stress and anxiety that the listening subject undergoes. There is no easy way to make such a test a relaxed and fun activity. You are demanding that people make decisions, requiring them to be analytical, and even with experience and training, the underlying stress is still there.
I would even say that if the listener was too relaxed and casual, that it would be very likely they just weren't focused enough to make any useful decisions about what unit X was. I think that this is one of the inherent problems with any forced choice listening test, and one that can not be wished away with admonitions to "just relax" or to "take your time".
Once more, I think that even if a particular test only provided a few threshold or JND type metrics, we still do not have a direct correlation between how sensitive the test would be for listening to a given component on music. If a given test fails to produce any statistically significant positive results, then it is still very hard to relate that to just what might or might not have been heard.
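One piece of this can at least be put into numbers: how often a short test comes up null even when the listener really is hearing something part of the time. The Python sketch below is a simple binomial model, not anything from the AES paper or from a specific test. The parameter p_true (the listener's actual chance of getting any one trial right) and the function names are assumptions made for illustration.

```python
from math import comb

def binom_tail(min_successes, trials, p):
    """P(X >= min_successes) for X ~ Binomial(trials, p)."""
    return sum(comb(trials, k) * p ** k * (1.0 - p) ** (trials - k)
               for k in range(min_successes, trials + 1))

def min_correct_for_significance(trials, alpha=0.05):
    """Smallest score whose guessing-only probability is at or below alpha.

    Returns None when even a perfect score cannot reach alpha.
    """
    for score in range(trials + 1):
        if binom_tail(score, trials, 0.5) <= alpha:
            return score
    return None

def chance_of_null_result(trials, p_true, alpha=0.05):
    """Chance the test fails to reach significance for a listener who is
    right on each trial with probability p_true (hypothetical parameter)."""
    needed = min_correct_for_significance(trials, alpha)
    if needed is None:
        return 1.0
    return 1.0 - binom_tail(needed, trials, p_true)

for p_true in (0.6, 0.7, 0.8, 0.9):
    print(f"p_true = {p_true:.1f}: null result {chance_of_null_result(16, p_true):.0%} of the time")
```

Under this model, a listener who is right 70% of the time still fails to reach the 0.05 criterion in about 55% of 16-trial sessions, since 12 correct are needed. A null from a short test therefore says little about whether a subtle difference was audible.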
What can a really good DBT resolve? In my own experience, it is capable of resolving audio cable sonic differences, as well as those of CDPs, power amps, preamps and DACs. But it is not that easy to do a really good test, not in terms of the set-up, or of the task the listening subjects are faced with. It is a LOT of work, and not a trivial task to be attempted on a whim, over the upcoming week-end.
In part 3, I will discuss and outline a simple set of recommendations for conducting a listening test of your own. Hopefully, we will avoid as many of the common problems as possible.
Jon Risch
Topic - DBT, Part 2 - Jon Risch 20:02:20 04/20/03