DBT, Part 3 (long)

This is a continuation of the post below at:
http://www.audioasylum.com/forums/prophead/messages/2190.html
and at:
http://www.audioasylum.com/forums/prophead/messages/2579.html

DBTs, ABX and the Meaning of Life? Part 3

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
How To Do An Audio Component Listening Test Comparison
By Jon Risch

There are several different basic concepts as to the best way to test audio components. One group insists that DBTs a la the classic ABX methods are the way to go. Almost diametrically opposed are those who think that merely sitting down and listening to just the component under consideration, without any comparison to anything else, is enough. Somewhere in between is the time honored audiophile "A vs. B" test, usually sighted (you know the identity of the unit you are listening to).

If you want to do some testing, there are several things that you might want to consider. If you want to make an A vs. B comparison, you would want to set the volume control, and leave it for the duration of the test, or at least for each pair or triad of A vs. B tests. This assumes that the units have the same gain or output level. With most interconnects, this is the case, and for similarly gauged speaker cables, also OK. If you are testing a component such as a power amp, and it does not have an input level control, then some method would have to be used to rapidly ascertain and set the levels so they would be matched between amp swaps. This is best done electrically, checking the signals at the speakers off of the speaker cable.

You would want to repeat the same musical passage, so that you are listening to the same thing, instead of letting the music play on while swapping.
Limit the length of the musical passage to around 30 to 45 seconds, then swap cables as fast as possible, being sure to turn off the amp (or mute it), or to mute the preamp or temporarily switch to another input while the cables get swapped. You can test listen cables between an amp and a preamp, or between a signal source (CDP) and the preamp/receiver. If the latter, then simply switch the input temporarily to another unused input, swap the cable, and switch the input back to the signal source, say a CD player, ALL WITHOUT TOUCHING THE VOLUME CONTROL. For almost all interconnects, this will assure that the level is the same on playback, and even for most speaker cable comparisons.
Other devices that have gain controls should have the level matched, and for most electronically based components, a sine wave at 1 kHz will do the job.
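
For anyone who wants to make their own level-matching tone, here is a minimal Python sketch that builds a 1 kHz sine wave and writes it to a WAV file using only the standard library. The function names, the -20 dBFS peak level, the 44.1 kHz sample rate and the output file name are all assumptions chosen for illustration; nothing above specifies them, so adjust to suit your own system.

```python
import math
import struct
import wave


def sine_samples(freq_hz=1000.0, seconds=10.0, sample_rate=44100, peak_dbfs=-20.0):
    """Return 16-bit integer samples of a sine wave at the given peak level."""
    amplitude = 10 ** (peak_dbfs / 20.0)  # dBFS -> linear fraction of full scale
    count = int(seconds * sample_rate)
    return [
        int(round(amplitude * 32767 * math.sin(2 * math.pi * freq_hz * n / sample_rate)))
        for n in range(count)
    ]


def write_stereo_wav(path, samples, sample_rate=44100):
    """Write the same samples to both channels of a 16-bit stereo WAV file."""
    frames = b"".join(struct.pack("<hh", s, s) for s in samples)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        wav.writeframes(frames)


if __name__ == "__main__":
    # Hypothetical file name; burn or play the result as your test tone.
    write_stereo_wav("tone_1khz.wav", sine_samples())
```

Play the tone through each unit in turn and read the AC voltage at the speaker terminals, as described above, rather than trusting the volume knob position.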

Listening to audio components for an A vs. B comparison is not exactly the same as listening to music for pleasure. When you listen to music for pleasure, you try to get into the music, and ignore the equipment, while when you want to critically evaluate the equipment via an A/B, you must go at the task a little differently. For these kinds of test situations, it is important to try and focus on just one aspect of a particular song at a time. For example, if a particular cut has a wonderful spread of cymbals in one portion, and the drummer is really laying into them, then it would be a good idea to try and use this cut, and that particular portion, to evaluate the high frequency reproduction by using the cymbals to focus on.

Even if you are familiar with the musical selection, try practicing listening to the cymbals, and JUST the cymbals. Are all the hits the same sounding? (Hopefully not). Which ones are different, and in what way?
Do some sound very brassy, and others more frazzled? Do any of them leap out at you, or grab your attention when they come along? Listening several times in a row to the same portion of the same cut to determine exactly what it is you are going to focus on, can be very instructive.

Now, for the actual A vs. B comparison: listen to unit A for the allotted time, swap cables as fast as you can (a helper here is going to make a lot of difference, as if you can get someone else to do this for you, while you wait patiently, it will help keep your memory of the first cable clearer) and then play through the B cable for the proper cut and portion. Then after the portion is done, swap again back to A, and listen again.
This guards against a well known phenomenon that causes us to focus a bit harder the second time we hear something. With the swap back to the A cable, we have a chance to reaffirm what we thought we heard as we swapped from A to B.
Again, for listening to components other than cables, use of an appropriate selector switch may be a valid way to go, as long as the different inputs are not electrically dissimilar or have a different gain.

Cymbals are not the only thing to latch onto. General percussion, a bass riff, a particular vocal: all of these will allow you to check and home in on a particular portion of the audio spectrum, and a particular aspect of the test component. The main thing is to try and focus in on JUST that one aspect at a time, each time only on that aspect, as you swap back and forth. With a little practice and concentration, it should be possible for you to develop the ability to hear and focus in on just the cymbals or bass or vocals ONLY, and clearly enough to make some judgements about what might have changed, if anything.

If you do not hear any differences that you feel are definitive, then it may be that you are just not used to hearing any extra information that may be present. Cables are passive devices; all they can do is lose information or signals, or distort the signals. A better cable is one that passes along the audio signal with a minimum of losses or alterations. Other components can also be problematic if you are not used to their level of performance.

Let's think about what a cable could be modeled as. All cables have a built-in signal filter in them, and let us suppose that they have a grundge and frazzle generator built in too. The better cables have this built-in filter set wider open in its response to the signal, and have a lower amount of grundge and frazzle added. If you are used to hearing a cable whose filter is real tight, with a smaller bandwidth, and which has a certain level of grundge and frazzle, you may not appreciate a cable with a more wide open sound, or with less grundge and frazzle, because you are not used to hearing your songs this way. The other sounds you are used to hearing are still present, and demanding your attention, so unless the previously unnoticed sounds are high enough in level to notice right away, or the grundge level has dropped enough to be immediately obvious, you may not be aware of the changes at first.
Similar concepts can apply to other audio components: they have bandwidth limitations, as well as added distortions and colorations.

After listening for several weeks, you may become more aware of that low level background vocal you never heard clearly before, or you may notice that the particular guitar riff had some slight string noise that you never noticed before. Once these kinds of things start happening, you might want to try that A vs. B
comparison again, and see if this time, after becoming used to a less restricting "filter", with less "grundge and frazzle generator" added, you might notice what your old cable or component was doing wrong a little more readily.

This is where the "listen for a long time before comparing components" advice comes from, and in some cases, it may be the only way to notice the differences.
On the other hand, the "grundge and frazzle generator" and the "filter" levels may have been so high on the old cable, that you might have noticed differences in clarity right away.

If all this seems like a lot of work, then don't even think about a DBT version of testing. By the same token, if you don't go at least this far, then don't be surprised if you find it hard to hear an instant difference. Sometimes, when you take the easy way out, the only way to hear things is to listen for a long time and let your ears become more experienced at hearing the wider open filters combined with the lower levels of grundge and frazzle.

This abridged version of a simple sighted listening method is not intended as a 100% iron-clad method or as a complete exposition of how to do it, merely as a beginning or starting point that is less fraught with pitfalls than some.

Note that by the simple expedient of using a helper to swap the cables, and having him ID the components by ONLY the designation "A" or "B", then this method of testing can become a simple blind listening test.

For more complete details, see AES preprint #3178, where I go through most of the whole process, complete with references. In this paper, I assume that the reader did have some basic knowledge of some of the popular versions of the ABX and DBT listening tests being conducted at the time.


Part 2, Listening Comparisons In More Detail.

Here is some information on how to listen during component listening comparisons, taken from my AES paper, preprint #3178, "A User Friendly Methodology for Subjective Listening tests". What earlier sections covered is summarized and condensed here:

Don't try to do too many listening trials at one sitting.

Listen to the same section of music, no longer than a minute, preferably about 45 seconds long, no shorter than 30 seconds. Repeat it exactly for each set of trials. When moving on to a new set of trials (ABA), use a fresh section of music, or rotate several different sections through so that you are not listening to the same musical segment over and over. It is too easy to become overly familiar and therefore bored with the selection if too many repeats occur in a short time frame.

CONCENTRATE while listening, as listening for comparison purposes is not the same as listening to music for pleasure. You must be in an analytical mode at all times. This will take some practice. Casual listening will not pick up on anything but major differences.

Use a set defined pattern of component switching. Instead of just switching back and forth between components being compared, listen to A, then B, then A. If doing a comparison that will not be reversible (or readily reversible), such as coating a CD with green ink around the edges, use an A, A, then B pattern, or in the case of the CD, listen to it twice, THEN coat it and listen again.

DO NOT keep switching back and forth, A, B, A, B, A, B, as this WILL lead to listening fatigue quite quickly. Listen just a few times VERY INTENTLY, and make them count. It is usually helpful to gently remind the listeners to FOCUS on the musical details before each trial begins, but after about the 10th time, they may want to smack you in the face!
Regardless, they DO need to focus and concentrate intently for the duration of each ABA or ABAX trial sequence. Use the suggested method for listening to specific musical details below.

If you wish to make this a forced choice blind test, have the forced choice X at the end of the chosen sequence, e.g., A, B, A, X or A, A, B, X.
Obviously, if you wish to perform the comparison blind, some assistance will be needed for swapping cables and keeping track of what unit is being presented.

Note that single blind conditions are met just by having the components being compared identified by only a label of "A" or "B". You (the listener) should not know at any given time, which actual unit is being called A, and which is called B.

Note that some warm up may be helpful if you are going to attempt the forced choice ABX type method. This could consist of just a few preliminary trial runs.

Folks who have never done this before should probably have their first set of trial runs thrown out, although there are a few who can pick up on it rather quickly.

For cable comparisons, do not use switchboxes, tape loops, etc., BUT swap the cables themselves at the components. For cables, or for components that have the same gain/output level, keep the volume control the same, and switch temporarily to a dead source to avoid any switching transients during the swap. For those that are not matched, set gain at 1 kHz, and match to within 0.1 dB, or approx. 1% in terms of voltage.
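
As a rough check on the 0.1 dB figure, here is a small Python sketch of the arithmetic, using the standard relation dB = 20 * log10(V1 / V2) for voltages. The function names and the example voltmeter readings are hypothetical, made up purely for illustration.

```python
import math


def level_difference_db(v_a, v_b):
    """Level of unit A relative to unit B in dB, from two RMS voltage readings."""
    return 20.0 * math.log10(v_a / v_b)


def db_to_voltage_ratio(db):
    """Voltage ratio corresponding to a level difference in dB."""
    return 10 ** (db / 20.0)


def is_matched(v_a, v_b, tolerance_db=0.1):
    """True if the two readings are within the matching tolerance."""
    return abs(level_difference_db(v_a, v_b)) <= tolerance_db


if __name__ == "__main__":
    # 0.1 dB corresponds to a voltage ratio of about 1.0116, i.e. roughly 1%.
    print(f"0.1 dB ratio: {db_to_voltage_ratio(0.1):.4f}")
    # Hypothetical readings at the speaker terminals with a 1 kHz tone.
    print(is_matched(2.830, 2.850))  # about 0.06 dB apart
    print(is_matched(2.830, 2.900))  # about 0.21 dB apart
```

In other words, two readings that differ by more than about one part in a hundred are outside the tolerance and the gain should be trimmed before listening.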

Finally, do not take notes or talk to others during the trials. Wait till after the whole session is done, and you have made your choices or written down your notes. Focus on the task at hand during the musical segment playback, and take mental notes of what is going on.

This may seem like a harder thing to do than to actually discern differences, but the entire test sequence does not take that long, and it will help you concentrate on the task at hand.

AES PAPER EXCERPT (mostly word for word as presented in the paper)
**************************************************************
4.4 What To Listen For During Musical Test Passages

4.4.1 General

Focus on specific musical events within the musical segment, such as a cymbal crash or a specific phrase in the vocals, etc. Listen for different aspects at different points within the music, but try to limit your selection of specific items to be listened for a second time around. Beginners should limit themselves to just one or two musical events to remember from test run to test run, until practice has improved their audio memory and concentration. With practice and experience, 3 or 4 musical events can be examined from run to run.

4.4.2 Suggested Specifics To Listen For

Low Level Detail - Listen for the presence of minute details that are on the verge of getting lost in the midst of the rest of the music, such as subtle string noises, hall ambience, the breathing of the musicians, or even air conditioning noise recorded along with the music. These low level details are some of the first musical specifics to suffer with less than top quality equipment.

Transient Impact - Listen to the transient events in the music. Do they have a razor sharp sense of impact? An 'over before they are started' kind of effect? Or are they smeared and drawn-out in time? Live musical transient events have virtually no smear or blurring.

Bass and Treble Quality - listen to the quality of the bass and treble, not just how much of it there is, but how clear are the notes and sounds? Solid, tight bass notes with distinct pitch definition are in contrast to loose or boomy bass notes hard to pin down in pitch. There may be an apparent extension of low frequency response, a sense of musical foundation provided that is absent from the other component. Is the treble region sweet, clean and clearly delineated, or is it hard, hashy and distorted? An apparent extension of high frequency response typically doesn't sound like more high frequencies, but as though the music had an airy quality to it.

Stereo Spatial Phenomena - many subjective listening tests will be performed in stereo (or perhaps more appropriate: 2 channel reproduction) and therefore will include some spatial or pseudo spatial information. Theoretically, if both channels are changed by a component in the same way, then very little effect should be noted on true stereo spatial information. However, because much of modern music has its 'stereo' generated artificially in the studio, based on some rather crude level, phase, and time manipulations, it takes rather less deviation from linearity to disturb something even as seemingly solid as the pseudo-monophonic image generated in such manner. Listen for image shifts of back-up vocals, and shifts in the apparent position of a specific instrument from component to component, as well as shifts in the overall soundstage character of the musical passage.

Overall Tonal Balance - This has traditionally been the most questionable of changes to listen for, as a 'simple' linear error, such as a minor frequency response difference, can be responsible for the difference heard. It is, however, the most powerful and useful change to listen for when there is every reason to believe that the frequency response has not been changed to a significant degree within the audio band. Evaluation of most IC chips (those suitable for audio use according to their specifications) would be a good example of this type of situation: substitution of one IC chip for another in an audio component, say the output stage of a CD player, normally will not alter the frequency response significantly. If a change in tonal balance occurs when switching the IC chips in and out of the unit, then it is very likely that the tonal change is due to some difference in the signal handling accuracy other than frequency response errors. Of course, the frequency response should be checked to verify that it does measure 'flat'.

Another example would be listening to interconnect cables. In most modern sound systems, the substitution of one grade or type of line level interconnect cable will not significantly affect the measured frequency response within the audio band, so that if tonal balance changes are heard, they are most likely to be due to some other factor associated with the cable itself, for instance, the dielectric effects of the insulators.
It must be pointed out that there are always 'special' cases where the above assumption will not hold, such as an IC chip not truly suited for audio use that actually does affect the audio band frequency response, or interconnects used with equipment with a very high output impedance (some tube gear) that will roll-off the high frequencies with certain high-capacitance cables.

******************************
END OF AES PAPER EXCERPT
Copyright Jon M. Risch, 1991 All rights reserved.


Conditions and procedures not mentioned specifically in my paper:

Double Blind equivalent conditions can be maintained by having the cable swapper behind a screen consisting of a very minimal lightweight framework with speaker grille cloth stretched over it, and hiding his actions at the system where he is swapping cables. No talking or communications are allowed, other than to identify A or B in the first part of the sequence, and this can be done via visual means by flipping an A or B card up into view on the top/side of the framework.

The cable swapper is providing an ABAX or AABX sequence, with the identity of unit X determined by coin tosses, and recorded on the spot by the swapper. The listener's responses as to which unit X was are also recorded, and then at the end of the run, the two records are compared and filed.
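
The bookkeeping for such a run can be sketched in a few lines of Python. Note that this swaps the physical coin toss for a software random choice, which is an assumption on top of the procedure described above, and the function names and the listener answers shown are hypothetical.

```python
import random


def make_schedule(num_trials, rng=None):
    """Pick the hidden identity of X for each trial, like a coin toss per trial."""
    rng = rng or random.Random()
    return [rng.choice("AB") for _ in range(num_trials)]


def presentation_order(x_identity, pattern="ABAX"):
    """Expand a pattern such as ABAX or AABX into the units actually played."""
    return [x_identity if step == "X" else step for step in pattern]


def score(schedule, responses):
    """Count how many listener responses match the swapper's record."""
    if len(schedule) != len(responses):
        raise ValueError("schedule and responses must have the same length")
    return sum(1 for actual, guess in zip(schedule, responses) if actual == guess)


if __name__ == "__main__":
    schedule = make_schedule(6)
    for number, x in enumerate(schedule, start=1):
        print(number, presentation_order(x))  # swapper's private run sheet
    responses = ["A", "B", "A", "A", "B", "B"]  # hypothetical listener answers
    print(score(schedule, responses), "of", len(schedule), "correct")
```

The run sheet stays with the swapper behind the screen; only the score comparison at the end brings the two records together.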

I now recommend a musical segment to be repeated of about 45 seconds to 1 minute in length, rather than 1 minute to 2 minutes.

Some other aspects of testing will need to be addressed. Some are covered in my paper, but I wanted to spell them out here:

1. Test to be done in stereo. This may seem obvious, but some try to use a stereo to test two components by swapping back and forth between channels, a method that is fraught with many problems.

2. No switchbox or use of tape loops etc, use cable swaps. I insist on this, and it has been shown over a long period of my testing to increase resolving power and to avoid unrealistic cable loading issues. If the cable is seeing a different electrical environment than what it would under normal use, this will obviously add confusion and extra variables to the listening situation. This includes tape loops, etc.

3. Training, using EQ inserted into system and then bypassed/not bypassed with just a tad of EQ dialed in, and use harmonic distortion CD-R's with referenced levels of various amounts of HD, also CD-R's with level changes on separate tracks.

4. Establish test subject sensitivity via harmonic distortion and amplitude changes, recorded on CD-R's.
This tests both the listener and the system, and provides at least a baseline of known sensitivity. To my knowledge, this has never been done for any of the anecdotal listening tests on audio cables or other audio components that came up empty, that is, resulted in a null.

5. Must establish suitably demanding musical source material, and burn a CD-R.

6. Some vinyl listening, not all CD. No MP3's or Mini-Disc's as source material.

7. Length of the musical passage to be repeated for each swap is now from a minimum of 30 seconds to a maximum of about 45 seconds. This helps assure that the listener stays focussed, by giving them a definite time limit and telling them that they must stay attentive and concentrate for that long. It is still very
hard work, and difficult to do well for very many trials.

Repeat that exact segment for each set of trials exactly. When moving on to a new set of trials (ABA), use a fresh section of music, or rotate several different sections through so that you are not listening to the same musical segment over and over and over and over.
While doing this on very short musical segments may be effective for codec testing, it has not been found to be that effective for cable-swap based regular component testing.

8. DO NOT keep switching back and forth, A, B, A, B, A, B, as this WILL lead to listening fatigue quite quickly. Listen just a few times VERY INTENTLY, and make them count. This is the reasoning behind the ABA sequence, or the ABAX sequence.

9. If you wish to make this a blind test, have the forced choice X at the end of the chosen sequence, e.g., A, B, A, X or A, A, B, X.

10. Have some able bodied assistants that are not participating in the test to make the cable swaps without turning the equipment off (if possible). For interconnects, this can be done by switching the source selector to an inactive source while swapping cables, and this also helps keep the volume set to exactly
the same place.
These folks work behind an acoustically transparent screen consisting of a 1X2 frame with black grille cloth stretched over it, to block the listener's view of what they are doing. If this is done, and the cable swappers do not talk to the listeners, then Double Blind equivalent conditions can be maintained.

11. Do not take notes or talk to others during the trials. Wait till after the whole session is done, and you have made your choices or written down your notes. Focus on the task at hand during the musical segment playback, and take mental notes of what is going on.

12. Do not try to do too many trials at a given sitting, that is, do not try to do 16 different ABA or ABAX trials at one sitting. 16 trials is the traditional number of trials for the ABX method, and the criterion was to get at least 12 correct.
Doing 16 trials at one sitting of intense concentration will almost inevitably lead to listening fatigue. This is, after all, 64 separate runs of the music using my method for an ABX type run (16 trials of A, B, A, X), and usually many more for the classic ABX type listening tests, where the listener was encouraged to swap back and forth as much as they wanted.
The reasoning behind these numbers was to achieve a p<0.05, or a probability of less than 5% that the results were due to sheer chance. While this is a widely accepted value to use for p, it is not the only one that can be used, and is not the only one that would provide for a practical amount of validity.

I found that much smaller numbers of trials that reach a significance level of about p = 0.1 are still quite useful. This leaves around a 10% chance that the results could have been due to sheer chance or guessing, and if the positive results can be repeated more than once in less than 10 sets of trials, then it is a very good indicator of valid results.
This level would be getting 5 out of 6 right, 7 out of 9, or 8 out of 11.

I would not try to go above 11 trials per sitting, as fatigue sets in too quickly.

Note that when 16 trials are used, the listener may have become fatigued half way through the run, and may just be generating random results for the last half of the test.
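
For those who want to check the trial counts mentioned in item 12, here is a short Python sketch of the underlying arithmetic. It assumes a pure guesser is right half the time on each trial, and computes the one-sided binomial chance of scoring at least that well by guessing. The function name is made up for illustration, and math.comb needs Python 3.8 or later.

```python
from math import comb


def p_value(correct, trials):
    """Chance of getting at least `correct` right out of `trials` by pure guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


if __name__ == "__main__":
    for correct, trials in [(12, 16), (5, 6), (7, 9), (8, 11)]:
        print(f"{correct} of {trials}: p = {p_value(correct, trials):.3f}")
    # 12 of 16: p = 0.038
    # 5 of 6: p = 0.109
    # 7 of 9: p = 0.090
    # 8 of 11: p = 0.113
```

So 12 of 16 comes in under 0.05, while the shorter runs all land close to the 0.1 level discussed above.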

Part 3, Analytical mode vs. Casual Mode

RE ABX group listening tests, specifically, the 'auditions' where the listeners were initially exposed to sighted listening of the components, and then were switched to blind listening, this would have involved a very casual listening atmosphere initially. There is no doubt that humans are predisposed to hear what they think they hear. I won't argue that point, it is incontrovertible.

So when they listen initially, they are listening casually, and without any special focus, perhaps believing they are indeed hearing the nominal differences between components, whether they actually are or not. The fact that they think they hear such differences, or that they claim to hear them is often used as if it was some sort of major proof that they were listening attentively, when the exact opposite is probably true.

As I have conducted my own listening tests, I am familiar with the foibles and problems of such tests. People will NOT listen attentively or focus on the sounds within the music unless you force them to, or direct their attention specifically to do so. Especially people without any experience or training in participating in such types of testing.

This means that even though the people listening to the components sighted THOUGHT they were clearly hearing component differences, they may not have been. There is no possible way to analyze or judge this initial sighted session, as there is no way to assure that the listeners actually heard what they think they
heard. As long as the first part of the 'audition' was sighted and untested, all that can be said is that the listeners seemed to think they were hearing the customary differences they were accustomed to hearing on their own systems, etc.

Now, when the listening session went from sighted to blind, and the listeners were asked to make a forced choice (Identification of X is a forced choice, unless you allow for the option of the listener declaring that he could not reliably detect any differences, or that what he heard was identical. These are two different things BTW), the listening mode switched from casual listening to an analytical
mode. In order to decide and MAKE A DECISION that X is A, or X is B, or even that A and B are the same/can not tell, one must become analytical.

So now, instead of listening casually and in what some would refer to as an emotional listening mode (popularly described as right brain activity), the listener is now asked to listen in analytical mode (popularly described as left brain activity). Whatever the listener might actually have heard, or just thought he heard, in the first sighted session will now have no bearing on what may or may not be heard in the blind session. The rules have changed, the listening is no longer the same, and all the problems and pressures of forced choices rear their ugly heads.

Oh, you can say that the listener can take all the time he wants, or that there is no pressure, etc. These are all empty claims without any backup. I have never seen any references provided for these claims, nor for the claims that rapid switching is inherently better than playing the same musical segment over when
switching to the other component.

If the listener does take all the time he wants to, then he may be back in the emotional listening mode, and listening quite casually, but as soon as he needs to finally decide what X is or isn't, he enters back into the analytical mode. Even though there may be a long-term exposure to the components using the more normal mode of listening, the act of making a choice (unless one truly does not care, and is merely guessing) MUST invoke a different frame of mind and a different mode of thinking.

Most people, in my experience, can not make this transition readily, whether back and forth, or just a single long-term situation with a switch in modes at the very end, and maintain consistency.

That is why I propose a more focused and attentive method of listening in my AES paper. I am trying to place the listener into an analytical mode for as much of the listening as possible, and then keep him there to make the forced choice. Much more consistent results are then obtained.

Long term listening is critical to the more subtle ear/brain training processes, and allows us to learn to hear the more subtle aspects even once we enter the analytical listening mode. Without the casual listening practice and exposure, we would find it much more difficult to hear such things analytically, and in most cases it becomes too difficult. Training with such aids as I mention will help overcome some of this, but not all of it.

Note that this is NOT meant to be an end all ultimate way to do these kinds of tests. It is presented merely as a starting point for a more controlled method than the typical A vs. B audiophile type listening tests, and you can make it as involved, or as casual as you want, with the understanding that it will have whatever inherent limitations there are for that level of involvement.

Jon Risch





Topic - DBT, Part 3 (long) - Jon Risch 20:20:39 04/20/03 (71)

