Audio Asylum Thread Printer: Get a view of an entire thread on one page
The idea is to prepare CD-R's with three tracks A, B, and X to see whether or not the effects of, for instance, different interconnects can be heard. Sending the CD's to the participants allows for long-term listening in their home system, free from the stress that might occur in a test that uses an unknown system, or when only a limited amount of time is available. I would, however, ask the participants to set a time limit of 2 months after receipt of the CD; that should be enough.
If you think that such a test is interesting to do, send me a mail. I would limit the test to 15 inmates. Should there be more than 15 inmates showing their interest, I will put the names into a hat and draw 15 names blindfolded.
I would also like to have a volunteer for the role of the "umpire". I would send the results of the test to him so that the outcome can be cross-checked. The identity of the umpire would not be revealed before the test is finished. Should there be more than 1 volunteer, I would again draw one name from the hat. The results would then be posted on the asylum, together with the monikers of the participants.
Any questions or suggestions?
Follow Ups:
Seriously. Whatever you end up showing will be staunchly refuted by the "losing" side. Remember, many folks who believe fervently in voodoo science are religious about it. Can't prove them wrong. The same goes for folks who can't/won't hear the differences between electronic components with similar specs.
If your test is for analog interconnects then I would propose that the test is unnecessary for 2 reasons. 1) There is an audible difference between most cables, but the effect is rather subtle compared to, say, changing the speakers or the preamp and amp. 2) The effect is unpredictable from system to system, and you are only recording what your system does with these cables, not what other people's systems will do with them. Whether other people's systems resolve these differences or not is irrelevant, because what is recorded still ONLY reflects what the cables did in YOUR system.

It is kind of funny to hear people on this asylum obsessing about some very expensive cables when all they really are is the hardcore enthusiast's TONE CONTROLS! Unpredictable ones at that. Yet these same people would rather die than use an EQ or room correction device. So instead they "voice" their systems with interconnects and speaker cables. This is equalization, and the hifi business has just gotten craftier about selling it to you and making even more money than a good EQ (Nordost Valhalla costs at least as much as a TACT!!). Personally, I find equalization to be a good thing, but why not use a real one (like the Z-systems RDP or the TACT RCS 2.2) and get MAJOR benefits, not this "Oh, I hear a little bit more air in the highs" or "This one is a bit darker" etc.

Count me out. Now, when you want to do a comparison of speakers or perhaps phono cartridges, I might be interested. These things truly affect the sound, and there is no easy way around them like there is with cables.
What you seem to be proposing is a VERY limited and low-reliability test scenario. A single cut on a CD-R, sent to only 15 people?
This would only provide a set of 15 single-trial test runs, which is not even enough for warm-up on a well-designed test.
Aside from the very low number of test trials (15), and the extremely small number of trials per subject (1), there are a large number of significant problems with the proposed method.
1. How will the source material be deemed to be suitable for showing up cable differences, especially between two particular cables, to a wide range of different people, all with different systems?
If you select the music to be used, and you master the CD's, how will anyone else know that the CD-R's would have the inherent capability to reveal any specific set of cable pairing differences?
Since you yourself have not been able to detect said differences, and have even gone so far as to insist that you can not hear absolute polarity, there is strong evidence that your capacity to distinguish such subtle sonic differences is not too good.
Hanging the test results on just one musical selection of your choosing, and depending on you for the final quality of the CD-R and its inherent resolving power, will not suffice. There MUST be independent confirmation of the CD-R's quality, and at the very least, that experienced and trained listeners can actually seem to hear a difference between A and B under sighted conditions on high-resolution systems or their own system.
I would suggest that any subject that was sent a CD-R, and could not initially tell a difference in sound between A and B, stop right there, and go no further, as there is, in essence, no initial "claim" to hear a difference in the first place. This could easily be due to one of several things, all unrelated to whether or not folks can actually hear the difference between cable A and cable B. It could be due to a lack of inherent resolving power of the subject's system, lack of sufficient recording quality, or a lack of experience on the part of the subject with listening tests in general, and specifically, an ABX scenario.
2. What kind of ABX/DBT training will the subjects be given? What instructions on how to listen and what to do?
Experienced DBT administrators will tell you that training is a large part of any blind listening test, and without it, the test results are very likely to end up a worthless null result.
Again, at the least, a complete and proper set of instructions would be needed, and they would have to be made available for public inspection and critique, or common mistakes might not be avoided, providing yet another opportunity for a completely worthless null.
3. One trial for each listening subject is an absurd procedure. Even if the very best compromise "universal" music were somehow determined, and each listening subject is an experienced listener, and has been exposed to ABX procedures, etc., having only one trial per person is a waste of their time, and of anyone who would care to follow the 'results' of the test.
I would say that each subject should get at least 9 different musical selections, and the A, B and X version of these tracks present, for a total of 27 individual tracks.
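As a rough illustration of why the trial count matters (a sketch using only the binomial distribution; the numbers are illustrative, not from the post): with one trial per subject, a pure guess is right half the time, while scoring 8 of 9 or better by guessing alone is already quite unlikely.

```python
from math import comb

def p_at_least(k, n, p=0.5):
    """Probability of at least k correct out of n forced-choice trials by guessing."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# One trial per subject: a lucky guess is right half the time.
print(round(p_at_least(1, 1), 3))    # 0.5
# Nine trials per subject: 8 or more correct by chance alone is rare.
print(round(p_at_least(8, 9), 4))    # 0.0195
```

This is why a single-trial-per-subject design can never separate a genuine listener from a coin flip, no matter how the results come out.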
I cover more of the details of what I feel are the best lengths for the tracks in posts I made in Prop Heads some time ago; see the URLs at the bottom of this post for details.
4. I would include as many subjects as possible (to the limits of the number of people volunteering), and only limit the numbers to below say 35 or 50 people, in order to get as much data as possible. I realize that this means that the "umpire" will now have to correlate 35 or 50 times 9 sets of data, BUT, if you get more than two volunteer umpires, split up how many folks they each have to process, and let them get together after they have documented their data in order to present it.
Unless you, and the subjects and umpires, are willing to do this kind of a test well enough to have any sort of significant scientific meaning, it just becomes another half-hearted null result with absolutely no meaning at all, and certainly not the meaning that some might tend toward if an overly simplistic test consisting of 15 single trials came up with null results and was promptly accepted as yet another 'scientific' negative result by you and others. If you are not going to do it well, don't bother to do it at all, as it is just misleading and bad science.
For more on DBT/ABX testing flaws, and common mistakes, see:
http://www.audioasylum.com/forums/prophead/messages/2190.html
and at:
http://www.audioasylum.com/forums/prophead/messages/2579.html
and
http://www.audioasylum.com/forums/prophead/messages/2580.html
The only problem with the test methodology is the use of different wires for the various CDR selections. In fact, all CDR selections should be recorded using the same wires, so the test will reveal how often typical golden ears imagine differences among wires!
The current test methodology would only be a faulty test of the CDR medium's ability to capture wire sound quality differences ... "faulty" because wire sound quality differences are only an unproven theory (Risch bait).
The CD will contain several sets of two-track samples where the participant has to decide if the source (or the cable for that matter) is the same or not.
You are really at a loss, aren't you Richard? Your bait stinks.
1. I would not record only one track in ABX, but more. I would use pink noise, single acoustical instruments, and complex passages from pop, jazz and classical, both from CD and LP. I would further use CD and MD as source to see whether or not the MD data reduction algorithm results in audible differences (for the MD and CD comparison, levels would be matched by adjusting the preamp inputs). I would try and use digital and analog interconnects. I would use full-data and reduced-data recording from my PC. I would put in some tracks (2 tracks only) with the question "same or different". I would, if my preamp allows (have to check with the designer), record with inverted polarity. I would try and get a high-end CD player (Wadia) to compare against a Sony walkman. The tracks will be laid out as usually found on CDs, with the normal access functions.
2. People claim to hear differences between cables no matter what system they are using. I can't see a reason why my ABX should fail. These same people are by no means experienced and trained listeners. On the contrary, the Joe Sixpacks of this world are perfectly capable of distinguishing two different cables, at least that's what is becoming apparent when you read this and other forums.
3. What would be the technical reasons for the incapability of the CDs to reveal sonic differences? If CD quality is really an issue I can use special audio CD-R's (like those from HHB). The recording will be done with my Tascam professional recorder using 1x speed.
4. There seems to exist the a priori assumption that cables make a difference, and that when A and B can not be distinguished, the test is flawed. You say that when the test subjects don't distinguish between A and B they should stop. This, however, could also mean that no difference between A and B exists!
5. The subjects don't need training; the "audiophilus communis" who hears differences between cables did not have any training either. Instructions will be very simple: you have A and B and X, tell me whether X is A or whether X is B! Take your time, relax, you have 2 months (or 3) to finish the test.
6. Since I'm paying for all this (CDs and shipping) I'm not going to send dozens of CDs around the world. 15 people checking out some 10 (or more) sets of tracks should be enough. The role of the umpire would merely be to assure that I can't mess around with the results. When the whole exercise is finished I will post the solution to the question "what is X", and the umpire can check.
7. My test does not use an ABX switchbox, it is perfectly double-blind, it does not put the subjects under undue stress.
[ 2. People claim to hear differences between cables no matter what system they are using. I can't see a reason why my ABX should fail. These same people are by no means experienced and trained listeners. On the contrary, the Joe Sixpacks of this world are perfectly capable of distinguishing two different cables, at least that's what is becoming apparent when you read this and other forums. ]

This still begs the question. If you can not possibly conceive of how your listening test can fail, then I suggest that you are not trying very hard to avoid the many known flaws that exist in past ABX and other types of DBT tests. Those who have never participated in a "formal" listening test have no experience with the test situation; they are used to listening casually, and without the benefit of training, or at least exposure to the ABX method, they will NOT be able to resolve the same kinds of differences they can normally hear under casual A-B comparisons.
I strongly suggest you read the previously referenced URLs provided in order to more fully understand the issue.
[ 3. What would be the technical reasons for the incapability of the CDs to reveal sonic differences ? If CD quality is really an issue I can use special audio CD-R's (like from HHB). The recording will be done with my Tascam professional recorder using 1x speed. ]
As a matter of course, every effort should be made to assure the highest quality level, as well as selection of suitable program material. I am NOT saying that the redbook CD format is not capable of allowing such discrimination, just that there ARE quality issues involved. The question still remains: how will you know that YOUR CD-R has the inherent capability to allow folks who actually can hear cable differences to detect them, if you yourself have not been able to do so? Who will check for this, other than sending out a potentially flawed CD-R? If the CD-R were somehow compromised, how would YOU know, and how would anyone else know until it was sent out and there were 15 nulls? (Which would still tell us absolutely nothing.)
As a concrete example, if the CD-R were recorded with lots of jitter, then the disc would probably be useless as a test signal source; the jitter artifacts would tend to cover up the cable sonics.
[ 4. There seems to exist the à-priori that cables make a difference and when A and B can not be distinguished, that the test is flawed. ]
And your point is?
I have conducted controlled listening tests, using this method:
AES preprint #3178, "A User Friendly Methodology for Subjective Listening Tests", presented at the 91st AES convention, October, 1991. I have achieved numerous strong positives, with a tremendous amount of attention paid to assuring that the test was conducted as cleanly and correctly as possible.
I _KNOW_ cables sound different.
Also see:
How to listen, excerpt from my AES paper:
http://www.AudioAsylum.com/audio/cables/messages/4321.html
I strongly suggest you read the previously referenced URLs provided in order to more fully understand the issue.
Or are you just trying to jerk people's chains?
[ You say that when the test subjects don't distinguish between A and B that they should stop. This, however, could also mean that no difference between A and B exists ! ]
It could mean this, but scientifically, it would NOT provide any real proof that this was the case, only that there could have been any one or more of a number of fatal flaws or problems with the test/procedure, etc.
Forcing people to continue, when they admit up front that they can not hear anything on the CD-R you provided, is pointless and futile.
It does NOT mean that they really can not hear any cable differences.
[ 5. The subjects don't need training, the "audiophilus communis" who hears differences between cables did not have any training either. Instructions will be very simple : you have A and B and X, tell me whether X is A or whether X is B !
Take your time, relax, you have 2 months (or 3) to finish the test. ]

Training is necessary, because listening for test purposes is different than listening for pleasure.
Perhaps they will stumble upon an appropriate method to listen, perhaps you will limit the track length adequately, perhaps the musical selections will be adequate, maybe, but why leave all that to chance and happenstance? YOU CAN NOT MAKE A FORCED CHOICE (DECIDING WHICH ONE X IS) WITHOUT MAKING A DECISION. This is NOT how we normally listen.
I strongly suggest you read the previously referenced URLs provided in order to more fully understand the issue.
[ 7. My test does not use an ABX switchbox, it is perfectly double-blind, it does not put the subjects under undue stress. ]
I strongly suggest you read the previously referenced URLs provided in order to more fully understand the issue.
Just because a test is double blind, does not automatically eliminate any other flaws or problems, it is merely one aspect of the test scenario.
As for the cost, that is not an excuse, as the cost to ship 15 CD's vs. 30 is not that much of a factor. Especially if you stay within the US mainland (or your local area) for the "extra" CD's.
Either you are serious about this, or not. As for the number of tracks, you did not say anything about that until your reply to me.
I would not scattershot the tests either; make ALL the tracks cable tests, and do not try to detect several other issues as well.
I would not use MD or MP3, why make this a test of compression algorithms, when the real issue is cables?
If you do not focus the test, you will have such a low number of trials for any given issue that any attempt to draw conclusions would be a mistake. Performing a lousy test in an attempt to validate your world view on cables (or polarity, or tweaks, etc.) will not further true science.
It might make you feel better, but will not resolve the issue any further in reality.
Jon Risch
1. Audiophiles hear differences, subtle or huge, between different cables (and other stuff) WITHOUT having been trained. Why on earth is training required for the ABX???
2. Audiophiles do decide whether or not they hear a difference when doing sighted A-B listening. They don't seem to be stressed by that situation. When it comes to making a decision in a blind test, all of a sudden they claim to be under stress. Strange indeed.
3. I'm shipping the CD from Europe, so costs will be higher.
4. Compression algorithms are said to have audible effects, so I thought we could also have a look into this. It's not only cables, I could as well use different brands of MDs (which are said by audio press to sound different), I could record onto CD-RW first, apply green-marker (or not) and make copies to CD-R.
5. Thank you, I don't need to "feel better". My view on audio is clear and unobstructed. But, instead of bringing lots of criticism forward, why don't you participate and show us
1. that the recording process and CD itself is of too low a quality to reveal "clearly audible differences"
or
2. that you are able to achieve a 100% score. This all reminds me of the reproaches expressed towards Stan Lipshitz "you don't believe but you don't want to listen yourself in order to see whether you're right or wrong".
You could also contribute to this test by telling me what music selection is deemed to be appropriate. I intend to use the EBU SQAM CD as one source; please look at the content at
http://www.ebu.ch/tech_t3253.pdf
and let us (me) know which tracks would be acceptable for this purpose.
A blind test increases stress because it forces you to listen strictly based on the sound, without the comfort of your preconceived ideas. I wonder how many of us would have a different system if it was selected in a blind test?
It is pretty clear now that you have no intention of actually trying to perform a high-resolution test. The simple fact that you still do not have a clue regarding training, why casual listening is not the same as forced-choice test listening, that stress and abnormal conditions occur during any forced-choice scenario, or that muddying the waters with more than one test scenario per disc is not a good idea, is the proof of this.
I do not need to participate, I have my years of study, hundreds of controlled listening tests, and years of experience with said tests to go on. I have no desire to provide any sort of legitimacy to your non-test test, by participating. The fact that you are ignoring known and proven problems with DBTs is enough for me to give up now.
I recommend that no one waste their time participating at all.
Magnetar
Jon, I think you spent too much time on this thread. This type of testing can only be statistically valid if it is properly experimentally designed. Experimental design is only one of many subjects in statistics, and is far beyond what ABX can do.
Paul Lam
P.L.C.Lam Consulting Inc.
I understand that you would be sending out one CD with three tracks recorded using three different interconnects. Sounds like fun. Count me in.
Sincerely,
Bob Samuelson
"Sounds like fun" is exactly the right idea here. Too many people are over-analyzing everything. Let's just try it and have fun. Let Klaus do his thing.
This approach to a test will only work if certain requirements are met. First, the recording process must not eliminate significant differences in the results with each cable. Take volume/level for example. If one IC has significantly different resistance to another IC, one would expect to hear a difference in volume when one made the substitution. Using the ICs as part of the recording chain, the level going to the recorder should be higher with one than the other. If you adjust the recording level to match input levels for both ICs, then you have removed part of the difference, just as you would remove part of the difference if you adjusted the listening level in the first instance so that the listener heard matched levels.
How do you propose to make the recording in a way that ensures that no significant differences are lost? You can't match recording levels because the signal level coming through the IC is one of the things that could distinguish it from other ICs. You also need to ensure that there's nothing in your recorder that accentuates the characteristics of one IC while diminishing those of the other.
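On the level point: the size of the level shift a cable's series resistance can cause is easy to estimate with a simple voltage-divider model. This is a sketch with hypothetical impedance figures (typical line-level source and load values chosen for illustration, not anything stated in the thread):

```python
import math

def level_change_db(r_source, r_cable, r_load):
    """Attenuation in dB of a source output impedance plus a cable's
    series resistance driving a load input impedance (voltage divider)."""
    gain = r_load / (r_source + r_cable + r_load)
    return 20 * math.log10(gain)

# Hypothetical figures: 100-ohm source, 47k-ohm load,
# two interconnects of 0.1 ohm vs 1.0 ohm series resistance.
a = level_change_db(100, 0.1, 47000)
b = level_change_db(100, 1.0, 47000)
print(a - b)  # level difference between the two cables, in dB
```

On these assumed figures the difference is a tiny fraction of a dB, well below the roughly 0.1 dB matching usually demanded in level-matched listening tests, so resistance alone is unlikely to dominate; other cable parameters (capacitance into a high source impedance, for instance) would need their own model.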
Next, on the test design itself. Why ABX unless you're making a different CD for each individual? My understanding of ABX is that the X means that on some changes, no substitution is made, so sometimes the listener gets two As in a row and sometimes two Bs. Different listeners actually get presented with a different play order and the administrator has no idea of what the play order presented to the listener is. With 3 tracks you're only going to get 2 identical tracks in a row on one occasion, a very different situation to a proper ABX test where there are many different presentations with several duplications along the way.
Are you going to create different CDs for each listener, with a different track order and duplication, so one person gets AAB, one ABA, another BAA, another ABB, another BBA, and yet another BAB in order to cover all 6 possibilities of play order with only 3 tracks? That's the sort of detail in the test process that is needed to guarantee that the presentations are random enough to ensure that the play order isn't influencing outcomes. Of course you'd probably also need to up the number of participants so a reasonable number of people got each presentation.
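A quick sanity check on that list of play orders: enumerating the distinct orderings of three tracks where one cable appears twice does yield exactly the six possibilities named above. A small illustrative sketch:

```python
from itertools import permutations

# All distinct orderings of two A tracks and one B, plus two B tracks and one A.
orders = sorted({''.join(p) for p in permutations('AAB')} |
                {''.join(p) for p in permutations('ABB')})
print(orders)  # ['AAB', 'ABA', 'ABB', 'BAA', 'BAB', 'BBA']
```

Splitting participants evenly across these six discs would be the minimum needed to balance play-order effects.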
That then means that you need to keep a list of who gets which disc so that the results can be properly compiled. If you receive the results directly and pass them on to the umpire, there's no guarantee that things don't get altered in the process. You need a way of ensuring that you, as administrator, don't influence the result. You could do that by having people send results directly to the 'umpire' who compiles and analyses them, or ask people to send results to both you and the 'umpire' who both separately compile and analyse and either present your results individually or in a joint presentation. Given the nature of the net as a posting method, I'd prefer to see individual reporting by both you and the 'umpire'. Either way, the umpire needs to have their own copy of the list of who gets which CD.
If you aren't prepared to go to that sort of length, then why not a simple AB test with half the participants getting a disc with an AB play order and the other half getting a disc with a BA play order?
I'm interested, but only if I'm satisfied that the test methodology is reasonable. On the basis of your 3 track ABX suggestion as presented in your post, I have grave doubts that will be the case. I think you need to do a lot more work on your test design first, and be prepared to give a detailed account of what the test design is when you ask for volunteers. Since you're the one who regards blind testing as so much more reliable, you'll have to excuse me for demanding the utmost stringency in test design and procedure. This is only worth doing if it's worth doing extremely well.
1. I would not adjust the recording levels for the interconnect tracks; I would, however, adjust level when using CD and MD as source (pink noise from EBU SQAM).
2. If my recorder accentuates one cable with respect to the other, this would be beyond my knowledge and beyond my control. If this is technically possible, I would like to have a technical explanation.
3. I would record several sets of tracks, not only one.
4. All CD's will be identical. The participants will, however, not know who else is participating (unless, obviously, they ask all of the inmates who declared their interest in the test). This ensures that no participant can communicate with the others during the test.
5. I will ask the participants to communicate their results to myself and the umpire. The identity of the umpire, however, will have to be kept secret until everybody has finished. Also, the umpire will not know who is participating, so as to avoid possible interfering communication.
6. For possible statistical analysis of the results I would need help of an expert inmate.
Proper tests use many more than 3 presentations. The reason for that is that the listener is presented with 4 different changes at various stages:
A followed by B
A followed by A
B followed by A
B followed by B
What this does is allow for the fact that the order of presentation may make a difference, so it may be easier or harder to detect a change when B follows A than it is when A follows B. It also gives instances when there is no change, with both of the 2 options available. Finally, each of those 4 presentations is presented several times because, if the differences are subtle and people really are discriminating something that is close to the audibility threshold for the difference, results can be variable.
Even using a variety of different material, presenting each sample only 3 times with one sample repeated twice simply does not stack up to professional test standards. Only 2 transitions are represented - track 1 to track 2, and track 2 to track 3, one of which is one of the two changes possible and the other of which is one of the two non-changes possible. By not presenting the full range of presentations possible, you degrade the validity of the test and by only presenting each of your 2 transitions once, you degrade the validity again in a different way. This is simply not a good test approach.
You are also going to have problems with selection bias on this test, because you are asking for volunteers, and a small number at that. You have said in the past that tweakers have a tendency to hear what they want to hear, but that tendency is not unique to tweakers. People who don't believe there are differences are equally prone to hearing what they want to hear, ie no difference in their case. You have no way of ensuring that your sample is unbiased, and a sample size of 15 is simply too small, especially given the fact that you are excessively reducing the transitions presented. Having a strong proportion of people who believe that there are no audible changes, and who simply report 'no difference' to each change, would distort the test result excessively.
Finally, what do you think you are going to prove by this? You are not going to prove anything about whether or not there is an audible difference. There are simply too few presentations and too few subjects to guarantee a result. In fact, it is probably impossible to show that a difference can be heard with this test design. The difference would need to be night and day to get results that would satisfy statistical requirements. The harder the difference is to show, the larger the test sample needs to be and the more presentations are required. You need really big tests if really small differences are to be demonstrated.
Bear in mind that for something to be audible, all that is really required is for 1 person to be able to hear it reliably, ie to accurately say whether or not there is a difference every time they are presented with a transition. Why then do we need studies of the sort under discussion? Simply because no single individual is that reliable in relation to differences that are close to the limits of perception, so it becomes more a question of 'can we hear it more often than not' rather than 'can we hear it every time' or 'does everybody hear it', and that is a very different sort of thing to telling the difference between red and green, where you know there is a problem if the person doesn't get it right.

There is a 'grey area' between the clear-cut differences which everyone hears under normal circumstances unless they have a hearing impairment, and the opposite situation where the difference between the two things is genuinely so small that absolutely no-one can ever hear it. Not everyone has the same level of hearing acuity, so as genuine differences become more subtle, fewer and fewer people hear them, until eventually no one can hear anything. The closer you get to the point where no one can hear it, the bigger the test needs to be if you really want to stand a chance of demonstrating that some people can hear it.

If cables really do make a difference, then that difference is in the 'grey area', because not everyone hears it, and not all of those who do hear it hear it with every cable change, so your small study simply can't measure up to the standards required. If it can't possibly show a difference exists because the sample size isn't large enough for validity, then the test is simply useless.
Of course, if you only want to prove that most people can't hear the difference, it's a lot easier but you can't use a test which does that as a basis for claiming that no one can hear it.
Your test as described is simply too limited and your sample size far too small to be capable of demonstrating that people can hear anything other than a 'night and day' sort of difference. Don't believe me on face value on this - go and talk to a statistician or someone involved in hearing research and tell them you want to test for something where there is genuine disagreement about whether or not there is an audible difference. They will tell you the same thing, but at least they will be your chosen expert so you should be more prepared to trust them rather than me on this.
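The sample-size point can be made concrete with a crude power calculation for a one-sided binomial test. This is a sketch; `trials_needed` and the 5% significance / 80% power thresholds are my illustrative choices, not anything proposed in the thread:

```python
from math import comb

def p_at_least(k, n, p):
    """Probability of k or more successes in n trials, per-trial probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def trials_needed(p_true, alpha=0.05, power=0.8, n_max=2000):
    """Smallest trial count at which a one-sided binomial test at level alpha
    detects a listener with true hit rate p_true with the requested power."""
    for n in range(5, n_max):
        # Critical score: smallest k whose chance-level (p=0.5) tail is <= alpha.
        k = next(k for k in range(n + 1) if p_at_least(k, n, 0.5) <= alpha)
        if p_at_least(k, n, p_true) >= power:
            return n
    return None

print(trials_needed(0.9))  # obvious difference: a handful of trials suffices
print(trials_needed(0.6))  # subtle, near-threshold difference: far more trials
```

The exact numbers wobble because the test is discrete, but the scaling is the point: roughly, halving the detectable effect quadruples the trials required, which is why near-threshold differences demand far bigger tests than 'night and day' ones.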
1. I agree that for the A-B samples both orders of presentation are necessary, A-B and B-A. The participants are perfectly able to repeat the sets of samples as often as they wish.
2. I was not aware that the order of presentation A-B-A was flawed in itself. At least, that argument has never, to my knowledge, been presented when discussing that method.
3. No listening test can be sure to have a good mix of believers and non-believers, unless you select the participants using that parameter. In this particular test, I could ask the ones who are going to participate whether they consider themselves believers or not.
4. I know that such a test will not prove anything, but it is able to add evidence to the discussion. If there are valid technical reasons that do show that this test is not reliable, and I'm still waiting for such reasons, then we'd better stop.
5. Of course I'm not in a position to verify if the participants are able to obtain consistent results, which would qualify them as test subjects. But the common audiophile is not verifying his consistency either when judging audio gear.
6. I perfectly know that the sample size might be too small to give reliable results. One more reason for you to participate :-)
1. I agree that for the A-B samples both two orders of presentation are necessary, A-B and B-A. The participants are perfectly able to repeat the sets of samples as often as they wish.
You miss the point. In a proper blind test, neither the listener nor the administrator knows the order of what is being presented. The use of the X in ABX - the random insertion of no-change transitions - jumbles the order, and neither the listener nor the administrator knows after the first presentation - the A presentation - whether they are listening to A or B. You're suggesting introducing the ability for the listener to know whether they're listening to track 1, 2 or 3 and to compare them at will - a totally different set of conditions to a blind test.
2. I was not aware that the order of presentation A-B-A was flawed in itself. At least, that argument has never, to my knowledge, been presented when discussing that method.
Once again you miss the point. First, are you saying that you are just going to present the 2 options in the order ABA? That is not what ABX does, and you're calling this an ABX test, which it simply isn't. Secondly, if people know the order, it isn't blind. The idea is that the subject doesn't know what the transition is from or to. That includes being able to identify which are the different tracks. Using an ABA order and stating it as you have just done turns this into a totally sighted test and also never presents the subjects with a no-change transition, an integral part of the real testing process.
3. No listening test can be sure to have a good mix of believers and non-believers, unless you select the participants using that parameter. In this particular test, I could ask the ones who are going to participate whether they consider themselves believers or not.
Wrong. University tests often use first year students and participation is compulsory. You get large numbers with effectively random choice - everyone taking a first year psych course - which means you may get some audiophiles of both persuasions and a whole lot of people with no interest in the audio as distinct from music and who have no idea what they are listening to. In fact, if the subjects aren't told what is being changed, even the subjects interested in audio have no idea whether the change is in a component, a cable, or even different recordings of the same signal with some sort of frequency altering going on. It's not always easy to avoid problems of subject bias, but there are ways and bigger samples always help.
4. I know that such a test will not prove anything, but it is able to add evidence to the discussion. If there are valid technical reasons that do show that this test is not reliable, and I'm still waiting for such reasons, then we'd better stop.
If it simply isn't capable of proving anything, it can't add evidence to the discussion. If the test isn't good enough to have the capacity to resolve a difference if one really does exist, it can't generate any evidence whatsoever. It's like giving a person a set of binoculars with the lenses painted black and asking them to describe a distant object by observing it through the binoculars. The fact that the person can't see anything doesn't prove a thing about their vision or the ability of binoculars to assist in viewing far objects. The test instrument has to be good enough to capture reliable data if the results are to provide any sort of evidence at all.
5. Of course I'm not in a position to verify if the participants are able to obtain consistent results, which would qualify them as test subjects. But the common audiophile is not verifying his consistency either when judging audio gear.
Consistency isn't a requirement for a test subject. At close to the threshold of audibility, no person - audiophile or otherwise - is consistent anyway. And consistency isn't necessary in most cases in judging gear. The audiophile simply has to reach a decision that satisfies him/her by whatever means they choose. After all, equipment choice and taste in sound are personal preferences, and are subject to change over time.
6. I know perfectly well that the sample size might be too small to give reliable results. One more reason for you to participate :-)
If the sample size is too small to give reliable results, it doesn't matter who participates. The results will always be unreliable because the sample size simply isn't up to demonstrating what you want.

When you say that you know the test won't prove anything, and that the sample size may be too small to give reliable results, you are admitting that you can't draw any conclusions from the results at all. How can you if you know it won't prove anything? It's only worthwhile doing if it is capable of proving something. A single positive test is never sufficient for proof on its own - it needs to be replicated, possibly a few times, before the findings are accepted, but no one is even interested in trying to replicate a test that is incapable of proving anything. The results of such a test are simply meaningless.
I would be quite happy to participate in a meaningful test, but I'm not happy to participate in a meaningless test. I also think that it is quite improper for you to seek volunteers for such a test when you know that the procedure is flawed, and to say that you will present the results, because you are going to present something that is quite meaningless in a way that is misleading. That isn't genuine research and it isn't an honest approach. I would say it was a misguided approach if you weren't aware of the failings in the process, but based on some of the statements in your reply you do know enough to realise that there are significant problems with your approach, so to continue with it really is dishonest in my view.
David, even when large-scale ABX tests are conducted with sufficient sample size and appropriate statistical evaluation, you will find people like Robert Harley and others who will tell you that and why this kind of test is inherently flawed. Why on earth should I put tremendous effort into this test just to hear that it's useless anyway? On my scale the effort is limited. If you, Jon and others think that it's useless, fine. I don't share your view.
Btw, I'm still waiting for someone to present technical, not methodological, reasons that would speak against the test.
If a test isn't valid, then it's useless regardless of whether the reason is a technical reason or a methodological reason. Conducting a flawed test, regardless of the nature of the flaw, and publishing the results as if they mean something when they can't is simply dishonest, as I have said. If you want to conduct a methodologically flawed test, go right ahead, but you can't then honestly claim that the results mean anything at all. You can't use them to support your current view on the topic, and you can't use them as a reason for giving up your current view on the topic. There really is no reason to do the test unless it is both methodologically AND technically valid.
There really is nothing more to say. If that point means nothing to you, then you may as well believe whatever you like and forget about tests and any form of evidence whatsoever. Conducting a technically correct but methodologically flawed test isn't bad science - it just isn't science at all.
[ The use of the X in ABX - the random insertion of no-change transitions, jumbles the order and neither the listener nor the administrator knows after the first presentation - the A presentation - whether they are listening to A or B. You're suggesting introducing the ability for the listener to be able to know whether they're listening to track 1, 2 or 3 and to compare them at will - a totally different set of conditions to a blind test. ]
Forgive me if I am laboring under a misconception, but from this and your original post, I get the impression that you are confused about what an ABX type test is.

In so far as audio is concerned, I take it to mean that the listener (and administrator) know which unit A is and which unit B is, but X is presented as an unknown, and the listener is asked to make a forced choice as to whether or not they feel it is A or B.
The listener can switch back and forth between A and B as many times as they like, and in some instances, listen to X as many times as they like (the case here with CD tracks), and then make their decision.
This is the classic ABX test as proposed by Clark et al, and what most folks are referring to when they speak of an ABX type test.
See:
http://www.provide.net/~djcarlst/abx_new.htm
particularly
http://www.provide.net/~djcarlst/abx_p9.htm
A general description of the ABX test procedure when using the ABX switchbox is at:
http://www.bostonaudiosociety.org/bas_speaker/abx_testing.htm

My comments on this type of testing, and other amateur DBT tests, are at:
http://www.audioasylum.com/forums/prophead/messages/2190.html
and at:
http://www.audioasylum.com/forums/prophead/messages/2579.html
and
http://www.audioasylum.com/forums/prophead/messages/2580.html
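For readers unfamiliar with the procedure described in those links, the bookkeeping of a Clark-style ABX run can be sketched in a few lines. The function name and the simulated guessing listener are illustrative only, not taken from the ABX references above:

```python
# Minimal sketch of Clark-style ABX bookkeeping (illustrative only).
# The listener always knows which unit is A and which is B; only the
# identity of X is hidden, and it is redrawn at random every trial.
import random

def run_abx(trials, answer_fn, seed=None):
    """answer_fn(trial_index) must return 'A' or 'B' as the guess for X."""
    rng = random.Random(seed)
    correct = 0
    for t in range(trials):
        x = rng.choice("AB")            # hidden assignment for this trial
        correct += (answer_fn(t) == x)  # forced choice, scored blind
    return correct

# A listener with no ability, guessing at random, scores about half:
score = run_abx(16, lambda t: random.choice("AB"))
print(score, "of 16 correct")
```

Scoring significance on the resulting count is then the usual binomial question; the key property preserved here is that X's identity is drawn fresh and kept hidden on every trial.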
That does mean he could get away with 3 tracks as he originally suggested, but not his "ABA" giveaway in his reply to my initial criticisms.

Most of my knowledge of testing is on health issues, especially epidemiological studies, which are a different kettle of fish, though the research I did was not epidemiological and much simpler. I got enough experience to know I don't have a lot of experience, but at the same time I am able to do some basic assessment of published data in my field and to appreciate the damage that sloppy tests which get wide publicity are capable of doing.
A good test is worth every bit of effort it takes, even more than it takes, and it takes a lot. A bad test just keeps on causing problems for years.
You said in your point 4: "If there are valid technical reasons that do show that this test is not reliable, and I'm still waiting for such reasons, then we'd better stop."

I gave you valid reasons and so did Jon Risch. I also suggested that you discuss my reasons with a competent statistician or researcher, and you obviously haven't. It appears that you regard reasons that you don't like as not being valid reasons. I can understand your interest in trying to do some sort of test in this area, but that eagerness doesn't justify conducting an invalid test and then publishing the results as you have promised to do.
Jon Risch is a qualified electrical engineer who has presented engineering papers and is familiar with the requirements for testing.
I have a postgraduate degree in health and safety and was required to conduct research as part of the requirements for that degree. I had that research published in a peer reviewed journal and presented it at an optometrical science conference. I do not regard myself as a qualified researcher - I don't have a sufficiently strong statistical background nor am I sufficiently expert in test design. I had to submit a design for my research which was critiqued, and I was subject to supervision throughout the process. I have not conducted research independently and would not regard myself as qualified to do so. That doesn't mean that I know nothing about what constitutes reliable test data.
If you have 2 people with more experience than yourself telling you that your test design is fatally flawed, you need to seriously consider that advice. If you aren't experienced in test design and conduct, you should discuss the criticisms you have been given with someone who is.
This is not a matter of 2 people with different views to you saying that you are wrong in what you believe about what people hear. This is a matter of people telling you that you cannot gather reliable test support one way or another using your process. I think I can speak for Jon as well as myself when I say that both of us would welcome good, solid test results no matter what the outcome of the test turned out to be. Reliable results help audiophiles make better choices and help designers and manufacturers make gear better in the future, no matter what the results are. I think Jon would welcome good, reliable research and I know I would.
What we are saying is that your approach isn't good, reliable research. What I will personally add to that is that the outcomes of fatally flawed research do more harm than good. They hang around and are quoted without understanding, and are often accepted as fact when they aren't. They get in the way and muddy discussions and make it much harder for interested people to accept reliable results at a later date if those results are at variance with the flawed ones.
I don't believe you want to contribute to more confusion on this topic, but that is exactly what you will do if you use a flawed process and publish the results here or elsewhere. If you are going to publish results, you simply have a major obligation to ensure that those results are genuinely meaningful. You can't fulfil that obligation if you don't understand the requirements of a genuinely meaningful test.
Once again, I can only repeat: Get advice from someone sufficiently qualified in statistics and research on how to construct this test before you start. Then conduct it in a way that will generate meaningful data, and that includes getting a large enough group of subjects for your sample. If you're not prepared to do that, simply don't try the research. The results will have no value whatsoever and may do far more harm than good.
Jon Risch
I'd be very interested. I take it the interconnects under test are in the recording chain to the CD-R. It will be interesting to see if the Redbook medium is capable of resolving such differences. (I've also heard that 4x speed yields the best sound.)

One variable that should be kept constant is the track access (pause and skip functions). This could mess up the test if the tracks are accessed in different ways.
do a perfect test?

Waste of people's time IMHO.
Magnetar
I would be up for this; then I could do my own double ABX by listening with different IC's of my own.......
How exactly could you compare something like the effects of analogue interconnects using this methodology?

When recording CDRs the only cable normally involved is the digital IC, so you'd have to make some unorthodox configuration to add the effects of any analogue ICs, and even if you did this, the advantages of some cables might lie in their interaction when driving the speakers, which the configuration wouldn't duplicate.
Of course, if you're going to make copies of a CD before and after it's had the green-marker treatment this is another proposition altogether.
Whatever, if you can send to the UK ($5.50 from the US) I'd be one of your Guinea pigs.
People claim to hear differences between different analog interconnects somewhere in their system. If I put the IC between the source and the recorder (or between source and preamp) this, in my opinion, is a perfectly valid test configuration, since it's in no way different from what people normally do.
Are you trying to hear the difference between digital interconnect cables? I mean, are you making the recording from a digital source to a digital recorder without going through analog? I would propose then that you won't hear a damn thing, as jitter is only an issue in the conversion from digital to analog, where the timing is important. If I am wrong about what kind of test you are doing, my apologies; then maybe you could clarify it a bit for me.
Regards
If everything was kept in the digital domain (no D/A A/D conversion) then I don't see how that is possible.
No, the idea was to compare a recording made with a digital cable to a recording made with an analog cable.

As for the phono cartridge comparison, I have only one cartridge. What I could do is use different capacitance settings on my phono stage and see if there's an audible difference.
BH.
I know from our previous postings that we are both sure that interconnects make a significant enough difference in our systems to justify the (in our cases not insignificant!) cost. Even allowing for the poor quality of CD-R copies (RFI/EMI again?), I feel confident that the differences between interconnects in the recording chain could be easily and reliably identified, but why bother?

I know from my own CD-R (audio) recordings from my stand-alone CD recorder that the choice of digital interconnect can make a very significant difference to the quality of recordings. With the questionable quality I feel is inherent in most PC burners/media and the very variable quality of interconnects, what on earth do you feel you would gain from these discs being played through your system?
No doubt being a Northerner you like to be helpful where possible, but I'm still not sure what practical use could be made of the results.
Bill H.
To be honest I'm bemused that there's so much hostility to this exercise, and because I don't feel threatened by whatever the results may be, I'm probably more qualified than most to give an unbiased account of what I hear.

If Klaus believes that all cables sound the same and wishes to prove this, then all I can say to Klaus is that the differences I've heard are so obvious it's not even open to question, and I assume anyone who suggests otherwise is either deaf or has a very poor system; any placebo effect isn't even an option, and if it was, the most expensive cable tested would always come out best.
I'd take part in Klaus's test because it would be interesting to see if his methodology actually worked, and if the differences of cables could be differentiated on a CDR.
If the differences couldn't be heard, the conclusion I'd arrive at is that his equipment or test was seriously flawed, not that the differences weren't there initially.

In short, Klaus and his test would be under the microscope, but at the end of the day the results aren't going to shake the world of audio whatever they are.
Best Regards,
Chris redmond.
I have made both digital and analogue recordings on my stand-alone CD recorder with a variety of cables and can confirm that the differences are indeed passed on to the copy CD, and are readily apparent. One would need to have a "tin ear" not to hear the difference. The differences were not subtle!

Klaus is, by choice, using a CD burner to generate the copies. Can he not hear the difference between an original and a PC-burnt copy, or is the rest of the system so poor that this is not blatantly apparent? A little over a week ago I sent some copy CDs to an inmate who was enquiring why I was so keen on avoiding PC burners and intent on using a stand-alone recorder. On playing the copies he emailed me to confirm that the quality of the sound on the copies I had sent (from competent rather than outstanding recordings) was amongst the best in his collection and so much better than copies he had burnt on his PC that he too was now a convert.
If Klaus cannot hear the problems in PC-burnt copies or indeed thinks that claims of relatively poor MP3 recordings are unjustified, is it his ears or his system that bears the responsibility? Or is he just acting as though he can't tell the difference because he would rather not admit it? Rhetorical questions I know, but there seems to be so little common ground in our approaches that you will understand my scepticism about the exercise.
I have, however, asked Klaus to convince me!
Bill,

I'm not using a PC burner but a professional stand-alone recorder that can also be used as a burner (USB). Recording speed is 1x, and if necessary I will use special audio CDs, not data CDs.
I think that if I can make a copy of Windows 98 to data CD and later install 98 from that copy CD to a formatted hard drive of a PC, and it turns out to run without any problems whatsoever, I would suppose, being a layman that is, that both the writing/recording and the CD are of sufficient quality as not to lose any relevant data. Why then should there be a problem with audio ?
Btw, I did not do any cable comparison myself; the main components of my system are pro gear with standard interfaces, so cable effects are not very likely to occur. Nor did I record any ABX tracks to CD to see if I can hear a difference.
Hello Klaus,

My ears tell me that PCs are a hostile environment for quality audio (RFI/EMI?), hence my refusal to use a CD burner for good recordings. Whether or not there are measurements that one could quote to account for the poorer quality I have consistently heard from PC-burnt copies, I care not. I am only concerned with whether the music replayed is of good quality. I am sceptical about statistics in relation to sound quality anyway. The fact that a fairly lowly SS amp could well measure much better than an equivalently powered tubed unit does not convince me that it will necessarily sound better!
Relating to the media involved, I have used both CD-R and CD-R audio discs with the same source & music using the same stand-alone CD recorder. Comments in other postings might make you feel that since the statistics are better (here we go again!) that CD-R data disks would be much more likely to make a satisfactory recording. However, in my experience and in my system the CD-R audio discs were/are more realistic. As always, this may not be the case for other people using other systems, but it is certainly true in mine!
Bill H.
> I think that if I can make a copy of Windows 98 to data CD and later
> install 98 from that copy CD to a formatted hard drive of a PC, and
> it turns out to run without any problems whatsoever, I would
> suppose, being a layman that is, that both the writing/recording and
> the CD are of sufficient quality as not to lose any relevant data.
> Why then should there be a problem with audio ?
You are confusing two different things here, Klaus. Data CD-Rs
have much better data redundancies and error-correction strategies
in order to ensure that a data file can be written and recovered
without corruption. Audio CD-Rs are written with much less
redundancy, on the assumption that the error correction will be
able to cope with a higher level of corruption, generally around
200 block errors per second or less.
In addition, audio data is "streamed," ie, read just the once and
fed to the converter, which means that any timing uncertainty will
affect the reconstructed analog signal.
Bits certainly are not bits, when audio is concerned. :-)
John Atkinson
Editor, Stereophile
The Tascam is able to write both audio and computer data (audio mode and USB mode). Would this mean that the two data writing processes are different, so that less care is used when writing in audio mode ?

Furthermore, what about the claim that anything in the recording chain affects the sound (for those studios which use cables like Analysis Plus) ?
Are you saying that because of the lower quality in writing audio data subtle differences are likely to be lost ? What about greater differences like CD vs MD or different capacitances of the phono stage ? What about the Sony walkman vs Wadia scenario ?
This could indeed be a valid technical reason that speaks against such a test. However if someone manages to score 100%, would this prove something ?
If I were you Klaus, I'd forget about trying to 'prove' anything conclusively.

Tests in almost every area you could imagine tend to 'suggest' a certain conclusion, with statements such as 'high correlation' and 'statistically significant' being the order of the day; 'proof' is almost always elusive and hence tends to be subjective.
> The Tascam is able to write both audio and computer data (audio mode
> and USB mode). Would this mean that the two data writing processes
> are different so that less care is used when writing in audio mode ?
Yes. Which is why a data CD-R has less storage area than an audio
CD-R with the same nominal capacity. Take an 80-minute CD-R blank
and burn it as an 80 minutes audio disc using uncompressed WAV or
AIF files. Now take a second 80-minute blank and burn the same audio
files on it as computer files. You won't be able to fit them all
on the disc!
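The capacity arithmetic behind that point can be checked directly. This sketch assumes the standard Red Book figures of 75 sectors per second, 2352 audio payload bytes per sector, and 2048 payload bytes for a Mode 1 data sector:

```python
# Red Book arithmetic behind the capacity difference: 75 sectors/s,
# 2352 payload bytes per audio sector vs 2048 per Mode 1 data sector
# (the other 304 bytes go to sync, header, and extra error correction).
SECTORS_PER_SECOND = 75
sectors = 80 * 60 * SECTORS_PER_SECOND     # sectors on an 80-minute blank

audio_bytes = sectors * 2352
data_bytes = sectors * 2048
print("audio payload: %.0f MB" % (audio_bytes / 1e6))   # 847 MB
print("data payload:  %.0f MB" % (data_bytes / 1e6))    # 737 MB
```

The roughly 109 MB difference is exactly the sync, header, and extra error-correction overhead that Mode 1 adds to every sector, which is why the same files won't fit when burned as computer data.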
> Furthermore, what about the claim that anything in the recording
> chain affects the sound (for those studios which use cables like
> Analysis Plus) ?
Whose claim is this? While many things have an effect, some of those
effects are large, some are small, some probably don't exist. I
don't know anyone who claims that _everything_ makes a difference.
> Are you saying that because of the lower quality in writing audio
> data subtle differences are likely to be lost ?
Maybe, maybe not. The 16-bit resolution of the CD medium may
not be sufficient to capture the differences you are trying to
detect, particularly as you are probably not going to use ADCs
with true 16-bit performance. Or the problems in people's CD
players, which never perform as perfect 16-bit devices, may
obscure real but very small differences. As was explained to you
by another poster, before you send out your discs you need to
experiment with differences that are known to be audible and see
if they survive the coding and playback. If they don't, then your
proposed test is meaningless.
> What about greater differences like CD vs MD or different
> capacitances of the phono stage ? What about the Sony walkman vs
> Wadia scenario ?
The same points apply. You are first obliged to verify your test
procedure before you can go ahead with the test.
> This could indeed be a valid technical reason that speaks
> against such a test. However if someone manages to score
> 100%, would this prove something ?
Yes, it would show that the difference was sufficiently large
that it survived the possible problems with your experimental
procedure. (That is, by 100%, you mean scoring a high enough
number of hits to be statistically significant, like 15 out of
15. 3 out of 3 does not prove anything.)
Conversely, someone scoring 0 on your test does not prove that
the difference was inaudible. It might be audible, it might not be.
All a null result from your test would prove would be that under
the specific conditions of your test, no difference could be heard.
This is an important point but one that gets lost on many of those
who organize blind tests to "prove" that something is inaudible.
John Atkinson
Editor, Stereophile
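The significance arithmetic here is simple enough to verify: the chance of scoring k out of k by pure guessing is 0.5 to the power k.

```python
# Chance probability of a perfect score of k out of k forced-choice
# trials, with a 50% guessing probability per trial.
for k in (3, 15):
    print("%2d/%-2d by chance: p = %.5f" % (k, k, 0.5 ** k))
# 3/3  -> p = 0.12500
# 15/15 -> p = 0.00003
```

A 3-for-3 result happens one time in eight by luck alone, which is why it proves nothing, while 15 straight hits would occur by chance only about three times in 100,000.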
> As was explained to you by another poster, before you send out your
> discs you need to experiment with differences that are known to be
> audible and see if they survive the coding and playback. If they
> don't, then your proposed test is meaningless.
These differences known to be audible are known to be audible from sighted listening tests, I suppose. I really don't think that we can use results from sighted tests as a reference.

Furthermore, I would consider it really bad luck if only those bits and bytes which express a difference between, e.g., cables were lost in the recording process.
Apart from that, if such differences are not audible on the CD this could mean both that the recording process does not get them onto the CD and that they are indeed not audible.
The Tascam uses 24-bit converters, so I suppose, layman that I am, that this issue is not a problem.
> These differences known to be audible are known to be audible from
> sighted listening tests, I suppose. I really don't think that we can
> use results from sighted tests as reference.
No. Before you even start your test, you have to verify that its
methodology is sensitive enough. I would do a dry run using something
like a 0.5dB amplitude difference, which prior work confirms is
audible. If your test methodology returns a null result with that
difference, it is unlikely to be any use with more subtle
differences, such as the ones you are talking about in this thread.
Doing correctly designed blind tests that produce meaningful results
is not easy. The literature is full of poorly designed and performed
ones (even by myself :-( ).
John Atkinson
Editor, Stereophile
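One way to prepare the suggested dry run is to write two 16-bit WAV test tracks identical except for a 0.5 dB level difference. This is a sketch only: the 1 kHz tone, the 5-second length, the reference level, and the file names are my choices for illustration, not anything the posters specified.

```python
# Generate two mono 16-bit/44.1kHz WAV tones differing only by 0.5 dB,
# for a dry run of the test methodology (illustrative sketch).
import math, struct, wave

RATE, SECONDS, FREQ = 44100, 5, 1000.0

def write_tone(path, gain_db):
    """Write a sine tone at the given dB offset from a ~-6 dBFS reference."""
    amp = 0.5 * 10 ** (gain_db / 20.0)        # dB offset -> linear scale
    frames = b"".join(
        struct.pack("<h", int(amp * 32767 *
                              math.sin(2 * math.pi * FREQ * n / RATE)))
        for n in range(RATE * SECONDS))
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)                     # 16-bit samples
        w.setframerate(RATE)
        w.writeframes(frames)

write_tone("track_a.wav", 0.0)                # reference level
write_tone("track_b.wav", -0.5)               # 0.5 dB quieter
```

If listeners cannot tell these two tracks apart after the burn-and-playback chain, the methodology is unlikely to resolve the subtler differences under discussion.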
0.5 dB difference is audible, that's a figure I have seen in the literature. But at which frequency, peak or valley, narrow or broad ? What signals did the research use, artificial ones or music ? Did they use headphones or speakers ? The listeners in such tests usually have been trained; I'm not. Where's the threshold for the "audiophilis communis" ? Where's my own threshold ? This looks like it's getting complicated very quickly.

Let's assume that I'm able to confirm my threshold to be 1.5 dB. You then would say, yes, your test is able to reveal such "enormous" differences, but it remains open whether or not it can reveal the subtle differences we are considering here. And we are stuck again.
You are saying that the differences introduced by cables, for instance, are more subtle than 0.5 dB. Some audiophiles claim to easily hear these differences, without any training. I bend my head in awe.
Shall I put your name on the list of participants ?
Hello Klaus,

Since I am a fairly self-indulgent cove, I am reluctant to give time and effort to any exercise that would not be likely to give me some sort of direct personal benefit. I am very happy with both the cables and recordings I have in my system, and would actively avoid any changes.
Convince me!
I think you should limit your listening panel to alert people who read your original post carefully and realize that supplying cables was not an issue!! ;-)
I think you should limit your listening panel to alert people who read your original post carefully and realize that supplying cables was not an issue!! ;-)

That lets me out. I reread his original post twice more and still haven't made that realization!
Sounds like a very interesting project; I'd like to participate!
I would also be willing to help out. I agree with the last poster in as much as you will have to type up some kind of protocol.
I suppose that you want us to check if we can find any differences in sound between tracks A, B and X using our own systems as they are, not to try different IC's in our systems.

This sounds quite interesting, and I would enjoy participating if you don't mind sending the CD-R to Spain.
What would the umpire do? Is he supposed to know "the secret"?
I would suggest including some questionnaire to make everybody's answers comparable and your statistical analysis easier. In this regard, I'd suggest you take a bigger sample of 30 inmates; that would make the results statistically significant using the most common tests. I suppose that all CDs would have a different track order to avoid "opinion exchange" among testers and any bias.
1. Yup, you should use your system as it is, without making any changes of cables, accessories or whatever.

2. The umpire knows the secret, yes. This would ensure that I can't mess around with the results.
Great then, I will enjoy participating in the project even though it's not a perfectly rigorous, scientifically designed test. It will be fun anyway ;-)
If you don't have other volunteers, I could also participate as the umpire, but I'd rather play with the CD.
Regards.
I know that the test would not qualify as a scientific test, given the limited number of samples etc. But still, I think it will shed some light on some issues.

Not being a registered inmate, I can't send you a mail via the asylum. Please contact me off-board.
Sounds interesting, I would like to participate.
Are you wanting to use specific interconnects? If so, I'm assuming you would supply them on a loan basis.
I think the interconnects or whatever you are to test are already recorded onto the CDs.
Anything that the signal has passed through in the recording / signal routing and processing chain has contributed its sonic signature to the end product. Sean
I have Van den Hul Bay C5, and I'll try to get some more expensive ones from a local dealer and/or some others from a colleague.
If you supply the interconnects, count me in. Also, advise how you want the tests performed i.e duration, # of trials, etc.