Audio Asylum Thread Printer: Get a view of an entire thread on one page
Maybe a bit click-baitish, but not really. Years ago I was quite literally banned from the Audio Science Review forums simply for questioning two tenets of the Toole/Olive speaker research: 1. that their research is superior because it is purely science based, and 2. that their subjective speaker test results are universally transferable to real-world use.
#1 is directly connected to #2, so let's start with #2.
Floyd Toole, followed by Sean Olive, created and used a system for double-blind speaker preference tests built around a very expensive mechanism dubbed the Harman speaker shuffler. It allows rapid switching among speakers in one room without changing speaker position, which makes quick-switching double-blind preference tests possible.
Personally I agree with the reason for the quick switching and the double blind protocols. I have no argument with Toole and Olive on the value of that.
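For readers unfamiliar with the protocol, the logic of a quick-switching double-blind comparison can be sketched in a few lines. This is a hypothetical illustration, not Harman's actual procedure: the speaker labels, the toy `listener` callback, and its built-in quality scores are all invented for the example.

```python
import random
from collections import Counter

def double_blind_session(speakers, listener, trials=20, seed=0):
    """Randomize which hidden speaker plays in each trial so that
    neither listener nor operator can tie a vote to a brand."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(trials):
        a, b = rng.sample(speakers, 2)   # hidden pair, random order
        choice = listener(a, b)          # 0 = preferred first, 1 = second
        votes[(a, b)[choice]] += 1
    return votes

# Toy listener that always prefers the speaker with the higher
# (made-up) quality score; real listeners are far noisier.
quality = {"AA": 0.8, "B": 0.6, "E": 0.4}
def listener(a, b):
    return 0 if quality[a] >= quality[b] else 1

votes = double_blind_session(list(quality), listener)
print(votes)
```

The randomized order and the hidden identities are the whole point: any consistent winner can then be attributed to the sound rather than to reputation.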
Here's the catch (sorry if this is old news for those familiar with all of this): the tests are done 1. in mono, 2. in one room, 3. from one speaker position, 4. from one listener position, and 5. using a very narrow range of source material.
The tenet is that the preferences listeners express under these very narrow conditions, conditions that do not represent the end usage of stereo playback, are universally applicable to stereo playback in a wide array of rooms, with various room treatments, in various speaker/listener positioning configurations, with a wide array of source material, regardless of the use of DSP room/speaker correction or DSP speaker crosstalk cancellation. (I am mentioning that particular DSP for a very specific reason.)
I call this a tenet because to date no one, not Olive or Toole, much less the disciples of their religious cult following, has ever cited any actual scientific research that proves the claimed universal transferability of those test results to the wide array of stereo playback possibilities.
Whenever I have pointed this out, the Toole/Olive disciples fall back on the same mantra: "read the Floyd Toole book Sound Reproduction: The Acoustics and Psychoacoustics of Loudspeakers and Rooms." I have. Twice. Cover to cover. Gone over specific chapters multiple times. If there are any citations of research showing transferability, I can't find them. When I mention this, the "disciples" retort with a barrage of ad hominem and zero substance. The ONLY research anyone has cited was Toole citing one study they did involving two speaker systems in stereo in two rooms of marginally different dimensions. No mention of speaker-placement optimization for the non-HK-based *competitive* product (important to always remember that the HK endeavour, however scientific it was, was also a commercial endeavour). No evidence of any attempt to optimize either room with room treatments for the non-HK-designed speakers. No use of DSP, etc., etc. IOW it was still a very narrow piece of research that hardly supported the tenet of universal transferability of those subjective speaker preference test results.
Pointing this out simply led to my being banned from the forum.
So I have a couple of questions for anyone who believes in the Toole/Olive approach to speaker design, and if anyone wants to alert either Toole or Olive to my post, please do. I'd love to have them make their case here on a neutral forum where questioning these things is allowed.
Question 1 (more of a request, really): let's see the research that shows the preference tests are universally applicable to stereo playback with all the reasonable variables I have already cited.
Question 2: how do you know your approach gives subjectively better results than the particular alternative of using highly directional speakers in a heavily acoustically treated room that substantially reduces all early room reflections and uses room/speaker-correction DSP and speaker-crosstalk-reduction DSP? You can't extrapolate the effects of those conditions on any highly directional speaker design from the HK protocols.
Follow Ups:
Harman's design target is intended to please most listeners, and I think it probably achieves that goal. If Revel ever figures out how to make speakers that don't look bland and cheap, they could be an audiophile darling. They still wouldn't be the best choice for everyone though.
One of Floyd Toole's better insights is the "Circle of Confusion", which makes it impossible for any single design target to work best for all source material. So loudspeaker selection cannot be purely objective.
But I find the criticisms of the speaker shuffler to be as overblown as the claims of merit. The whole point of the thing is to control the variables that aren't under the designer's control, so that when making comparisons using the shuffler, listener preferences will be based on things they can control. It's just smart engineering.
Another example of smart engineering is the CTA-2034 measurement method (aka Spinorama). It is IMHO the gold standard for loudspeaker frequency response measurements. If I could find spins for every speaker I get interested in, it would make it much easier for me to winnow down the field. It's still just FR though, so it can't be the final arbiter.
In my opinion, the problem is not with Toole & Olive's research methods, but people's understanding of them. You and I understand that there's more to loudspeaker performance than what the shuffler and spins allow one to compare. A lot of the Harman fanboys do not.
And unfortunately, Sean Olive seems to encourage them. He blurs the line between engineering and marketing.
"Accurate" in terms of wide frequency range, flat frequency response on-axis, no huge dips or flares in the off-axis, and low non-linear distortion (such as that caused by driver breakup or buzzy boxes). I can't imagine anyone actually considering these characteristics UNdesirable.
Conducting the listening tests in mono, with a single specimen of each speaker, was a wise decision as it allows the listener to focus exclusively on the tonal qualities of the speaker, not distracted by "imaging" or stereo effects.
Hiding the speakers from sight eliminates the sort of visual bias that often leads people to think the mahogany or walnut-finished speaker sounds "darker" than an identical speaker finished in white oak. The shuffler is a clever method of allowing the listener to hear two or more different speakers in rapid succession, auditory memory being notoriously short.
The tests have yielded useful data about peoples' sonic preferences (at least under these specific, controlled experimental conditions), and some useful performance targets for speaker designers. Guidelines, not dogma.
I'm mostly in favor of the Harman & related engineering developments. They invest substantial effort and treat it like a real scientific subject with appropriate controls and models but without too much oversimplification.
I've liked Revel speakers for their tonal correctness (very important for classical music), along with other speakers that score well on their metrics.
The big gap in my opinion is the use of mono-only measurements & criteria, as that misses a substantial psychoacoustic effect which could strongly influence preference scores.
For instance, I've been a long-time Magnepan fan. In single speaker mono, they're nothing special. But in stereo, they're magnificent.
I have direct experimental evidence. My maggies need repair and I'm currently using KEF Reference One as mains (+ subwoofer). They have Harman-ideal frequency response and dispersion, on par with Revel.
Individually, they're better than the maggies, but in stereo the maggies are a leap ahead. I.e. stereo KEFs are X% better than mono. Stereo maggies are 5X% better than mono.
I have looked at the data they present and I am of the opinion it supports your conclusion, not theirs.
Can you elaborate more about what specifically you saw and why?
If you look at the data from Toole's tests comparing mono testing to stereo testing, you will see a wider dispersion of preferences in mono and substantial deviations in the results from mono to stereo. To conclude that testing in mono is "better" is to conclude that the results in stereo are inaccurate. But they can't be. The tests in stereo are tautologically correct. That's how we listen. So if a speaker tests badly in mono but tests well in stereo, as is the case with speaker BB, that demonstrates a problem with testing in mono. Laughably, the disciples of Toole and Olive conclude that the favorable test in stereo shows there is something wrong with listening in stereo. That's an absurd conclusion that can only be reached if one takes the belief that testing in mono is "better" as dogma.

These test results also do not support a lot of their assertions, such as that people have similar tastes to the point that we can say taste is more or less universally the same. The individual preferences for each speaker are wider than the average differences between the different speakers. IOW, the listener preferences are more different than the speakers themselves. It gets even more erratic when you look at preferences with different program material.
Edits: 04/03/22
Three speakers are ranked by listener panels in both monophonic and stereo modes for (a) Sound quality and (b) Spatial quality. In all cases, for both Sound and Spatial quality, the speakers are ranked more similarly in stereo mode than in monophonic mode.
In the case of Sound quality, the relative rankings of the speakers remain the same between monophonic and stereo. However, in the case of Spatial quality, the relative positions of speakers 'AA' and 'E' are reversed.
On the face of it, it would seem that monophonic listening reveals Sound quality differences more accurately than stereo. But for Spatial quality, monophonic listening might not be valid.
So what qualities cause a speaker to be ranked differently for Spatial quality? Different reflection characteristics are the likely culprit. Hence dispersion is likely to be the major factor, and not only the dispersion width but also possibly different response curves at different frequencies.
It is also pretty clear that dipole speakers have very different reflection characteristics than unipoles and are thus likely to have significantly different Spatial qualities.
Dmitri Shostakovich
Edits: 04/04/22
No doubt the difference is reflections. What I find puzzling is how anyone can possibly not see the elephant in the room... the reflections in the room. There is no imaging in mono. Period. Any sense of "spaciousness" one gets from listening to a speaker in mono is generated by the listening room's interactions with the speaker's radiation pattern. That's not imaging. Imaging is a result of spatial cues being encoded in the stereo recording by the differences between the two signals and decoded in playback by our two ears detecting those differences. That does not happen in mono.
The test is absurd. The idea of imaging and spaciousness in mono is absurd. And clearly the results show the disconnect in trying to judge those particular properties in mono.
You've obviously never heard good mono. A single microphone, in the proper hands, is capable of picking up the necessary reflections, echo, reverberant decay, and other acoustic information helpful or necessary for providing a three-dimensional image through a single speaker or through two speakers. Obviously some mono recordings are better than others, just like stereo recordings.
That is interesting. Of course, it should be noted that not all mono recordings sound alike; some sound just like good stereo. And some stereo recordings sound like mono. It's easy to rig the system here, guys. Take away the spatial comparison and what are you left with? Throw away an outlier or two and they're pretty close, too close to call.
The conclusion that we listen in stereo somehow makes stereo tests more valid than mono tests is probably a logical fallacy. How we hear is not like how stereo records are made, although probably mono binaural recordings come close. Some people claim 78s sound better than stereo 33s, for example, and mono 45s often sound superior to stereo 33s. Yes, definitely a logical fallacy. I suspect you are probably taking the idea that stereo is better than mono as dogma.
> > Hiding the speakers from sight eliminates the sort of visual bias that often leads people to think the mahogany or walnut-finished speaker sounds "darker" than an identical speaker finished in white oak.> >

I never thought any of that and I never knew anyone who did. It's not logical. Now, if you said most people think big speakers generally have better bass than smaller speakers, or that flat panels sound more transparent than box speakers, that might be a different story.
Edits: 03/31/22
If the panel of listeners knows which speakers they're listening to, whatever preconceptions they might bring into the comparison could influence their preference. Blind listening in mono is not a good way to audition speakers for a potential purchase, but it's a good way for engineers to evaluate design targets and see how they're doing relative to the competition.
Perhaps that kind of logic applies to inexperienced listeners but not to experienced listeners. Not every audiophile is plagued by psychological issues. Some people manage to get themselves all psyched out.
Much bigger issues than the color of speaker wood are listener skill and test-system performance. Most people have no idea in the world how to set up a legitimate test system. That's why I always comment that negative test results of a DBT have no meaning. There are many factors that influence the performance of a given system. Shall we list them all? You can go first.
So it has nothing to do with logic or logical fallacies. The evidence in the Toole study on sighted biases impacting speaker evaluation was solid, and unlike some other studies, that one controlled all other variables.
scuse, I meant refuge
I'm sure it's not true.
it must be that wild Virginian in you
> > "Accurate" flat frequency response on-axis, no huge dips or flares in the off-axis> >
Interesting thing this flat frequency response. It doesn't lead to a flat frequency response in room from the listener position in stereo. The "room targets" from Toole/Olive studies are not flat. https://www.harman.com/documents/AudioScience_0.pdf page 13/14.
As for off-axis response, I think there is some circular reasoning here. It is certainly important in a room where early reflections are quite audible. Toole and Olive insist on this being the ideal room because of earlier studies they cite showing that early reflections enhance the sense of spaciousness. So in effect it becomes a self-fulfilling prophecy. They do not ever test speakers that have issues off-axis in rooms with room treatment that might be ideal for such designs. In effect they are judging a dragster as a poor race car because it doesn't corner well.
Furthermore, back in the 80s there wasn't much you could do to change the frequency response of a speaker without doing more harm than good. Now with DSP EQ you can take a speaker with otherwise low distortion but poor on-axis frequency response and pretty much completely fix it.
What could also have been done, from the 80s right up to today, was to fix some frequency response issues in speakers that did not fit the off-axis frequency response bill through careful speaker/listener positioning and room treatments.
IMO this is one of the elephants in the room.
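To make the DSP EQ point concrete: a single parametric band of the kind found in the widely used RBJ Audio EQ Cookbook is enough to notch out a measured on-axis peak. A minimal sketch, where the +4 dB peak at 2 kHz is a made-up example, not a measurement of any real speaker:

```python
import cmath, math

def peaking_biquad(f0, gain_db, q, fs=48000):
    """Peaking-EQ biquad coefficients per the RBJ Audio EQ Cookbook,
    normalized so a[0] == 1."""
    A = 10 ** (gain_db / 40)
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    a0 = 1 + alpha / A
    b = [(1 + alpha * A) / a0, -2 * math.cos(w0) / a0, (1 - alpha * A) / a0]
    a = [1.0, -2 * math.cos(w0) / a0, (1 - alpha / A) / a0]
    return b, a

def magnitude_db(b, a, f, fs=48000):
    """Filter magnitude response in dB at frequency f."""
    z = cmath.exp(2j * math.pi * f / fs)
    num = b[0] + b[1] / z + b[2] / z ** 2
    den = a[0] + a[1] / z + a[2] / z ** 2
    return 20 * math.log10(abs(num / den))

# Cancel a hypothetical +4 dB on-axis peak at 2 kHz with a -4 dB band:
b, a = peaking_biquad(2000, -4.0, q=1.4)
print(round(magnitude_db(b, a, 2000), 2))   # -4.0 dB at the center frequency
```

The catch, of course, is that an electrical fix like this corrects one axis only; it cannot change the speaker's directivity.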
> > Conducting the listening tests in mono, with a single specimen of each speaker, was a wise decision as it allows the listener to focus exclusively on the tonal qualities of the speaker, not distracted by "imaging" or stereo effects.> >
In blind tests one need not be "distracted" by something like imaging. One can focus on each element of the sound quality individually. And imaging, by Toole and Olive's own admission, is nearly half of what makes a speaker sound subjectively good. And I definitely dispute their assertion that mono testing universally transfers to stereo when it comes to imaging. They have not proven that at all. In fact their own data shows otherwise.
I would go so far as to say their interpretation of their own data in their mono vs stereo tests is skewed towards wanting mono to be a better test. I think their interpretation of that data is way off. Akin to a famous dad joke.
A scientist is studying frog jumping. "Jump, froggy, jump!" The frog with four legs jumps four feet. Cut off one leg; the frog jumps three feet. Cut off two legs; the frog jumps two feet. Cut off three legs; the frog jumps one foot. Cut off four legs. "Jump, froggy, jump!" Nothing happens. "JUMP, FROGGY, JUMP!!!!" Nothing happens. Conclusion: a frog with no legs is deaf.
I largely agree with everything else in your post.
> > Interesting thing this flat frequency response. It doesn't lead to a flat frequency response in room from the listener position in stereo. The "room targets" from Toole/Olive studies are not flat. https://www.harman.com/documents/AudioScience_0.pdf page 13/14.> >
Are you familiar with the circle of confusion problem? Music is produced using speakers that may have a flat on-axis response, but definitely don't have a flat power response. And different engineers are using different speakers. Hearing the music as the engineers heard it or intended it is a commonly stated goal, and that requires speakers with on-axis and off-axis frequency response characteristics similar to what was used by the engineers. You might ask why engineers and end users don't just equalize to flat at the listening position, kind of like the X-curve utilized in film production? There are two reasons, but let's talk about the most obvious one first. Engineers can't do that because their end users are listening on all sorts of different setups: hi-fi speakers, headphones, ear buds, car stereos, table radios. The vast majority are not equalized flat at the listening position, so the product would just sound bad for the majority of listeners. And listeners can't do that either, because the vast majority of recorded music available wasn't produced that way. Hence the circle of confusion. The Harman curve is an attempt to find a target that works well for the majority of recorded music out there. It's not going to be ideal for all source material or all listeners, but it's a reasonable starting point.
The other problem with equalizing for flat response at the listening position is that it prioritizes the loudspeaker power response over the on-axis response. That can lead to screwing up the on-axis response because the EQ is compensating for flaws in the off-axis response. Most speakers with dynamic drivers have an off-axis dip just below the crossover point between mid & tweet due to the mismatch in directivity between the two drivers. If you correct for that at the listening position, it often creates a peak in the on-axis response in the upper midrange or lower treble, which makes the speaker sound forward or bright.
> > As for off axis response I think there is some circular reasoning here. It is certainly important in a room where early reflections are quite audible. Toole and Olive insist on this being the ideal room because of earlier studies they cite showing that early reflections enhance the sense of spaciousness. So in effect it becomes a self fulfilling prophecy. They do not ever test speakers that have issues off axis in rooms with room treatment that might be ideal for such designs. In effect they are judging a dragster as a poor race car because it doesn't corner well.> >
I don't agree with Harman on early reflections. Late reflections enhance spaciousness, not early reflections.
Psychoacoustics research suggests that early reflections are hard to separate from the direct sound. Maybe people hear them differently, but to my ears, early reflections cause time smear and hurt imaging but don't affect the tonal balance that much. When I see them in the impulse response, I treat them. But most people don't. If I were designing loudspeakers to sell, I would assume that most of my customers are listening in untreated rooms. Some rooms will have irregular geometry and/or stuff that breaks up the early reflections and damps the reverb tail, but other rooms will be quite lively. In order to design something that will sound good in a lively room, you really need to pay a lot of attention to the off-axis. So even if they're wrong about the desirability of early reflections, I still think they are doing a good thing by trying to control the off-axis response.
I also agree with you that listeners who have put effort into treating first reflections (like me) don't need to look as critically at the off-axis.
> > Furthermore, back in the 80s there wasn't much you could do to change the frequency response of a speaker without doing more harm than good. Now with DSP EQ you can take a speaker with otherwise low distortion and poor on axis frequency response and pretty much completely fix it.> >
Completely fix it? No. You have to make a tradeoff with EQ. You can use it to optimize the on-axis anechoic response, or to optimize the directly arriving sound, or the early reflections, or the overall power response, or some balance of all the above. But you can't optimize them all simultaneously. If listening in a near-field setup in a treated room, the best choice IMO is to measure the speaker under pseudo-anechoic conditions and optimize the on-axis response. If listening in the far field in a large, lively room, it's better to equalize at the listening position, but going for a house curve rather than flat.
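For what it's worth, a "house curve" target of the sort described is usually just a gentle log-frequency tilt above some knee. A toy generator, with the slope and knee values chosen arbitrarily for illustration (they are matters of taste, not anything prescribed by the research under discussion):

```python
import math

def house_curve_db(f_hz, tilt_db_per_octave=-1.0, knee_hz=100.0):
    """Target in-room level relative to the knee: flat below knee_hz,
    then a constant dB-per-octave downward tilt above it."""
    if f_hz <= knee_hz:
        return 0.0
    return tilt_db_per_octave * math.log2(f_hz / knee_hz)

for f in (50, 100, 400, 1600, 12800):
    print(f"{f:>6} Hz  {house_curve_db(f):+.1f} dB")
```

Equalizing the listening-position measurement toward a target like this, rather than toward flat, is what keeps a far-field correction from sounding thin and bright.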
> > In blind tests one need not be "distracted" by something like imaging. One can focus on each element of the sound quality individually. And imaging, by Toole and Olive's own admission, is nearly half of what makes a speaker sound subjectively good. And I definitely dispute their assertion that mono testing universally transfers to stereo when it comes to imaging. They have not proven that at all. In fact their own data shows otherwise.> >
We completely agree on this. It's an assertion based on presumptions not backed up by data, and it certainly runs counter to my experience. Some of the best imaging speakers I've heard are not very flat. For example, Michael Borreson (designer of Raidho and Borreson speakers) specifically tailors the frequency response of his designs to enhance soundstage depth and imaging. That creates other problems, but with the right music selection it can be pretty impressive.
Early and late reflections both "enhance spaciousness" sometimes. Early reflections might be hard to distinguish from direct sound, but sometimes you don't need (or want) to distinguish the two. Spectrally correct wall reflections combined with direct sound can create the most massive "epicenter of sound" possible. And while "imaging" might deteriorate slightly, "soundstaging" or the overall sense of scale will probably improve.
Early reflections and direct sound combine to create a virtual wall of sound. The stronger the wall reflections, the larger this virtual wall seems to be.
A large virtual wall of sound and speakers that seem to disappear within it are two important ingredients of "spacious" sound.
Edits: 04/04/22
I am quite familiar with the circle of confusion. It goes much deeper than what you describe. The fact is, over the 100+ years of recorded music there has been no standardization of monitoring or recording. But that's reality. It can't be fixed.
As for fixing frequency responses, of course no EQ can fix a speaker for all environments simultaneously. That's the nature of speakers and listening spaces. But it can do the job for any specific listening space.
As for flat response, I think this is an issue that a lot of Toole/Olive disciples don't really get. Indeed, flat response from the listener position tends to be bright, and the HK-based designs with wide dispersion don't measure flat from the listener position in stereo. So one can argue flat isn't such a simple issue in audio.
But my recent experience with the BACCH4Mac DSP has been making me rethink stereo playback as a whole. When I set the EQ to flat without the DSP, it is too bright and forward sounding, but with BACCH4Mac, flat frequency response from the listener position becomes the subjectively best sound. The DSP separates the sound sources from their reverb so well that it alleviates a congestion and nasality in the treble region that you wouldn't know was there until you hear the DSP fix it. I think there is a lot of comb filtering being fixed as well. IMO the Toole/Olive fanboys need to take a good hard look at speaker crosstalk cancellation DSP. It has caused me to rethink a lot of things, including the issue of accuracy.
Toole certainly cannot control for the effects of individual rooms, but he didn't ignore room reflections. To address this, Toole found that uniform response, i.e., similar frequency response curves at various degrees off-axis, was also a critical aspect of measuring for good speaker performance. It's true that a constraint of Toole's testing was that it only reliably pertained to unipole speakers and not so much to dipole models such as electrostatics or Magneplanars. But at least for unipole speakers, IMO, soundstage as well as transparency can be inferred from single-speaker testing. Both of these attributes, of course, mostly depend on the recording: accurate (vs. euphonic) sound must not be judged on the seeming ability of playback components to simulate such effects.
If there is one aspect of speaker performance that Toole and Olive underplay, it's probably distortion. Distortion harmonics are easily detected by objective tests. But again, similar to electronics, it's moot whether people actually prefer a little distortion, say 2nd/3rd order, versus none.
Single-speaker audition, according to tests by someone, maybe Olive, tends to make listeners more attuned to speaker differences than stereo listening.
Blind testing is valid because people tend to have biases based on appearance, price, etc. Geoffkait and I, needless to say, do not but most people do.
Dmitri Shostakovich
Edits: 03/31/22
> > Toole certainly cannot control for the effects of individual rooms but he didn't ignore rooms reflections.> >
Not only did he not ignore them, he made them a requisite of speaker performance in the testing room. So he in effect closed the door on speakers that "don't corner well."
> > To address this Toole found that uniform response, i.e. similar frequency response curves at various degrees off-axis was also a critical aspect of measuring for good speaker performance.> >
He simply limited the scope of the studies to commercial interests: buyers with small, "normal," or "average" rooms, with no consideration of dedicated rooms that can, and often will, be supplemented with room treatment.
> > It's true that a constraint of Toole's testing was that it only reliably pertained to unipole speakers and not so much to dipole models such as electrostatics or Magneplanars.> >
Ah, but that wasn't their conclusion. Their conclusion was that all of those dipoles were simply inferior designs.
> > But at least for unipole speakers, IMO, soundstage as well as transparency can be inferred for single-speaker testing. Both of these attributes, of course, mostly depend on the recording: accurate, (vs. euphonic), sound must not be judged on the seeming ability of playback components to simulate such effects.> >
I don't think their tests offer much support for this belief. There are clear gross flaws in their stereo vs. mono test. It is way too small a study to draw concrete conclusions from, and IMO their interpretation of the data I am seeing is more wishful thinking than objective analysis.
... they prefer not to believe them. The same is true for measurements in general. More broadly, why do people reject science in favor of pseudoscience and misinformation? ... Because the latter is more consonant with their biases and prejudices.
Inasmuch as individual rooms tend to reflect different frequencies to differing degrees, there can be no such thing as a single, fixed standard for anticipating the result of reflections. The best anyone can do is to look for as nearly identical frequency response curves on- and off-axis as possible, which makes the grossly simplifying assumption that all frequencies are reflected equally.
Toole's experiments included the effect of standard-room reflections. So with regard to dipole planars, perhaps he is right that the samples he tested were simply poor designs. OTOH, dipoles will have different reflection characteristics from unipoles, so anechoic and "Spinorama" results are potentially misleading.
Dmitri Shostakovich
Absolutely true. There just can't be a "standard" room. And the HK-based designs are aimed at common rooms. Other speaker designers make their assaults on the state of the art with the expectation that the audiophiles who are after state of the art will make the effort to tailor the room to work as well as possible with the speakers. It seems the HK camp doesn't believe this is a legitimate approach or that it might work.
There was a time in the not-too-distant past when a person could go to a local audio store where he could listen to things or ask questions, and if the place was trustworthy, he could get a straight, honest answer on anything audio. I was fortunate enough to travel into Dunellen and listen and talk with George and Mark at Personalized Audio. Lotsa stuff there: electrostats, tubes, TTs, many names that were once prime and made of dreams.
They went on to bigger things like Melos Audio (Song) and were even busier then, but I remember the after-hours parties where everyone would listen to anything they had in the store. Questions answered by guys (one female once in a while, and behavior improved of course) who had been around the block for many years, mostly tube, some SS. And shootouts with different brands.
Those days are now long gone, but at least I learned enough to trust my ears, not some online reviewer or a website that became popular because it touted its numbers, data, nomenclature, or subjectivities. Guidance here on AA is best verified through known people who share a similar philosophy and wisdom. If you prefer amp feedback and impersonalized instruction, then so be it.
We've all had the wrong girlfriends at one time or another, just like equipment. I trust my ears, knowledge, and experience, not flashes in the pan, which are too common in audio.
> > Maybe a bit click baitish but not really. Years ago I was quite literally banned from the Audio Science Review forums for simply questioning two tenets of the Toole/Olive speaker research.> >
I think this is wrong..... But it's a private entity with its own rules.... (It's no different from Twitter banning someone inexplicably.) The moderators can ban you for any reason they so please.
If this were a truly public space, like a government provided forum, it would be a different story.
> > 1. That their research is superior because it is pure science based 2. That their subjective speaker test results were universally transferable to real world use.> >
I guess whoever defines and dictates the "science" makes the rules......
> > #1 is directly connected to #2 so let's start with #2> >

> > Floyd Toole followed by Sean Olive created and utilized a system for doing double blind preference tests for speakers using a very expensive mechanism dubbed the Harman shuffler. It allows quick switching of speakers in one room without changing speaker position which allows for quick switching double blind preference tests.> >
The best speaker location in a given room for one speaker may be totally different from the best speaker location in a given room for a second speaker..... No two speakers perform ideally in the exact same location in the room......
And the Linn people would insist evaluating speakers with no other speakers in the room.....
My point is any test method is actually arbitrary, and the actual science is a lot broader than a lot of these people can even imagine.
Personally I agree with the reason for the quick switching and the double blind protocols. I have no argument with Toole and Olive on the value of that.
For this angle, I totally disagree.... This isn't even a "double blind protocol"...... Whether the test method is valid requires independent audit.... (And an audit might determine the test to be valid.... But it should not be called "double blind".... Because it is totally removed from the actual protocol for "double blind" testing.)
Here's the catch (sorry if it's old news for those familiar with all of this): The tests are done 1. in mono 2. in one room 3. from one speaker position 4. from one listener position 5. using a very narrow range of source material
You personally revealed flaws in the test methodology.....
I don't think "soundstage," for example, can be evaluated with one mono speaker.
The tenet is that the preferences listeners have under these very narrow conditions, conditions that do not represent the end usage of stereo playback, are universally applicable to stereo playback in a wide array of rooms with various room treatments, in various speaker/listener positioning configurations, with a wide array of source material, regardless of the use of DSP room/speaker correction or DSP speaker crosstalk cancellation. (I am mentioning that particular DSP for a very specific reason.)
You stated more flaws in the test methodology.... (You stated this more articulately than I ever could.)
I call this a tenet because to date no one, not Olive or Toole, much less the disciples of their religious cult following, has ever cited any actual scientific research that proves the claimed universal transferability of those test results to the wide array of stereo playback possibilities.
Agreed......
Whenever I have pointed this out, the Toole/Olive disciples fall back on the same mantra: "read the Floyd Toole book Sound Reproduction: The Acoustics and Psychoacoustics of Loudspeakers and Rooms." I have. Twice. Cover to cover. Gone over specific chapters multiple times. If there are any citations of research showing transferability, I can't find them.
Someone who has a good scientific background in the subject should kindly explain this..... Or maybe point it out in the books/chapters.... The key here is clarification and assisting in discovery, not "I know it and you must find out yourself".... (Most of these people cannot even explain what they claim, they use "books" as if they're certain the information is in them.... Yet the information may not even exist.)
When I mention this, the "disciples" retort with a barrage of ad hominem and zero substance.
I've learned to brush that aside..... (This was something I learned the hard way.... I didn't always do that.... Just look at some early discussions I've had here on AA.) What really confounds people is when the topic at hand is stuck with, avoiding any personal retorts.
The ONLY research anyone has cited was Toole citing one study they did involving two speaker systems in stereo in two rooms of marginally different dimensions. No mention of speaker placement optimization for the non-HK-based *competitive* product (important to always remember that the HK endeavour, however scientific it was, was also a commercial endeavour). No evidence of any attempt to optimize either room using room treatments for the non-HK speakers. No use of DSP, etc. etc. IOW it was still a very narrow piece of research that hardly supported the tenet of universal transferability of those subjective speaker preference test results.
This is a barrage of moving targets.... Coming to a conclusion from this is, uhhh... as subjective as it comes.
Pointing this out simply led to my being banned from the forum.
Sometimes, explaining differential calculus to a gorilla just isn't worth the time and resources...... And it often makes the gorilla really angry......
So I have a couple questions for anyone who believes in the Toole/Olive approach to speaker design and if anyone wants to alert either Toole or Olive to my post please do.
I've never been a speaker designer, and never read the literature.... It's probably a valid approach within reason..... Although it looks like the "within reason" was tossed out the window with your particular ordeal.
All design theories are just that, design theories..... A designer has to determine which ones he believes would be most effective in developing a superior marketable product. And no two designers have the exact same approaches, in this regard.
I'd love to have them make their case here on a neutral forum where questioning these things is allowed.
If their case is gospel within their own fiefdom, it's their problem.... I think proving a specific "scientific" entity wrong wouldn't make a dent in regard to the at-large design practices in the audio industry.....
There are a lot of designers who seem to value Audio Science Review's validation of their products..... That's no different from having approval from THX..... It otherwise doesn't really mean much.
question 1. (more of a request, really) Let's see the research that shows the preference tests are universally applicable to stereo playback with all the reasonable variables I have already cited.
What these individuals do within the fiefdom is what it is..... They can be nice and participate in an outside forum, but that doesn't mean that they will..... (What goes on within the fiefdom will be censored, and nobody will realize it has taken place.... Unless they also happen to visit AA. But that is likely less than one percent of the subscribers there.)
question 2. How do you know your approach gives subjectively better results than the particular alternative of using highly directional speakers in a heavily acoustically treated room that substantially reduces all early room reflections and utilizes room/speaker correction DSP and speaker crosstalk reduction DSP? You can't extrapolate the effects of those conditions on any highly directional speaker design using the HK protocols.
I think they're framing a leap of faith as "science".... Otherwise it does seem outlandish enough to where nobody outside this fiefdom would really take it seriously.
Let's take a look at that statement more closely. The general idea is that the drivers of other speakers in the room would act passively and interfere with the pure sound of the speakers under test. That might be true if there were a passive radiator in the room, but probably not for most other speakers; they won't react enough to acoustic waves in the room to be audible. So, if the sound is degraded, it probably isn't related to the theory that secondary speaker drivers are responsible.
What is more interesting is that even if you removed the drivers from secondary speakers in the room, not even connected to the system, the sound in the room would be worse than if the secondary set(s) of speakers were removed from the room. Furthermore, any secondary electronics or cables in the room, even not plugged in or connected to the system, degrade the sound. But it gets even worse: any cell phones or landline phones in the room degrade the sound. Even if they're turned OFF.
This is just the tip of the iceberg, gentle readers. These issues and others like them are important when it comes to controlled blind testing, as most people are unaware of them or dismiss them out of hand. And they are some of the reasons I say there's no such thing as a 100% perfect test. There are too many variables, including unknown variables like these.
An ordinary man has no means of deliverance.
What would a speaker test look like that would cover listener preferences that
"was universally applicable to stereo playback in a wide array of rooms with various room treatments in various speaker listener positioning configurations with a wide array of source material regardless of the use of DSP room/speaker correction or DSP speaker cross talk cancellation."
I doubt such a test is possible.
-----
"A fool and his money are soon parted." --- Thomas Tusser
Nothing is perfect - if a DBT were perfect there would only need to be ONE trial - once you start with 10 or 16 or more, you are hoping to reach statistical significance.
But using basic logic - what do audiophiles and reviewers babble on about a LOT on audio forums and in reviews?
Imaging
and
Soundstaging.
Two things that can't be 'tested' in blind tests with one speaker in the middle of the room.
99.9% of everyone listens to Stereo systems (or multi-channel so even more than stereo).
So to be a valid test of any type - the test needs to be in at least "stereo"
I mean this is basic logic and the bare minimum for a credible test.
The reason I suspect most tests are not done in this manner is it's too hard and too expensive - which is fair enough - but then don't ask people to take the results as gospel especially when independent blind level matched tests of speakers have people choosing "different" speakers.
If there was only one trial and it was negative you would be unable to draw any conclusion. A single test that has negative results is meaningless because of *all the things that can and do go wrong in an audio test*. The best laid plans of mice and men oft go awry.
That's why I said a perfect test. A perfect test accounts for ALL 100% of anything that could go wrong.
This is what you learn in Psychology 101. The DBT in the case of audio is a psychology test - not a test that should be conducted by engineers who don't know what they're talking about.
They are grafting pseudo-knowledge of the DBT which was made for the medical profession onto audio testing - and it's not the same thing.
It's baffling to me that otherwise smart people keep putting all their eggs in the basket of tests that really aren't done very well, in terms of either validity or statistics.
And no offense to Toole, but he is a paid employee of a company that sells speakers - so his paycheque comes from getting the results Harman wants - even if he is 100% honest, this is called a "conflict of interest," and there needs to be independent peer review.
It baffles me that otherwise smart people ignore all this and just accept whatever Harman says.
One speaker in the middle of a room on a shuffler - I mean it's just flat out ridiculous that they view this as the gold standard of DBT and should be taken 100% as gospel.
One definition of a perfect test is one that cannot be passed. The dunking chair was an example of a perfect test, if the woman died in the dunking chair she couldn't have been a witch since you can't kill a witch.
"They are grafting pseudo-knowledge of the DBT which was made for the medical profession onto audio testing - and it's not the same thing."
DBTs are for anything pertaining to human studies and I'm pretty sure the ear/brain falls under that umbrella.
Psychoacoustics is an entire branch of science. It has been around for a long, long time, and a great deal of technology has been made possible by a multitude of studies in psychoacoustics. All recording and playback technology is tied to studies in psychoacoustics. All of it. And double blind protocols have been the gold standard for that entire field of study.
There is nothing inappropriate about using blind protocols in any listening evaluations. Quite the opposite. Without blind protocols the results are considered junk by any scientific peer review panel.
And when it comes to executing blind protocols in listening tests your title doesn't matter. Your protocols matter. Participants in blind listening tests don't turn deaf because the proctor doesn't have an M.D. behind their name.
I have no issue with listening blind and level matched - this is pretty obvious to take out a myriad of sight and past experience bias. The D and the B in DBT have no problem. It is the T where the problems arise.
To restate - listening blind and level matched removes all kinds of bias that any subject brings with them into the test. Indeed, I had a terrific professor who had all students write a code number for their essays and tests because he said that if he knows the students he may favour a student he likes and give them the benefit of the doubt when marking, or do the opposite for a student he didn't like. So when marking he had no clue who anyone was. At the end of the course, you put your name beside your code. Everyone knew they were equally treated. So blind definitely has an advantage.
In medicine, the blind test is set up where 1/2 of the patients get the real drug and the other 1/2 get the placebo.
The medical staff doesn't know which pill is which so they can't inadvertently lead the patient in knowing whether they are getting the real drug or the placebo.
This is a physiological reaction. The pill either works or it doesn't work and it can be measured. Depending on what the illness is it is possible for people who believe in the placebo as the real drug could be affected such that they were actually taking the real drug.
In Audio, there is a myriad of problems because it is no longer a biological or physiological test but has moved into the field of psychology. Biology is - I am sick with X and this pill cured X - it either works or it doesn't work.
Listening falls into a person's ability to transfer a sense into their brain and make something of it. And then spit out the correct answer. In psychology and education, we know about 'test stress'. We know that a student for example can stone-cold know an answer to a problem/question but when they are in the exam setting can't answer it. They freeze. They select the wrong answer and 5 minutes after the test is done - the light goes on and damn - "I knew that." It happens a LOT.
The brain wants to solve problems and move on - it stops the brain from becoming fixated on something - people who suffer from OCD have the opposite problem - they can't let things go until the problem in front of them is solved - their brain can't let it go.
An example of this is merely looking at cloud formation - people stare at the blob of clouds - the brain works to solve it by giving us some familiar shape from our knowledge base - I see a dog you see a dragon - it's a blob of clouds - we see these things because our brains want us to move on and not get stuck trying to solve patterns. It's an unreliable brain/sense interface but it does aid in our evolution because we don't want to be staring at the clouds when the bear is coming to eat us.
So the T is where things fall down. As does the tabulation of results - for some weird reason, some of these testing outfits add scores from different listeners together. As noted earlier, there is the idea of testing one speaker in the middle of a room - not two in stereo - and what about corner-loaded speakers like mine? There needs to be "fairness" to the products in such tests, and they need to be independently tested (peer review) to confirm the results.
I am supportive of listening blind and level matched - if you know what I own, you know that the only way I can convince people that it sounds better is to have folks listen level matched and blind. I have grown to be a pretty big supporter of it precisely because I want people to LISTEN and not prejudge because they read the middling measurements or looked at how it is set up in the room. People hear what they want to hear if they are biased for or against something when they walk into the room.
Again - my issue is with the testing parameters and the statistics (ie it needs more trials for subtle differences in auditions).
"To restate - listening blind and level matched removes all kinds of bias that any subject brings with them into the test. Indeed, I had a terrific professor who had all students write a code number for their essays and tests because he said if he knows the students he may favour a student he likes and give them the benefit of the doubt when marking or doing the opposite for a student he didn't like. So when marking he had no clue who anyone was. At the end of the course, you put your name beside your code. Everyone knew they were equally treated. So blind definitely has an advantage."

I don't think that example is analogous to audio testing at all. In fact, it's probably better all the way around for the professor to know his students more intimately and to figure out who the smart ones are and who the poor students are. If the class is very large the professor won't have an opportunity to know his students well enough to show favoritism; if the class is small, the professor can't help knowing his students more intimately. I would think knowing who his students are would allow him to help those that needed it during the course.
In blind testing isn't the general idea to avoid seeing the item(s) under test? As if seeing them or knowing the price would color the evaluation of its sound. That's a psychological bias, not favoritism. For favoritism to exist there has to be prior experience.
Edits: 03/30/22
Bias and favouritism are the same thing - you favour something before hearing it, or you dislike something because maybe you don't like the owner of the company, or you believe you hate metal tweeters, so you are convinced you will hate the speaker as soon as you see it has a metal tweeter.
As for the professor - it is the test and the essays submitted which will indicate "who the smart students are."
The student who doesn't ass-kiss - is quiet and does the work is evaluated on his/her merit alone. The sexy blond who has a gift of gab may flirt her way to a higher mark if the professor is the type to succumb to such things. I have seen it a few times. I was given a B in an acting course where we were told that the marks would be primarily based on effort, not ability - it was an elective so I took the course.
A pretty female student received an A and she missed 30% of the classes. So if the effort is the primary grading scheme then missing 30% of the course should count against her.
I had half a dozen back-and-forth e-mails with the professor citing the issue, and he changed my grade to an A. She was a better acting student than me, but the course was graded on effort, not ability - and he didn't offer the University grading scheme of A+, A, A-, B+, B, B-, as this was an elective course where the grade matters for overall GPA. If he had given me an A- or B+ I probably would not have bothered.
In another course in Geology, I handed a project in with the same answers as a pretty girl - I got a B- she got an A. Exact same answers. Not much I could do about it because that professor had Tenure and had I known I would have avoided that professor (a 6000-year-old Earther nut).
The codes were for assignments so if you got a lousy grade on an essay you could still go and get some help from the professor.
And he was quite generous to the folks who asked for help - in my case I was getting 100% and an A+ on every assignment and exam - I tanked the essay. Philosophy is a completely different style of writing from other fields. He allowed me to take another exam instead of the essay on the promise that I take a writing course. I took the exam - got an A+ and signed up for a writing course.
My final answer is expectation bias is not like favoritism at all. Be that as it may I think you probably agree that negative results of any single DBT are worthless, right? You did seem to agree there can be no such thing as a perfect test.
Edits: 04/01/22
A DBT only tells you that person in that particular test could not (or did) tell the difference between A and B within that particular test.
The result of the said test can't tell you thing one about A and B with a different person listening outside that particular test environment.
You're not following my reasoning. If a person can't tell the difference in a DBT it doesn't mean there is no difference. So what good is the test? You can't claim test article A failed the test. That's why I often say a single DBT - or any type of test - that has negative results has no meaning. If the results are positive for one test it might mean something, but for all we know he might be hearing things. Perish the thought. So, we won't know anything more until a lot more tests are done on different systems with different people.
Edits: 04/02/22
Well, there is value in a blind test, and that is to make sure people are not being ripped off by manufacturers who make crazy claims.
People suffer from all kinds of biases (looks bias, price bias, name brand bias, science bias, personality bias, weight bias).
Product A looks so much beefier and better than Product B and Product A is priced at $2,000 and Product B is $200, and product A is made by a famous name, and Product B is made by Best Buy, and Product A is made by a gifted salesman who can talk the talk and Product B is made by a seeming smarmy used car dealer.
Testing blind eliminates all that crap and compares how it sounds and ONLY how it sounds.
As a consumer - if I am not wealthy I don't want to spend $1800 more for something unless I know it's actually going to sound not only "better" but at the very least different.
A lot of people go in and can hear a difference sighted (with all the bias influencing them - not least of which audio dealers who will play the expensive unit slightly louder which is also known to impact preference).
So Product A which is supposedly a MASSIVE improvement in every way should be a massive improvement in every way when auditioned blind and level matched - if Product A is truly so great it should be truly so great when auditioned blind and level matched.
I.e., you should be able to select it as the pick of the litter with just your ears and without bias. You should be able to select it better than a mere coin flip.
That's the whole point of listening blind. I am not opposed to anyone who wants to do it this way because it's their money and they don't want to be fooled by the many many hucksters in the audio industry who if they weren't selling BS products wouldn't be able to get a job at a gas station.
My issue isn't with any of the above - I am for the above - what I am against is the mediocre rubbish tests like the single speaker on a speaker selector in the middle of a room that Harman does (complete worthless BS that is IMO so shoddy it was clearly designed to just help them sell their products) and people trying to jump to conclusions like - "these people in this test failed to distinguish a difference between these two cables so that means no one can tell any difference between any cables anywhere"
The science behind the DBT isn't wrong - it's the crappy ass implementation and conclusions people make about the results that leave much to be desired.
Moreover, the fact that Joe and Sally chose X speaker in a Harman controlled test with one speaker in the middle of a room against 5 other speakers doesn't say crap all about what Richard will choose with stereo speakers against 5 other completely different loudspeakers in a different room with different music.
Harman tests one speaker mid-room - my speakers are designed for corner placement - how does Harman fairly test for that? Does their shuffler have a backing corner? Doubt it. Who chooses the competing speaker? Do they choose blindly the competing speakers or do they choose them with pre-knowledge that they will perform badly mid-room and by themselves? Not mentioned in the report? Why are not all the competing speakers listed with serial numbers?
I'm afraid you still don't get it. That's OK, I'm plum wore out. This conversation can serve no purpose any more. As Bob Dylan says at the end of his records, good luck to everybody.
Clinical trials no longer use placebos for testing anything that has a measurable physiological effect. They just use control groups.
As DBTs pertain to speaker evaluation and the research of Toole and Olive, it's not really that complicated. They are pretty straightforward "what do you like better" tests.
Audiophiles do this under sighted conditions as a normal part of shopping for their audio systems.
"Psychoacoustics is an entire branch of science. It has been around for a long long time and a great deal of technology has been made possible from a multitude of studies in psychoacoustics. All recording and playback technology is tied to studies in psychoacoustics. All of it. And double blind protocols have been the gold standard for that entire field of studies."

Logical fallacy no. 1

"There is nothing inappropriate about using blind protocols in any listening evaluations. Quite the opposite. Without blind protocols the results are considered junk by any scientific peer review panel."

Logical fallacy no. 2

"And when it comes to executing blind protocols in listening tests your title doesn't matter. Your protocols matter. Participants in blind listening tests don't turn deaf because the proctor doesn't have an M.D. behind their name."

Logical fallacy no. 3
Edits: 03/29/22
I don't think there is an answer from a manufacturer's perspective. I think designers ultimately have two considerations: their interest in making money and their interest in expressing their personal ideals as a designer. I totally get the HK approach for commercial purposes. They try to make speakers with wide appeal. Great way to make money. I get that. I'm just not buying their claim of universal transferability. That is their claim. I'm not saying they don't make good speakers. I just don't buy the cultish idea that theirs is the only legitimate approach and that everyone else's is pseudoscience.
I've never understood this obsession of banning people..... It's as if the moderators on these sites believe they're somehow exhibiting authority of "high morals" or something......
I was once banned on a site called Hydrogen Audio..... The moderators there may have ultimately restricted their audience to almost nothing..... I haven't seen much go on over there in recent time.
Aside from maybe keeping order on a discussion board, I could never relate to such policy.
I got banned there too, for expressing a preference for vinyl playback.
"I got banned there too for expressing a preference for vinyl playback "
Are you sure it was *simply* for expressing that preference?
I'm a member over on ASR and I've continually expressed a preference for vinyl. Not even a hint of a ban for doing so. The general consensus is that personal preference is perfectly fine.
Similarly, the various claims made by the research cited by Toole et al have been batted around, scrutinized and critiqued for years at ASR. I haven't seen anyone banned for simply doing this. (I've even questioned the mono speaker blind tests on some grounds).
Which leads me to wonder: Is it possible it was the manner in which you posted that got you banned? I've seen others on forums say they've been banned from ASR, and in looking at their posts... it becomes more apparent why they were banned. I'm NOT saying this applies to you, but it isn't usual to be banned for the things you say you were banned for.
No one seems to be agreeing with Toole and Olive here so far. I really want to know if I am missing something
What's your game plan, gang up on them? Exorcize them? Point out all their logical fallacies? Lol
Edits: 03/29/22
Now that I think about it, I think it had more to do with a philosophical argument over the meaning of "better." The regulars were actually OK with my preference so long as I didn't say that vinyl sounded "better" and acknowledged that digital was "better." I was not having it. I told them that I agree digital media is more accurate, but "better" is inherently subjective and a matter of personal preference. So for me, vinyl is better. If I remember correctly, *that* argument was what did it. I had a lot of ad hominem thrown at me, and I am not the sort who turns the other cheek. So I gave as I was given, but I never wavered from my point.

IMO this was an ego-driven display of posturing, and ultimately they would tolerate my preference only so long as everyone, including myself, acknowledged that it was an inferior preference. It took them out of their comfort zone. Most who go there and make that argument usually mistakenly claim digital media is less accurate and audibly flawed. That was a comfortable argument for them to dismantle, since it was plainly wrong and easy to dismiss without them having to challenge their own belief system. I think the idea that more accurate doesn't necessarily mean "better" took them way out of their comfort zone, and they did not have a legitimate rebuttal.

Ironically, a few of them cited Toole and Olive's speaker research as proof that more accurate = subjectively "better." I really pissed them off when I pointed out that Toole and Olive were advocates of early reflections to enhance speaker sound quality, and so in effect they were claiming that something which is in fact less accurate to the source signal was "better."
Edits: 03/30/22
Did you do a controlled blind test to establish your preference for vinyl playback? As I recall, that is what Hydrogen Audio rules require.
Edits: 03/28/22
"Did you do a controlled blind test to establish your preference for vinyl playback?"
A preference is a preference...... I mean, I prefer wine over beer..... How does someone "test" that?
Yes. I think that is why I got banned. That did not fit their narrative, and it destroyed their tired mantras. Very uncomfortable for them. Ironically, you do not have to do DBTs to be allowed to express a preference for digital media over vinyl. Very hypocritical of them.
I lasted approximately one week. That was last year. But for me it was a very fun filled week.
who the hell has the time and resources to do something like that at home, is that the requirement?
If so I'm not going there.
I personally think it's a waste of time trying to appease this community, just for their "acceptance"...... (As if their "acceptance" validates one's preferences..... Preferences need not be validated.... A person likes what he likes.)
I don't even know what the point of this is.....
It is different if one is *marketing* a product, advertising superior performance, and should show controlled testing and results to back it up.
Of course you can prefer one thing to another. I have the Mozart Requiem with Colin Davis conducting the BBC SO and the John Alldis Choir on LP and CD. I prefer the CD version. Of course, this is a sighted preference. There is nothing wrong with forming preferences based on sighted listening. I have not tried to do a blind test to establish that my preference is based on the sound alone.
It can go the other way, of course. I have the set of Sibelius Symphonies with Lorin Maazel conducting the VPO on CD. I have some of them on LP, and the LP versions sound better to me. I quite enjoy the CD versions, nevertheless. I have not bothered to do a blind test to see if my preference is based on the sound alone.
On LP, I have recordings of the Debussy and Ravel String Quartets with the old Chicago Fine Arts Quartet. I have great recordings of them on CD by the Quartetto Italiano. I prefer the recordings with the Fine Arts Quartet. I have not established those preferences under blind conditions, which would be very difficult, since I am pretty sure I could tell when the LP was playing.
I agree that preferences are inarguable. The question being raised by many of the fair-minded folks at both forums is what effect bias has on preferences. I think it is fair to say it always has some effect when the person making the comparison knows what they are comparing. The thing is, the greater the actual differences between two things, the less impact bias has on the preference. As the two things being compared become more similar, bias effects have more influence on the preference, down to the point where audible differences don't exist and the preference is 100% bias effects.
You can't do a blind comparison between beer and wine because they are so different that one would almost never fail to clearly correctly identify which is which under blind conditions. But as it goes the greater the real differences the less the preference is affected by biases. So ultimately when blind protocols won't work anymore because the differences are so easily identified the need for bias controls no longer exist
For many things it actually isn't that difficult. But they selectively enforce it based on whether or not your opinions fit their narrative. You can make all the claims you want about sound quality without doing any blind testing if the claims fit their collective belief system
Nt
Nt
There are two parts to blind testing
Validity: A system under test needs to be as close as possible to how that item under test is used in the real world.
A Mono speaker in the middle of a room on a shuffler with listeners who are "trained" by the testers (a corporation "Harman" - who are in the business of SELLING you speakers) on what to hear/think is not REMOTELY the same as the end-users who are buying stereo speakers for their room in their home.
The tests miss the mark of Validity by an ocean. Validity was also referred to as the "range rule" back in the day. This is what happens when you let engineers run tests that should be run by psychologists.
Reliability: This is how repeatable the results are - the number of trials that support the result. Again, most DBTs fail here because they often use the 0.05 level of Statistical Significance, so if you can select A over B 9/10 times then you have "proven" that you can distinguish A from B better than chance; therefore you can tell the difference.
With audio (unlike medical where the DBT began) - it is profoundly different than a physiological reaction to a drug. Audio requires an ear/brain interaction which layers a slew of psychological issues. At the very least more trials would be needed to improve the reliability factor.
Statistically, scoring 6/10 across ten sessions with one miss, for a 59/100 overall score, would ALSO meet the 0.05 level of significance, meaning that this person can tell A from B to the SAME standard as the person who got 9/10 in a single session. So a person who gets 6/10 is deemed a failure by the tester - when that's not the case. It's bad science to draw that conclusion.
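The arithmetic behind that point can be checked with a one-sided exact binomial test. A minimal sketch (the `p_value` helper is my own, not something from the Harman studies):

```python
from math import comb

def p_value(successes, trials, p_chance=0.5):
    """One-sided binomial tail: probability of scoring at least this
    well by pure guessing (P(X >= successes) for X ~ Binomial(trials, p_chance))."""
    return sum(comb(trials, k) * p_chance**k * (1 - p_chance)**(trials - k)
               for k in range(successes, trials + 1))

# 9/10 in a single session clears the 0.05 bar (p is about 0.011)...
print(p_value(9, 10))
# ...but so does 59/100 pooled over ten sessions (p just under 0.05)...
print(p_value(59, 100))
# ...while 6/10 in one short session does not (p is about 0.38)
print(p_value(6, 10))
```

So the same listener who "fails" a single 10-trial session can meet the identical significance criterion once enough trials are pooled - which is the point about short sessions being underpowered.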
Ultimately, there is also the "Conflict of Interest" issue here - if a major company selling the product makes up the tests and draws a conclusion that 'naturally' concludes their product is the best product - does anyone just take them at their word? Big Tobacco concluded their cigarettes were safe - the same scientists are the ones who say MMCC (man-made climate change) is not real. See "Merchants of Doubt" for actual internal documents from Exxon and the like, whose own scientists brought it up and then were shut down.
Why would anyone take Harman International at their word? Single speaker on a shuffler - I mean seriously?
You fault the HK methods as lacking validity because the listening circumstances are different than the typical listening situation, which is certainly true. Your complaint is actually about **ecological** validity. It's an empirical question as to whether the speakers and frequency response curves that listeners prefer in the HK situation are what listeners would prefer in a more "normal" environment.
Are there any tests of ecological validity? I don't want to assume it doesn't exist just because the HK test situation differs from the normal living room. Without this experiment, don't assume the HK data is meaningless.
Concerning reliability, you don't need to increase the number of preference choices ("trials") per session. Rather, you need more participants. This will increase the statistical power of the research. Ultimately, you are looking for proof of test-retest reliability, so you want the SAME group of research participants to do the ratings on two separate occasions.
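The point about statistical power can be illustrated with a small exact calculation (a hedged sketch; the helper functions are my own, not from any cited study): at the same 0.05 criterion, a listener (or pooled group) whose true hit rate is only a modest 60% becomes far more likely to be detected as the total number of judgments grows.

```python
from math import comb

def binom_tail(k, n, p):
    # P(X >= k) for X ~ Binomial(n, p)
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

def critical_value(n, alpha=0.05):
    # smallest score called "significant" at level alpha under pure guessing
    for k in range(n + 1):
        if binom_tail(k, n, 0.5) <= alpha:
            return k
    return n + 1

def power(n, true_p=0.6, alpha=0.05):
    # chance that a listener with a real but modest ability (true_p)
    # actually clears the significance bar in n pooled judgments
    return binom_tail(critical_value(n, alpha), n, true_p)

for n in (10, 50, 200):
    print(n, round(power(n), 2))
```

With 10 judgments the detection probability is only a few percent; with 200 pooled judgments it rises above 80%, which is why more participants (and test-retest replication) matter more than squeezing extra trials into one session.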
P.S. I **am** a psychologist and not an engineer.
I think the issue is as you put it ecological validity. It's not that the HK tests are worthless but their tests don't really relate to what normal people do when they listen to music. The main problems in the short form are:
1) people buy two stereo loudspeakers and listen in stereo (or multichannel if H/T).
HK's test is in mono - one loudspeaker in the center of the room (stereo speakers are set up to the sides of listeners, not directly in front of them).
2) HK has a conflict of interest - they are in control of the test and control the listener training telling people what to listen to and for. People selling you the speakers and conducting tests to ensure their product wins said test is problematic.
3) with the above two problems the reliability is largely worthless because it is apples from a sour tree.
You can test 10 sports cars on rollers to show how great they are in terms of speed and acceleration and get statistically reliable results that they all beat up on a Toyota 4Runner, but if the real-world driving is up a pothole-filled mountain, then the reliable results are kind of worthless - i.e., the test conditions have to mirror what it is you are actually testing for.
So it is perfectly fine for HK to test say 10 speakers in their room and claim that X percent of people chose speaker A but it's not ok to extrapolate that result to suggest that those results will extend to stereo listening in a home. Or that it would extend beyond those other 9 "loser" speakers - see point 2.
I would far prefer Psychologists to run these tests if it were feasible. Professors often have assignments for students to conduct such tests with people who feel they have some ESP ability. We need an audiophile Psych prof to assign an audio DBT as I suggested somewhere here where it is more as you say "ecologically" valid. Seems like not a terrible idea for someone to have a Master's Thesis on it.
I don't think HK has a conflict of interest. Their interest is in selling speakers and if their testing allows them to build better speakers at each price point then the testing serves their interests as well as their customers
But I think where the commercial interests take precedence over the "science" is their assertion of universal applicability. That's where the "science" really is marketing.
I would think all speaker designers and manufacturers use some sort of listening tests to evaluate their products. And I don't think there is a way to do it that is both cost-effective and comprehensive. I think HK made reasonable commercial choices for a large-scale operation that needs large-scale sales to succeed. Such a company has to appeal to a mass audience.
And the Revel line of speakers are really good. So their methodology seems to work even in the high end market. I just don't buy their sales pitch that theirs is the "scientific way" and "the only legitimate way." It's one of a few legitimate ways to test and design loudspeakers.
Scott wrote,"I don't think HK has a conflict of interest. Their interest is in selling speakers and if their testing allows them to build better speakers at each price point then the testing serves their interests as well as their customers."
At a minimum, HK convinced you with their scam of claiming their speakers are "scientifically proven" by DBT to be better than Brand X. Same scam as The Amazing Randi, only with HK it is with the full support of a "real scientist," not some Las Vegas showman. It's another one of your logical fallacies that DBT helps HK build better speakers. In other words, a load of horse manure. You fell for their scam hook, line and sinker.
And all those speakers that are better than HK's - who knows how many, many hundreds one assumes - didn't use DBT to improve their performance, so what does that tell you about how valuable a tool DBT is? If the rule you followed got you to this place, what good is your rule? Lol
Edits: 04/08/22
RGA wrote: "So it is perfectly fine for HK to test say 10 speakers in their room and claim that X percent of people chose speaker A but it's not ok to extrapolate that result to suggest that those results will extend to stereo listening in a home."
I don't disagree, but what does "stereo listening in a home" mean? Does it mean good placement in a room with intelligent acoustical treatment (which the vast majority of people don't have), a glass-lined family room with a cathedral ceiling, a small apartment living room with carpet and lots of furniture including-- God forbid:) --a coffee table between the speakers and the listener? etc. etc.
RGA wrote: "2) HK has a conflict of interest - they are in control of the test and control the listener training telling people what to listen to and for. People selling you the speakers and conducting tests to ensure their product wins said test is problematic."
It depends. Biomedical researchers, including psychologists, have ways of controlling for this kind of bias (although it still occurs)
If the tests are actually conducted by people following a printed set of instructions who have no idea of what speakers are being tested, and if the researchers aren't keeping back disconfirming results, then the data isn't tainted. Also, replications by others are essential, and I don't believe that has occurred.
RGA wrote: "3) with the above two problems the reliability is largely worthless because it is apples from a sour tree."
Do we know whether the researchers are actually telling participants ("subjects") what to listen for? Or are they simply having the participants rate what they like? The idea of the HK curve is that it's supposed to reflect what people **like.**
Now, it's interesting that you mention psychology professors and ESP tests. Darryl Bem, a well-respected social psychology prof emeritus at Cornell, did a series of studies showing that (in his view) psychic phenomena exist. Do read this description of his results and think of alternative explanations for them:
https://news.cornell.edu/stories/2010/12/study-looks-brains-ability-see-future
Others failed to replicate his results. Bem then did a meta-analysis of replication studies and found support for his conclusions about psi:
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4706048/pdf/f1000research-4-8494.pdf
I remain a psi-skeptic, but I am open to persuasive evidence.
Book recommendation: My friend Stephen Braude, Ph.D., is a professor emeritus of philosophy, musician, avid audiophile, and past president of the Parapsychological Association. His book "The Gold Leaf Lady" (link below) is an entertaining account of some of his personal explorations of "anomalous experiences," as some psychologists call them.
Absolutely a conflict of interest. Just like The Amazing Randi, who had a million dollar challenge for anyone who could tell the difference between high end $20,000 speaker cables and Monster Cables. He also had a million dollar challenge for anyone who could pick out which CDs had been treated with The Intelligent Chip. The Amazing Randi's staff concocted a plan, a "protocol," of how the controlled blind tests would be run, such that there was not a snowball's chance in hell that The Amazing Randi was going to lose a million dollars. Are you kidding me? As I stated previously, everyone who demands controlled double blind tests has an ulterior motive.
There are two studies. I personally find the one I read to have some pretty substantial shortcomings. But even more significantly I don't think their own data supports their conclusions
"Validity: A system under test needs to be as close as possible to how that item under test is used in the real world." That presents a little bit of a problem, since audiophile systems vary widely in *performance*. Who picks the ones that represent the real world? Aren't systems inherently different with respect to a great many things - room acoustics, tweaks or lack thereof, isolation from vibration, care with details, break-in of cables, fuses, etc.? Is it possible that audiophiles who care the most about controlled blind tests would be the *least likely* to benefit from whatever item was under test?
Edits: 03/28/22
The reason why "blind" testing is so controversial in audio is because it is never even done correctly......
If the test tests the listener, as opposed to the product, then it's wrong......
The ABX test is to help one *personally* determine whether he/she prefers a product..... It can be done in the privacy of one's own home; he/she either likes the product or doesn't..... No harm, no foul.
The "double blind" test is what "audio science" types tout all the time, yet never do it correctly..... (It's ironic that the science of a true double-blind evaluation is totally disregarded.) It has nothing to do with "not seeing the gear"..... It doesn't even involve listeners comparing products..... It's basically how well a group of people takes to a component under evaluation relative to another group of people taking to a placebo. The "blind" in an actual double-blind study deals with the listener and test distributor not realizing that the product being evaluated could be a "placebo"..... But every listener is evaluating just one item, and making judgment on whether the sound is or isn't satisfactory.
I see no issues with the blind protocols used at Harmon. They are simple preference/rating evaluations. They are level matched as much as speakers can be level matched, and they are done with a scrim that hides the speakers from view.
To be "reliable" a test method only needs to be repeatable and replicable. That doesn't mean it is testing all the most critical factors which is what would make it "valid".
Dmitri Shostakovich
Would be to first attack the extreme ends of audio and get relatively unbiased listeners.
So you take out two hotel rooms of identical size and are deemed "good" rooms in terms of mirroring the average home-sized living room.
Room A: Has the best measuring gear that no one would question as being deemed good measuring - speakers where JA says something like "this is great engineering," ditto for amplifiers and, say, a CD player - whatever the best-measuring SS amp/CD player that has ever existed.
All this in room A.
Room B has some well liked system with bad measuring amplifiers, TUUUUBES or SETs (to make it even worse), then with well reviewed, well liked speakers that don't measure too well and the worst measuring CD players (Non Oversampling with TUUUUUBES - gee, sounds like Audio Note - but whatever), a horn, etc.
Then you black out the two rooms such that no one can see the stereo - you play at say 75dB at the listening chair. The same CD playing the same music in both rooms on repeat.
You go to the local university - you get 30 students in the music program who are young enough to still be able to hear stuff properly.
You pay them $40 and offer a free lunch over several days - one at a time they listen to the two rooms as much as they like - they each have a card - they each drop the card in the box of the room they felt resembled real music.
The reason these tests are not done is in the end they will still only tell you what other people heard and in those systems in those rooms.
Let's say the scoring after such tests were 25-5. That still doesn't help you, because for all you know you hear it like one of the 5. The stats can tell you you'd be more 'likely' to like one system over the other, but that's about as useful as movie reviews.
93% of the critics recommended Power of the Dog - I did not. 93% of listeners can like a Paradigm speaker - I am meh.
I'm not sure what that proves other than that movies based on popular books I've read (I've read 'Dune' several times) make me an unfair critic.
Same is true for the 'Lord of the Rings' and 'The Hobbit' series.
Toole proved, (yes, proved), that, under the constrained conditions of his testing, virtually everyone preferred accurate speakers. There isn't any need or benefit to pulling amplification into the discussion.
Not to fly off on the same tangent, but I have proved to myself over decades that I personally prefer accurate over euphonic amplification if I have accurate associated equipment and a top-notch recording.
Dmitri Shostakovich
At least you recognized the 'under the constrained conditions of his testing.'
You can test a car on rollers and prove that a Porsche is better than a Toyota Land Cruiser.
Then you can drive both vehicles off-road in the mountains and find out which one is actually better.
Validity matters.
.
Gsquared
I am very much a proponent of blind protocols for auditioning. And I do have an open invitation for anyone who is willing to bring either the JBL M2s or the Revel Salon Ultima 2s, the two crown jewels of the Harmon Kardon testing methodology to my home for blind level matched shoot outs.
Not to prove a point or win a pissing contest but to see if they really are better in my listening room.
some years ago.
He doesn't seem to understand the bias introduced by their fixed test procedure.
It seems a lot of people who agree with him don't get it either. I think it is fairly obvious that it creates a sort of "home field advantage" in that there is no consideration of optimizing the positioning or the room for competitive models.
The irony of it is that if the tests were so universally transferable to stereo playback under so many different conditions, it wouldn't "ruin" the test to try to optimize competitive models by adjusting their position in the shuffler. And yet doing that is seen as doing the test wrong.
they did have a short skit on whether a Stradivarius has a superior sound to a modern violin. The subject thought one was clearly superior; he assumed it was the Strad, but it wasn't.
with decided bias introduced. I commented on this many years ago...
Click here. Post includes link including comments and restrictions by one of the performers.
A friend of mine who is the assistant principal cellist with the San Diego Symphony Orchestra has two cellos. A very pricey one that she uses for concerts during the indoor season in Copley Concert Hall, and a less expensive one for the outdoor season. She doesn't want to risk damage to her expensive cello outdoors.
So of course I asked her if the difference in price made for a substantial difference in sound quality. Her answer was one that I never saw coming. She did not feel the more expensive cello inherently sounded better. The difference was that for her as a musician it was easier to get it to sound its best. The cheaper cello was simply more difficult to get to sound as good.
So the better cello allows her to focus more on performance and less on the instrument. I don't know how anyone could do scientific studies on that aspect of sound quality. It is a function of the instrument/musician feed back mechanism.
This is not to say that musical instruments do not have differences in sound quality. Clearly they do. But it is complicated by the "playability" of the instrument.
all the teachers I worked with had a particular choice of instrument they faithfully used all the time. A woodwind teacher had a USA Selmer Mark 6 tenor sax and said it was the best made. Reminds me of the Humble Pie song "Someone stole my axe today, best one that I had"
edit - once after the teacher had the sax re-padded he played it for me, and I'm standing in his lesson room listening to him do a few runs up and down. It was extremely loud and powerful and vibrated the air so much it was stinging my face. That's about the best I can describe how it was. People here don't understand just how hard it is to duplicate this with ANY system. It won't ever happen or compare to the real thing.
Edits: 03/30/22
I believe the sponsor of all this was the Harman group. I don't know of any hifi "Harmons", although the late Sidney Harman was a force to be reckoned with (love him or hate him). You might be conflating him with his long-ago colleague Bernie Kardon, but it is a rather conspicuous gaffe.
all the best,
mrh
Edits: 03/27/22
My bad. You are correct
What does HK refer to in this context?
Hope you get some answers to your questions.
I have none to contribute.
"Once this was all Black Plasma and Imagination" -Michael McClure
Harmon Kardon. I probably should not have assumed everyone would know. Sorry
I did get to do an amazing blind comparison in Hong Kong between a Rockport Sirius III TT and a Forsell Air Reference, both fitted with Van den Hul Grasshopper cartridges
A most enlightening experience
what happened?
Much to my surprise the Forsell sounded better. For me that was the first objective evidence of the value of euphonic colorations. The Rockport was designed to be as uncolored as physically possible. The Forsell was largely designed by trial and error, based purely on subjective evaluation with no consideration of its objective measures of audible distortion. The more colored TT was also the subjectively better sounding one. I bought a Forsell based on that audition and changed my philosophy regarding technical accuracy and its function as an indicator of subjective sound quality
you don't say.... I've felt the same for years.
A lot of people clamp the record to the platter. I say, bunk, let it ride. Unless it has a large warp in which case replace the record. It's interesting that with analogue you get to play with colorations and maybe even find bliss, but not with digital unless you consider being able to change cables or such. Resistors and caps nowadays are too small.
Someday, someone will have a device with one knob that takes you from one extreme to the other.
But now you can play in digital with DSP plugins. I have adjusted my philosophy to the idea that accuracy is a good starting point and added colorations take us the rest of the way
"accuracy is a good starting point and added colorations take us the rest of the way", each to their own personal destination.
Tre'
Have Fun and Enjoy the Music
"Still Working the Problem"
yes, and colorations to taste and system/room. Probably why so many are never satisfied and constantly "upgrading"
No sweat, figured it was either that or Hong Kong, which didn't fit at all.
"Once this was all Black Plasma and Imagination" -Michael McClure
enn tee
all the best,
mrh
Yup. I should have spelled it out AND spelled it correctly