Audio Asylum Thread Printer
In Reply to: RE: statistics question posted by mike1127 on June 25, 2009 at 11:17:03
> My understanding is that we have a "null hypothesis" which is that the
> cables can't be told apart. If I do well enough then we can reject the
> null hypothesis with a certain level of significance.
This is not correct. The procedure you describe is to test the hypothesis that you can tell the difference between your interconnects to a given level of confidence. Failing this test does not mean you cannot sometimes tell the interconnects apart. It means that you cannot reliably tell the interconnects apart.
In order to test that you cannot tell the interconnects apart at all, you can use the same results but have to perform a different analysis to test a different hypothesis. The hypothesis you would want to test is: are my results simply a set of random guesses, to a high level of confidence? This is a standard test, but not one I have seen audiophiles perform, which is perhaps not surprising. It would be used to test if, for example, a die was fair or if a statistical measurement technique was unbiased. As a test it requires more samples for a given level of confidence because one needs to test the whole probability distribution rather than simply the mean. For example, if you took 100 measurements and perceived a difference in the first 50 and perceived no difference in the last 50, then although the mean is the same as that of a random sequence, the distribution is clearly not random.
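The 50/50 example above can be checked numerically. The post does not name a specific test, so the sketch below is an assumption: it uses the Wald-Wolfowitz runs test (normal approximation), one common way to ask whether the order of two-valued outcomes looks random. The function name `runs_test_z` and the 1/0 coding are illustrative choices only.

```python
from math import sqrt


def runs_test_z(seq):
    """Z-score of the Wald-Wolfowitz runs test for a two-valued sequence.

    Assumption: outcomes are coded as truthy/falsy (e.g. 1 = perceived a
    difference, 0 = perceived none). A large |z| suggests the order of the
    outcomes is not random.
    """
    n1 = sum(1 for x in seq if x)   # count of "difference perceived"
    n2 = len(seq) - n1              # count of "no difference perceived"
    if n1 == 0 or n2 == 0:
        raise ValueError("need both outcomes present")
    # A run is a maximal block of identical outcomes.
    runs = 1 + sum(1 for a, b in zip(seq, seq[1:]) if bool(a) != bool(b))
    n = n1 + n2
    expected = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    return (runs - expected) / sqrt(variance)


# The example from the post: a difference in the first 50, none in the last 50.
blocked = [1] * 50 + [0] * 50
print(round(runs_test_z(blocked), 2))
```

The blocked sequence has only 2 runs where about 51 would be expected from a random ordering, so the z-score is far out in the tail even though the mean is exactly 0.5.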
> My question now is: let's say I'm able to get a correct identification
> 80% of the time. How many trials would be needed for a 5% level of
> significance? For a 1% level of significance? For a 0.1% level?
If you can only tell the cables apart 80% of the time then it does not matter how many samples you take, since you are not drawing from a population that can tell the cables apart. You would need to change the hypothesis being tested to something else.
"""This is not correct. The procedure you describe is to test the hypothesis that you can tell the difference between your interconnects to a given level of confidence. Failing this test does not mean you cannot sometimes tell the interconnects apart. It means that you cannot reliably tell the interconnects apart."""
Well, that's not what my little primer on statistics says. There is a null hypothesis (H_0), alternative hypothesis, (H_1). If the sound of the cable has no influence on my perception, then the probability of guessing correctly is p=0.5. In other words, completely random. That's the null hypothesis. If we run 18 trials and I get 15 of them correct, then P=0.004. That means there was a 0.4% chance that p actually equals 0.5. There is a 99.6% chance that p is *something other than 0.5*. But the test cannot tell you what p is. Just that it's not 0.5.
Guessing right 80% of the time, over a large number of trials, is in fact strong evidence that my perception is influenced by the sound of the cables. It may also be influenced by what I had for lunch, or what random thoughts are going through my brain. But it is very likely influenced by the difference in sound---hence evidence that I can tell them apart (i.e. that their sound influences my perception).
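To put rough numbers on the "80% over a number of trials" question, here is a minimal sketch. It assumes a one-sided exact binomial test against p=0.5, treats 80% as the observed score rather than a true underlying rate, and only looks at trial counts that are multiples of 5 so that 80% is a whole number of correct answers. The helper names `tail_prob` and `smallest_n` are made up for this example.

```python
from math import comb


def tail_prob(n, k):
    """P(X >= k) for X ~ Binomial(n, 0.5): chance of k or more correct by pure guessing."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n


def smallest_n(alpha, max_n=1000):
    """Smallest n (a multiple of 5) at which exactly 80% correct has tail_prob <= alpha."""
    for n in range(5, max_n + 1, 5):
        k = 4 * n // 5  # exactly 80% of n, since n is a multiple of 5
        if tail_prob(n, k) <= alpha:
            return n
    return None  # not reached within max_n trials


for alpha in (0.05, 0.01, 0.001):
    print(alpha, smallest_n(alpha))
```

Under these assumptions the first qualifying trial counts come out as 15, 20 and 30 for the 5%, 1% and 0.1% levels. Planning a test so that a listener whose true hit rate is 80% would pass reliably is a different (power) calculation and would call for more trials.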
"Well, that's not what my little primer on statistics says. There is a null hypothesis (H_0), alternative hypothesis, (H_1). If the sound of the cable has no influence on my perception, then the probability of guessing correctly is p=0.5. In other words, completely random. That's the null hypothesis. If we run 18 trials and I get 15 of them correct, then P=0.004. That means there was a 0.4% chance that p actually equals 0.5. There is a 99.6% chance that p is *something other than 0.5*. But the test cannot tell you what p is. Just that it's not 0.5."
You mostly have it except for the meanings of the various probabilities.
p=0.5 ... that just means the probability of guessing correct on a single independent trial.
P=0.004 ... that just means the probability of getting 15 or more correct out of 18 trials due to chance alone.
And, as per my other (longer) post, the test does not involve rejecting small "p"; that's just an attribute of the mathematical model associated with the null hypothesis. What we are really doing is comparing our result with what the model "says" about the likelihood of such a value.
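As a worked check of the P=0.004 figure, assuming it is the one-sided binomial tail (15 or more correct out of 18 when each trial is a coin flip, p=0.5):

```latex
P(X \ge 15 \mid n = 18,\ p = 0.5)
  = \sum_{k=15}^{18} \binom{18}{k} \left(\tfrac{1}{2}\right)^{18}
  = \frac{816 + 153 + 18 + 1}{262144}
  = \frac{988}{262144}
  \approx 0.0038
```

This is a statement about how often pure guessing would produce a score this good, not the probability that p equals 0.5.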
Everything matters, don't forget to tweak your placebos!
Edits: 06/25/09