Audio Asylum Thread Printer Get a view of an entire thread on one page |
In Reply to: A lot of people don't understand dither posted by Christine Tham on March 1, 2007 at 14:03:41:
What Todd correctly points out is that if the original signal contains wideband noise, with a suitable distribution, and at a level at or above the target word length's equivalent noise floor, then the signal is self-dithering and truncation to the target length can be done without additional dither.
In real-world music productions the total amount of noise originating as thermal noise prior to the ADCs, in the final digital-domain high-resolution master, more often than not exceeds the 20-bit -120dB noise floor, so truncations to word lengths of 20 bits or more don't need dither.
Your 24bit/16bit picture analogy is not quite valid as truncation to 16 bit (i.e. 5/6/5 bits for R/G/B respectively) makes the target noise floor exceed that part of the noise floor in the original image that could have worked as self-dither. In fact, as image sensors offer 60-70dB (10-12bit) of original SNR in the dark and are shot-noise-limited in the light, one can safely say that in almost all applications the original non-correlated image noise lies way below the quantisation noise floor of the 8/8/8 '24' bit format and thus will never serve as self-dither.
If the original raw image does contain visible noise (and trust me, it will ;-), then keep in mind that this high-level noise originates as shot noise or as image processing truncation noise and thus is correlated and non-dithering.
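A minimal sketch of the self-dithering claim above, in Python with only the standard library. It requantises a very low-level sine to a 16-bit grid, once clean and once with Gaussian noise already present in the signal, and reports the third harmonic that the quantiser produces. Everything specific here is an assumption chosen for illustration: the 1.5 LSB sine, the 1 LSB rms noise, the block length, the seed, and the use of rounding rather than strict truncation. None of it is measured from a real recording.

```python
import math
import random

def quantise(x, bits):
    """Round x (full scale = +/-1.0) onto the grid of a `bits`-bit word."""
    step = 2.0 / (2 ** bits)
    return round(x / step) * step

def bin_amplitude(samples, cycles):
    """Amplitude of the component that completes `cycles` periods in the block."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * cycles * i / n) for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * cycles * i / n) for i, s in enumerate(samples))
    return 2 * math.hypot(re, im) / n

def third_harmonic_re_lsb_db(noise_lsb_rms, bits=16, n=8192, cycles=37, seed=1):
    """Third-harmonic level (dB re 1 LSB) after requantising a 1.5 LSB sine
    that already carries Gaussian noise of `noise_lsb_rms` LSB rms."""
    rng = random.Random(seed)
    step = 2.0 / (2 ** bits)
    out = []
    for i in range(n):
        x = 1.5 * step * math.sin(2 * math.pi * cycles * i / n)
        x += rng.gauss(0.0, noise_lsb_rms * step)  # noise already in the source
        out.append(quantise(x, bits))
    h3 = bin_amplitude(out, 3 * cycles) / step
    return 20 * math.log10(max(h3, 1e-12))

print("no noise in source :", round(third_harmonic_re_lsb_db(0.0), 1), "dB re LSB")
print("1 LSB rms in source:", round(third_harmonic_re_lsb_db(1.0), 1), "dB re LSB")
```

With no noise the quantiser turns the sine into a staircase and a clear third harmonic appears. With noise of about one LSB rms the same bin should hold only residual noise. Whether the noise in a given recording is suitable (level, distribution, bandwidth) is exactly what the thread disputes, and this sketch does not settle that.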
Follow Ups:
*** What Todd correctly points out is that if the original signal contains wideband noise, with a suitable distribution, and at a level at or above the target word length's equivalent noise floor, then the signal is self-dithering and truncation to the target length can be done without additional dither. ***
Typical ambient noise levels captured by the mic is not sufficiently random and also too high to produce self dithering (you can analyze the spectrum to see what I mean). However, the self dithering comes from the decimation process in a sigma delta ADC. In that respect, what Todd originally said is strictly speaking not correct.
As for truncating to the target length, if you do that without dither it will not be as good as using dither, as per my analogy with reducing the color depth of a picture.
Again, ambient noise levels are not necessarily random in either amplitude or spectral content - hence the phenomenon of our ears apparently being able to hear "below the noise floor".
*** Your 24bit/16bit picture analogy is not quite valid as truncation to 16 bit (i.e. 5/6/5 bits for R/G/B respectively) makes the target noise floor exceed that part of the noise floor in the original image that could have worked as self-dither. ***
Try it yourself. Select an image (you can use a noisy image if you like). Using software of your choice, reduce the color depth from 24-bits to 16-bits with and without dither and then compare the results. Your eyes will notice banding if dither is not used.
Hint for generating an image with a lot of color noise - try shooting with a high ISO (which amplifies the CCD signal) in very dark conditions using long exposure. You will find the noise levels to easily exceed 16-bit (5/6/5) resolution. And yet, dithering from 24 to 16 reduces banding. This is independent of the noise level of the original image, as I originally pointed out.
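The "try it yourself" experiment can also be sketched on a synthetic image row instead of a photograph. The Python below reduces a smooth 8-bit ramp to 5 bits (one channel of a 5/6/5 format) with and without random dither, and uses the longest run of identical output values as a crude stand-in for visible banding. The ramp, its 16..239 range (chosen so nothing clips), the one-step-wide uniform dither and the seed are all assumptions for illustration. A clean ramp says nothing about how a noisy source image behaves.

```python
import random

def reduce_depth(values, src_bits=8, dst_bits=5, dither=False, seed=0):
    """Requantise integer samples from src_bits to dst_bits by dropping low bits.
    With dither=True, uniform random noise one destination step wide is added first."""
    rng = random.Random(seed)
    shift = src_bits - dst_bits
    step = 1 << shift
    top = (1 << dst_bits) - 1
    out = []
    for v in values:
        if dither:
            v = v + rng.randrange(step)  # 0 .. step-1
        out.append(min(v >> shift, top))
    return out

def longest_run(values):
    """Length of the longest stretch of identical neighbouring values."""
    best = run = 1
    for a, b in zip(values, values[1:]):
        run = run + 1 if a == b else 1
        best = max(best, run)
    return best

# One image row: a smooth ramp, 1024 pixels wide, 8-bit values 16..239.
ramp = [16 + x * 223 // 1023 for x in range(1024)]

plain = reduce_depth(ramp)
dithered = reduce_depth(ramp, dither=True)

print("widest band without dither:", longest_run(plain), "pixels")
print("widest band with dither   :", longest_run(dithered), "pixels")
```

Without dither each 5-bit level forms a flat band some 36 pixels wide. With dither the band edges break up into a mix of neighbouring levels whose local average follows the ramp.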
The analogy back to audio is that dithering will reduce the "banding" in the quantization noise caused by truncation, regardless of the noise floor of the original signal.
"Typical ambient noise levels captured by the mic is not sufficiently random"
I didn't say 'ambient', I said 'wideband noise, with a suitable distribution'. How about the thermal noise generated in all (equivalent) circuit resistances and capacitances right from your nice tubed large-diaphragm Neumann through the preamps, mixer, ADC front-end and ADC comparators. And this for 24 or more tracks summed. It is wideband, it is near-Gaussian, and it is present in almost all commercial recordings. And it dithers very nicely, thank you. (Try it!)
Have a look at the self-noise of your own minimalist recording setup. Ask yourself why it is not at -144dBFS, even with the microphone replaced with a short and all 60Hz harmonics discarded.
"However, the self dithering comes from the decimation process in a sigma delta ADC. "
My assertions are entirely independent of the architecture of the quantiser.
"Again, ambient noise levels are not necessarily random"
Again, I was not referring to ambient noise.
As for hearing below the noise floor, this is a sad misnomer. Yes, our ear can hear below the *noise level* (insofar as no maskers are present), but no, we cannot hear below the *noise density floor*.
Try it: take a white noise density floor and then track a 3kHz fade-out into it while keeping an eye on the signal spectrum AND on the summed level (integral of noise density).
You can do this at any level. You'll find you can resolve the sine down to 30dB or so below level, but you'll also find you'll lose it right when hitting the density floor.
(However, it would be interesting to try this with stereo noise and the sine panned dead-center. I imagine one can get a couple dBs lower then.)
"reduce the color depth from 24-bits to 16-bits with and without dither and then compare the results. Your eyes will notice banding if dither is not used."
Well, that's what I said. Your analogy is wrong because in the case of truncating an image from 8/8/8 to 5/6/5 the target quantisation noise floor is so high that there is no dither-able noise present in the image at that level. Most of the noise visibly present in the source image is correlated and/or of insufficient bandwidth (shot-noise, algorithm quantisation and errors, dark current, and the shot-noise of the dark current ... none of them remotely gaussian). That's why you need to add external dither.
But not necessarily so in audio where we know that the presence of thermal noise at about -120dBFS (or higher) *per track* is a given. Whether a finished recording contains enough such noise to be effective as dither for 16 bit output depends on the structure of that recording. But many real, commercial, non-minimalist recordings seem to qualify for this.
Not that this matters a lot.
*** How about the thermal noise generated in all (equivalent) circuit resistances ***
I've already covered this in my other post.
"A good mic, mic pre amp and analog console have dynamic range of around 130dB. A good ADC will have dynamic range slightly exceeding 20 bits.
So, the noise floor of electronics alone are too low to produce self dithering when truncating from 24 to 20."
*** Have a look at the self-noise of your own minimalist recording setup. ***
Yes I have. I have two mixers, 4 sets of 24-bit ADCs, a few amps, mics, lots of keyboards (I did my thesis in computer music) so it's hardly "minimalist". I have also used a few studios.
*** Your analogy is wrong because in the case of truncating an image from 8/8/8 to 5/6/5 the target quantisation noise floor is so high that there is no dither-able noise present in the image at that level. ***
The analogy is apt because I am asserting (based on my experience) that there is usually no "dither-able noise" present above 20 bits in audio, therefore there is a benefit to dithering if truncating to 20 bits.
*** Most of the noise visibly present in the source image is correlated and/or of insufficient bandwidth (shot-noise, algorithm quantisation and errors, dark current, and the shot-noise of the dark current ... none of them remotely gaussian). That's why you need to add external dither. ***
Your argument applies equally to audio. That's why it's a good analogy. As I've mentioned in my other post, you can do the measurements and check it out yourself. The "thermal noise" that you speak of is typically not at -120dB, it's much lower than that (at least on my equipment anyway). What's at -120dB are harmonic spikes, which are insufficient to cause dithering and will *benefit* from being smoothed out through dithering. :-)
Put it this way, if the statement was whether there is any benefit in dithering when truncating to 22 bits (instead of 20) - I would say "probably not". But those 2 bits make a big difference, because that's where the "wideband" noise you speak of resides.
"A good mic, mic pre amp and analog console have dynamic range of around 130dB. A good ADC will have dynamic range slightly exceeding 20 bits.
So, the noise floor of electronics alone are too low to produce self dithering when truncating from 24 to 20."
Please read back and consider the case that I outlined, where 24 or more tracks have to be mixed to a final stereo master. Or do you deem this not representative of the present state of the music industry? Even when all individual tracks are at around 20b dynamic range, their sum will have a higher noise floor unless massive-scale noise gating is used during mixdown (as in the old days).
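The summing argument above is power addition of uncorrelated noise. A small Python sketch, assuming every track carries the same noise level, the noise sources are uncorrelated, and each track is mixed at unity gain (the -120 dBFS figure is the per-track value quoted earlier in the thread, and the gain assumption is one the next reply contests):

```python
import math

def summed_noise_db(per_track_db, tracks):
    """Level of the sum of `tracks` equal, uncorrelated noise sources (power addition)."""
    return per_track_db + 10 * math.log10(tracks)

def equivalent_bits(level_db):
    """Rough word length for a noise level, using about 6.02 dB per bit
    and ignoring the 1.76 dB term of the ideal-quantiser formula."""
    return -level_db / 6.02

for n in (1, 2, 8, 24, 48):
    level = summed_noise_db(-120.0, n)
    print(f"{n:3d} tracks: {level:7.1f} dBFS  (about {equivalent_bits(level):.1f} bits)")
```

Under those assumptions 24 tracks raise the floor by roughly 13.8 dB. Any per-track attenuation applied in the mix lowers it again by the same number of dB.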
There are a handful of ADC chips that resolve near to 120dB of dynamic range (a toddler's hand at that). Far from all convertors in present-day studio gear use these chips. Many use ADCs limited at 17-18b performance.
You seem to be focusing on the best case scenario. It may exist (would you care to tell me which ADC you use, or provide me with a spectrum plot of your system's noise floor with shorted input?), but it is not quite typical for the general state of affairs, unless I'm grossly mistaken.
I'm sorry, but you are trying to change the subject.
To quote the original statement made by Todd:
"This is also why applying dither noise isn't necessary when truncating data of 24 bits or more down to 20 bits or more. The ambient noise alone would be adequate for such application."
As I've shown, this statement is NOT true (at least not generally). Ambient noise alone is not adequate (and I think we both agree on this).
You seem to be arbitrarily trying to limit the scope of the discussion to specific situations such as poor quality DACs or noisy preamps or mixing a LOT of tracks. Whilst we can argue on and on about specific cases (and whether they represent a "norm" or not), the original statement without qualifications is not factual. Remember, it only takes one counter example to disprove a generalisation.
Even if we wanted to discuss your specific cases, your examples are not valid.
Let's consider your example of mixing 24 or more tracks. You said:
*** Even when all individual tracks are at around 20b dynamic range, their sum will have a higher noise floor unless massive-scale noise gating is used during mixdown (as in the old days). ***
I suspect you don't have a lot of experience mixing. If you mix a lot of tracks, the final mix will exceed 0dBFS unless you attenuate the levels of individual tracks. Therefore the noise floor does not necessarily increase. For example, let's say I have 2 tracks (both with peaks just under 0dBFS). The overall mix will have peaks potentially at +3dBFS unless I attenuate each track by -3dB. Doing so reduces the noise floor of each by -3dB, so the noise floor of the mix is actually at the same level as for the individual tracks.
In fact what typically happens in a mixer is by the time you attenuate the levels of individual tracks, noise levels below 20 bit of the individual tracks are in danger of being "truncated" out of the mix. For example, if we assume noise is at the 20 bit level, attenuating the track level by say -20dB (not atypical for a multi-track mix) causes the noise level to be now at 24 bit level where it is in danger of being truncated out of the final mix altogether (assuming that the mixer operates at internal resolution of 32-bit floating point).
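The two numeric examples in this reply can be checked with a few lines of Python. This is only the arithmetic, under the post's own assumptions: equal, uncorrelated noise on every track, the same gain applied to each track, and about 6.02 dB per bit. The function names are made up for the sketch.

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB per bit

def mix_noise_db(track_noise_db, tracks, gain_db):
    """Noise of `tracks` equal, uncorrelated tracks after each is scaled by gain_db."""
    return track_noise_db + gain_db + 10 * math.log10(tracks)

def bits_for_db(db):
    """Number of bits that a level change of `db` decibels corresponds to."""
    return db / DB_PER_BIT

# Two tracks, each pulled down 3 dB: the mix noise lands where one track started.
print(round(mix_noise_db(-120.0, 2, -3.0), 2))

# A 20 dB fader move pushes noise at the 20-bit level down by this many bits.
print(round(20 + bits_for_db(20.0), 1))
```

The second figure comes out a little above 23 bits, which is in line with the "24 bit level" mentioned above as a round number.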
Dithering is advisable to reduce the quantisation noise generated by this truncation, even if the final mix is at 24-bit depth. So mixing actually makes it even more necessary to apply dithering.
*** Many use ADCs limited at 17-18b performance. ***
No. You are confusing the measured dynamic range of an ADC with the actual level at which the noise is sufficiently "dense" or "wideband" (your terminology) to act as an effective ditherer. The laws of physics do not change depending on the quality of the ADC. An ADC with a dynamic range of say 102dB (corresponding to your "17 bit" of performance) will have a "thermal noise" envelope far lower than -102dB. Measure it yourself. What you see around -102dB are harmonic spikes that are not sufficient to cause dithering.
"Measure it yourself"
I did. Do you think I'm stupid? Do you think I don't know the nature of noise?
Best I ever got was a PCM1804, which after having discarded
any structural noise left me at -108dB rms. Going to evaluate
the new PCM4202 in the coming weeks, I hope.
It was fun for a while, but now I'm signing off, ma'am-your-honour.
*** Best I ever got was a PCM1804, which after having discarded
any structural noise left me at -108dB rms. ***
Care to show us a graph of your results?
I checked the data sheet for the PCM1804, and the noise curves on page 11 and 12 (Figures 8-15) clearly show that in PCM mode for single rate and dual rate, the noise levels are close to -140dB apart from a few spikes (which are not ditherable, but these spikes lower the dynamic range to 112dB). In DSD mode, the noise levels are just under -120dB. Even in quad rate mode, the noise levels are around -140dB below 48kHz.
For the PCM4202, the corresponding figures are on page 9-10 of its data sheet, and show underlying noise to be very similar, just under -140dB. The main difference between the two chips is the lack of harmonic spikes on the PCM4202 (as it's a higher quality ADC), which results in better SNR and THD measurements.
This is consistent with my own measurements of similar ADCs. The laws of physics don't change for different ADCs.
If your results are wildly different, it means either that what you are measuring is broken or that there's something screwy with your measurement method.
"I checked the data sheet for the PCM1804, and the noise curves ... clearly show that ... the noise levels are close to -140dB apart from a few spikes (which ... lower the dynamic range to 112dB)"
No. The curves show the noise DENSITY, which is, by definition, the specific noise power measured over a unit bandwidth of 1 Hz.
In order to obtain the noise LEVEL (which is what you need for SNR and dynamic range calculations, and which is also what figures in the infamous SNR = 6 x NumberOfBits shortcut), you have to integrate the DENSITY over BANDWIDTH.
For a white noise spectrum with given noise density of Vn (dB) the resulting (RMS) noise level is then Vrms = Vn + 10*log(bandwidth), all in dB.
An ideal 16 bit system with 96 dB SNR over a bandwidth of 22kHz (CD) then has its density at -96-10*log(22000) = -139dBFS.
My PCM1804 with measured 108dB SNR over a bandwidth of 48kHz (running the recorder at 96kHz), has its density at -108-10*log(48000) = -155dBFS.
So if you measure -130dBFS for your system, then the equivalent RMS level (assuming 22kHz BW) is -87dBFS and that's 14.5 bits ;-)
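The conversions above follow directly from Vrms = Vn + 10*log(bandwidth). A short Python sketch that reproduces the three figures, assuming a flat (white) noise spectrum over the stated bandwidth and about 6.02 dB per bit for the last step:

```python
import math

def level_from_density(density_db, bandwidth_hz):
    """RMS noise level (dB) of white noise with the given 1 Hz density over a bandwidth."""
    return density_db + 10 * math.log10(bandwidth_hz)

def density_from_level(level_db, bandwidth_hz):
    """1 Hz noise density (dB) of white noise with the given RMS level over a bandwidth."""
    return level_db - 10 * math.log10(bandwidth_hz)

# Ideal 16-bit system, 96 dB SNR over 22 kHz.
print(round(density_from_level(-96.0, 22_000), 1))

# 108 dB SNR measured over 48 kHz.
print(round(density_from_level(-108.0, 48_000), 1))

# A -130 dBFS density integrated over 22 kHz, and the rough bit equivalent.
level = level_from_density(-130.0, 22_000)
print(round(level, 1), round(-level / 6.02, 1))
```

The results agree with the -139, -155 and -87 dBFS figures in the post to within rounding.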
But don't worry, chances are that if you were only looking at FFT-generated plots of the noise density floor your 'measurement' was wrong anyhow, as the plotted level depends on the size of the FFT itself. Try it. Take a white noise signal into Audition and take its spectrum with 1024, 4096, 32000 point FFT ...
The plots in the PCM1804 datasheet are made with 8192 point FFTs. The resulting levels don't show the true per-1-Hz noise density, but rather the density per frequency-bucket of Fsample/8192 Hz. So the floors in the plots are artificially high. That is OK, because to a skilled engineer the 'N=8192' tag in the plot says all.
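The dependence on FFT size can be put into numbers. For white noise, each FFT bin collects the noise of a band Fsample/N wide, so the plotted floor sits 10*log(Fsample/N) dB above the true 1 Hz density. The Python sketch below assumes a 48 kHz sample rate (the post does not say which rate the datasheet plots use) and ignores the extra equivalent noise bandwidth that a window function adds.

```python
import math

def bin_floor_offset_db(sample_rate_hz, fft_size):
    """dB by which a white-noise floor plotted per FFT bin exceeds the 1 Hz density."""
    return 10 * math.log10(sample_rate_hz / fft_size)

for n in (1024, 8192, 32768):
    offset = bin_floor_offset_db(48_000, n)
    print(f"N={n:6d}: bin width {48_000 / n:7.2f} Hz, floor reads {offset:5.2f} dB above density")
```

Each doubling of the FFT size lowers the plotted floor by about 3 dB, while the integrated noise level stays where it was.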
Included link may help.
- http://lad.linuxaudio.org/events/2004_zkm/slides/saturday/fons_adriaensen-jaaa.pdf (Open in New Window)
... sit down and think about what you've said.
What does this say about "noise levels" required for "self-dithering"?
Think about it *really* hard.
PS - I never did say my *system* measured -130dBFS. I said analog circuits can exceed 130dB dynamic range. Not quite the same thing. As for my "measurements" being "wrong", I don't recall giving you any noise level measurements at all. But I'm glad you got really excited.
A good mic, mic pre amp and analog console have dynamic range of around 130dB. A good ADC will have dynamic range slightly exceeding 20 bits.
So, the noise floor of electronics alone are too low to produce self dithering when truncating from 24 to 20.
Let's consider the case of "self dithering" caused by ambient background noise during recording. As far as I know, this is a "myth" and I'm not sure many engineers still believe in this. Let me explain why.
Typical peak levels in a live recording: 120dB SPL. Typical ambient noise in a live recording (assuming a concert hall) - around 40-50dB SPL. I have measured far worse, often due to air conditioning.
Typical peak levels in a studio: 100-110dB SPL. Typical ambient noise in a quiet studio: 20-30dB SPL. Again, I have measured studios that are worse (well, mine for instance :-))
These are all numbers based on my experience. Feel free to validate them yourself.
Assuming that we set the levels such that peak is close to 0dBFS, then typical noise level in a recording: around -70-80dB.
You can actually validate this yourself by doing an analysis of commercial recordings. The majority of them will exhibit ambient noise levels at these levels.
These levels are clearly too high to produce a self-dithering effect when truncating from 24 to 20. Particularly when you analyze the spectrum of the ambient noise and realize it is not random.
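The level estimates above are simple differences between ambient and peak SPL once the peak is lined up with 0 dBFS. A Python sketch of that arithmetic, with the ideal quantisation floor of a few word lengths for comparison. The SPL figures are the ones quoted in this post. The 6.02*bits + 1.76 dB formula is the standard full-scale-sine SNR of an ideal quantiser and is added here only as a reference point.

```python
def ambient_dbfs(peak_spl, ambient_spl):
    """Ambient noise in dBFS when the loudest peak is placed at 0 dBFS."""
    return ambient_spl - peak_spl

def ideal_floor_dbfs(bits):
    """Quantisation noise of an ideal `bits`-bit quantiser relative to a full-scale sine."""
    return -(6.02 * bits + 1.76)

print("live, 120 dB peak  :", ambient_dbfs(120, 40), "to", ambient_dbfs(120, 50), "dBFS")
print("studio, 100 dB peak:", ambient_dbfs(100, 20), "to", ambient_dbfs(100, 30), "dBFS")
print("studio, 110 dB peak:", ambient_dbfs(110, 20), "to", ambient_dbfs(110, 30), "dBFS")

for bits in (16, 20, 24):
    print(f"{bits}-bit ideal floor: {ideal_floor_dbfs(bits):.1f} dBFS")
```

These are wideband levels. They say nothing about how the ambient noise is distributed in frequency or how random it is, which is the other half of the argument here.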
And in any case you can prove it to yourself. Take a recording with ambient noise at say -70dB. You will find you can easily hear a reverb tail even when its amplitude is below -70dB.
Even in the best case scenario, noise levels may be at -90-100dB. This may potentially produce a "self dithering" effect for a 16-bit recording, but it won't for a 24-bit recording (with between 20-22 bits effective resolution).
You don't have to believe anything I say, but the numbers kind of speak for themselves, and they are numbers you can easily validate yourself.
"Even in the best case scenario, noise levels may be at -90-100dB. This may potentially produce a "self dithering" effect for a 16-bit recording, but it won't for a 24-bit recording (with between 20-22 bits effective resolution)."
I think you meant to swap the numbers: This may potentially produce a "self dithering" effect for a 24-bit recording, but it won't for a 16-bit recording.
Note that nobody is denying that applying dither noise *is* required if a high-resolution audio signal is truncated to 16 bits. But I don't think adding noise is necessary if the same high-resolution signal is truncated to 20 bits.
And comparing dither needs of audio signals with the dither needs of video signals is comparing apples with oranges. Note that video color is not something that is hurt by dynamic range, since the color range is finite. Low rez video color deals with wider spacing between adjacent colors. Which is a lot more benign than such application applied to audio signals.
*** I think you meant to swap the numbers ***
No, read it again. The statement is as intended.
*** But I don't think adding noise is necessary if the same high-resolution signal is truncated to 20 bits. ***
It doesn't matter what you think, in reality dithering at 20 bits will make a difference, just like dithering at 16 bits. It may be harder for our ears to hear the difference though.
It's probably a moot point though because I can't think of a valid reason why someone would want to truncate a 24-bit recording to 20-bits. Do you?
*** Note that video color is not something that is hurt by dynamic range, since the color range is finite. ***
I'm not sure what you mean here. Try comparing an image with 8 bit colour vs 16 bit colour vs 24 bit colour and tell me whether it's "hurt" by dynamic range or not.
*** Which is a lot more benign than such application applied to audio signals. ***
Assuming that we set the levels such that peak is close to 0dBFS, then typical noise level in a recording: around -70-80dB.
You can actually validate this yourself by doing an analysis of commercial recordings. The majority of them will exhibit ambient noise levels at these levels.
These levels are clearly too high to produce a self-dithering effect when truncating from 24 to 20.
The noise level does not have to be near the quantization noise floor. It only has to be _above_ the quantization noise floor. Dither levels are deliberately set near the quantization noise floor so the dither doesn't reduce S/N too much.
Particularly when you analyze the spectrum of the ambient noise and realize it is not random.
It doesn't have to be random. It merely has to be uncorrelated with the musical signal and above the quantization noise floor in level from 0 to fs/2.
*** The noise level does not have to be near the quantization noise floor. It only has to be _above_ the quantization noise floor. ***
The point is that there are NO effective dithering sources above 20 bits.
We already know that there is a benefit from dithering when truncating from 24 to 16 bits. Not only is it possible to hear the difference between dithering and non dithering, but it's possible to hear the differences between dithering algorithms (see for example the results of the Great Dither Shootout).
The reason I made a point about background noise typically being at the -70-80dB level is that this is ABOVE the quantisation noise floor of a 16 bit recording. Therefore, if it does not eliminate the benefit of dithering at 16 bits, it also does not eliminate the benefit of dithering at 20 bits.
Otherwise there would be no benefit recording at a resolution higher than about 14 bits.
*** It doesn't have to be random. ***
Note: just because I pointed out that ambient noise is not random doesn't mean that I am asserting that an effective dithering source has to be random. In any case it is easy to demonstrate that we can hear well below the so called "noise floor" of a room.