OT: bit-depth theory question



Dave Labrecque
12-21-2005, 01:31 PM
Exposing my technical naivete, but I've always wondered about this (I have no formal audio engineering training). If the audio doesn't dip to the bottom end (last few bits) in volume during the playback of a recording, how is it that 24 bits sounds better than 16?

If the answer is that the greater resolution extends the dynamic range AND increases the "grain density" of the "dynamic volume change increments" (sorry for the bad lingo), then how was it decided how many of the added bits go toward the former and to the latter?

Or is my thinking way off?

TotalSonic
12-21-2005, 03:06 PM
Dave -
Even during areas of continuous high average level the waveforms start from a zero point (unless there's DC offset). So the extra dynamic range into well below the noise floor that 24 bit gives you should still theoretically sound "better" than 16bit even during places where things have a high average level. How much this theoretical difference is audible to the listener has been the subject of much debate.

Best regards,
Steve Berson

Bob L
12-21-2005, 03:14 PM
Don't forget... it all becomes analog when it goes thru the D to A filters... smoothing out all the low level resolution differences... in the end... i think more damage to the overall sound quality is done by floating point math issues than by 16 to 24 bit resolution differences. :)

Then again... perhaps I'm the only guy who feels that way in the whole industry... who knows... I only know what I hear and I am certainly satisfied with the results SAWStudio gives me at 44.1k 16 bits with no dither... oh well. :D

Bob L

Cary B. Cornett
12-21-2005, 03:57 PM
If the audio doesn't dip to the bottom end (last few bits) in volume during the playback of a recording, how is it that 24 bits sounds better than 16?

The answer is in greater resolution of detail. A good analogy might be in the color resolution of computer graphics. The first color graphics adapters for computers had a very limited number of colors available on their "palette". This number started at 4 (!), then went to 16, 32, 256 colors. Most of us now have our monitors set for at least 65,536 colors (16-bit color). If you had a nice color photo in your computer, and you saved it in a form limited to 256 colors (I believe you can do this in Photoshop), you would see obvious "steps" in ranges of color change. The artificiality of the image would be unmistakable. The selection of color resolution is done by limiting how many bits can be used to represent each color in each pixel (a pixel would be the visual equivalent of a single sample in an audio data stream).


If the answer is that the greater resolution extends the dynamic range AND increases the "grain density" of the "dynamic volume change increments" (sorry for the bad lingo), then how was it decided how many of the added bits go toward the former and to the latter?

The answer is that all of the added bits do BOTH things. Think of it as a staircase, with the top representing 0 dBfs (clipping) and the bottom representing the noise floor. Each bit we add doubles the number of steps in the staircase, with the noise floor represented by the height of the first step. Let's say the total height of the staircase represents the practical dynamic range.

The lower limit of that range is determined by quantization noise, and the math for finding that limit is very simple. If you multiply the number of bits per sample by 6, that's how many dB below full scale the quantization noise (or the dither needed to mask it) kicks in. For 8 bits the dynamic range is 8 * 6, or 48 dB. For 16 bits, it gets a lot better: 16 * 6 for 96 dB.
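For anyone who wants the rule of thumb as arithmetic, here is a minimal Python sketch. The function and constant names are made up for illustration; the only facts used are the "6 dB per bit" rule described above and the standard definition of an amplitude ratio in dB.

```python
import math

def rule_of_thumb_range_db(bits: int) -> int:
    """Dynamic range estimate from the '6 dB per bit' rule of thumb."""
    return bits * 6

# Each added bit doubles the number of steps. A doubling of amplitude
# is 20 * log10(2) dB, which is where the "6" comes from.
EXACT_DB_PER_BIT = 20 * math.log10(2)

for bits in (8, 16, 24):
    print(f"{bits:2d} bits -> about {rule_of_thumb_range_db(bits)} dB")
print(f"exact figure per bit: {EXACT_DB_PER_BIT:.2f} dB")
```

By the same rule, 24 bits works out to roughly 144 dB, which is far more than any real analog front end delivers.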

Now, a lot of older analog master recordings have a dynamic range of less than 70 dB. Going back to our staircase, we would represent the noise floor of those recordings as a lot of dust and dirt layered on the floor at the bottom of the stairs, and it is thick enough to actually cover some of the steps. From this point, the staircase can't really get much taller above the dirt, but the steps CAN still get smaller. By the time we get to a 24 bit resolution, the stairs seem more like a smooth ramp, with a fair amount of it reaching below the "dirt".

Now, the added smoothness that we get above the noise floor still improves the overall sound, so even though we can't really get much more dynamic range the added resolution is not wasted.

Hope I managed to clarify this for you a bit... :rolleyes:

Pedro Itriago
12-21-2005, 05:46 PM
Now that is one fine description

Leadfoot
12-21-2005, 05:51 PM
i think the most important factor is the higher math calculations at 24 bit allow you greater accuracy(quality) when doing heavy editing of soundfiles across the entire multitrack.

tony

TotalSonic
12-21-2005, 06:06 PM
i think the most important factor is the higher math calculations at 24 bit allow you greater accuracy(quality) when doing heavy editing of soundfiles across the entire multitrack.

tony

Tony -
Two totally different types of bits!
the 24 bits vs. 16 bits is regarding the measure of dynamic range encoded into the recorded PCM file.

The bits you are referring to are the number of places available when doing internal processing math (which for SAW is 64bits fixed point, aka integer, math) after any changes (such as gain, fx, summing) are made. These larger temporary figures then spit out a resultant figure that gets passed to the next part of the signal chain - for most DAWs this is at a lower bit depth than the processing math (often in native DAWs as 32bit IEEE floating point). For SAW the resultant figures are at 32bit integer.

This resultant then must be played back through a DAC. The maximum a PCM DAC can output is at 24bits - but in real world terms even the best spec'd converters don't ever give more than say 22bits of reproduction, and even that is well below the noise floor of nearly every environment.

Just a note that through my own direct testing I disagree with Bob regarding the necessity of dither - I feel in order to achieve the best possible sound that use of dither when converting from 24bit to 16bit, while not a significant factor, will achieve much smoother sounding fades and tails. I'm going to post a blind dither shootout up here shortly so people will be able to make their own determinations.

Best regards,
Steve Berson

Leadfoot
12-21-2005, 10:12 PM
so are you saying that if i start with a 16 bit file and mangle the crap out of it with level changes and eq and whatever else i could do to it, it would be the same as if i started with a 24 bit file and edited it in 24 bit resolution all the way thru, the latter wouldn't have better results? it would have to be wouldn't it? at least i hear a difference. i remember when saw went to 24 bits, and i noticed a dramatic difference. sorry for the layman's terms, but i've been basing my whole digital recording theory on that for a long time.. if that's incorrect, i need to start smoking dope again, and just hit record like in the old days. anyways, thanks for trying to help. i understand that 24 bit has higher dynamic range, but i also thought that a more important factor was the higher math capabilities when doing heavy editing. 16 and 24 bit files (unedited) sound virtually the same, aside from slightly more headroom on the 24. but as soon as you start making changes to the files, that's where the higher math becomes important. or so i've thought. please correct me if i am wrong. how could the original bit depth of the file not matter?

tony




TotalSonic
12-21-2005, 10:52 PM
so are you saying that if i start with a 16 bit file and mangle the crap out of it with level changes and eq and whatever else i could do to it, it would be the same as if i started with a 24 bit file and edited it in 24 bit resolution all the way thru, the latter wouldn't have better results?

Sorry for the confusion - re-reading your post I think we were both misunderstanding each other's points.

All I was trying to point out was that the word "bits" can refer to 3 different resolutions in digital audio:
1) the amount of dynamic resolution being recorded to or played back from a static file (i.e. the difference between recording 16 or 24bit files)
2) the number of places the internal processing math of a DAW app routine has available before it needs to round or truncate the figure (in SAW's case 64bits)
3) the amount of dynamic resolution a temporary file that results from the processing has before it is sent to the next process in the signal path (i.e. the reason some workstations call themselves "32bit")

Anyway -
If you record a 16bit file it does not have the same dynamic resolution as a 24bit file. So theoretically recording a 24bit file should sound better than the 16bit file. To my ear there are indeed slight but definite improvements in the sound quality by recording to 24bit rather than 16bits.

And yes - once you've started processing and summing files, unless the code is broken, then the higher resolution the internal processing math is the better the accuracy of the resulting figure after the processing is done. So I completely agree with you in this regard.

With SAWs math though both 16bit files and 24bit files are processed at the same internal math depth - and unless the multitrack is set to output at 16bit - any process that is done to the 16bit file (such as adding reverb or eq to it, or fading it out in the middle, etc.) will output at 24bits dynamic resolution. An unprocessed 16bit file is just padded with 8 bits of zeros if you play it back as 24bits though.
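The "padded with 8 bits of zeros" part can be sketched in a few lines of Python. This only illustrates the arithmetic on signed integer sample values; the function name is hypothetical and this is not SAW's actual code.

```python
def pad_16_to_24(sample16: int) -> int:
    """Place a signed 16-bit sample in a 24-bit word.

    Shifting left by 8 keeps the level the same relative to full scale
    and leaves the 8 new low-order bits as zeros.
    """
    return sample16 << 8

for s in (1, -1, 32767, -32768):
    padded = pad_16_to_24(s)
    print(f"{s:6d} -> {padded:9d}   low 8 bits: {padded & 0xFF}")
```

The low 8 bits stay at zero until some process (gain, eq, reverb, a fade) actually computes new values for them.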

I just think it's helpful not to confuse the "bits" that describe a file's dynamic range with the "bits" that describe the available level of internal processing math. Sometimes I think conversations like this would go smoother if there were actually more than one word to describe various similar but different things.

Best regards,
Steve Berson

paul kostabi
12-21-2005, 11:25 PM
math issues than by 16 to 24 bit resolution differences. :)

Then again... perhaps I'm the only guy who feels that way in the whole industry... who knows... I only know what I hear and I am certainly satisfied with the results SAWStudio gives me at 44.1k 16 bits with no dither... oh well. :D

Bob L

I am happy with this setting too. Just turn it on and go.
Hey Bob do you know a SAW studio (16 inputs) available december 29-30? Have a live project with a metal band and would love to do it in SAW, I am in LA area but can travel to LV with the group.

olzzon
12-22-2005, 01:00 AM
Don't forget... it all becomes analog when it goes thru the D to A filters... smoothing out all the low level resolution differences... in the end... i think more damage to the overall sound quality is done by floating point math issues than by 16 to 24 bit resolution differences. :)

Then again... perhaps I'm the only guy who feels that way in the whole industry... who knows... I only know what I hear and I am certainly satisfied with the results SAWStudio gives me at 44.1k 16 bits with no dither... oh well. :D

Bob L

No you're not.
I always try to go for the first 90% before trying to fiddle with very tiny details that don't make any large change.

Bob L
12-22-2005, 01:40 AM
Give Audiomation Studios a call here in Vegas... it's Arty Congero's studio... completely SAWStudio based... its where I do all my tracking that requires a studio.

I then usually bring the files home with me to do the mix on my own time.

Arty is a great engineer and you should be able to get a great sound by doing the whole project there. He is most known for his live engineer work with credits listing many famous rock artists as well as major industry names such as Tower Of Power, Earth Wind and Fire, Barry Manilow, Paul Anka, Doc Severinsen... etc... the list is endless really. :)

702-656-1867

Bob L

Dave Labrecque
12-22-2005, 11:28 AM
Thanks, Cary. I guess what I still don't get is what determines the degree of "mathematical compression" of the bit range as it is applied to the dynamic range? Who "decided" that 16-bit resolution would be in x dB increments, and why did they decide on x? Why didn't they pick 2x? Or x/5? This "decision" determines both the total dynamic range and, inversely, the resultant real-world resolution of the audio, no?

I guess what you'll tell me is that no one decided it, that the mathematics determines it. But I just don't see why the dB resolution of y-bit audio wouldn't be an arbitrary, alterable figure.

I appreciate the education. :)

TotalSonic
12-22-2005, 11:39 AM
Dave -
A great article that further explains the issues behind bit depth and dithering very well is at
http://www.users.qwest.net/~volt42/cadenzarecording/DitherExplained.pdf

The decisions regarding how PCM was encoded were made by Sony/Philips during the mid to late 70's. Originally Philips was pushing for 14bits for CDs but luckily Sony won out for the 16bits we have now.

You can read an interesting history here -
http://www.opticaldisc-systems.com/2003sep-oct/Evolution42.htm
http://www.opticaldisc-systems.com/2003nov-dec/Evolution52.htm

Also - you might want to check out Ken Pohlmann's book "Principles of Digital Audio" if you want an in depth well written reference -
http://www.amazon.com/gp/product/0071441565/ref=pd_bxgy_text_b/104-9115722-7742320?%5Fencoding=UTF8

Best regards,
Steve Berson

Dave Labrecque
12-22-2005, 12:46 PM
Thanks for the link, Steve. Though I'd rather have it explained to me in 2 or 3 sentences (:p), I guess a little reading might be a good thing. :)

Cary B. Cornett
12-22-2005, 02:34 PM
Thanks, Cary. I guess what I still don't get is what determines the degree of "mathematical compression" of the bit range as it is applied to the dynamic range? Who "decided" that 16-bit resolution would be in x dB increments, and why did they decide on x? Why didn't they pick 2x? Or x/5? This "decision" determines both the total dynamic range and, inversely, the resultant real-world resolution of the audio, no?

I guess what you'll tell me is that no one decided it, that the mathematics determines it. But I just don't see why the dB resolution of y-bit audio wouldn't be an arbitrary, alterable figure.

I appreciate the education. :)

The dynamic range (difference between loudest and quietest possible sound) is measured in dB. We say that resolution is indicated by the number of bits. The more bits per sample, the finer the gradations of level that can be accurately represented. Perhaps part of the problem is that, while Level can be shown on a meter, Resolution cannot, other than by visually indicating how many bits are actually changing, which is useful in a very technical way (say, to a mastering engineer), but not very informative at the intuitive level.

Real-world dynamic range limits are usually set before we get to the A/D converters in recording. A lot of analog "front ends" have a higher noise floor than that of the converters they are connected to. Even with the best electronics, though, there is the noise floor of the recording space itself to deal with. I recently did a solo piano recording in a small theater, and when I got back to the studio I discovered that there was a surprising amount of environmental noise in the recording. It sounded OK, but the spectrum analyzer showed a fair amount of LF noise that undoubtedly included HVAC rumble and outside traffic.

So, for the most part the usable dynamic range is limited by the noise floor of the spaces where the performances are recorded (this includes a fair number of "professional" studios, BTW). This is not a matter that can be decided by some team of design engineers. However, even with the noise floor problems the "extra" bits still add a degree of fine detail that is worth having.

Beyond that, I'm not sure I can answer your questions better, as I am not sure I clearly understand what you mean by the terms you are using. I hope I have been able to clarify things a bit, though...

Sean McCoy
12-22-2005, 04:13 PM
I only know what I hear and I am certainly satisfied with the results SAWStudio gives me at 44.1k 16 bits with no dither... oh well. :D
Bob L
But the Behringers do their initial conversion at 24 bits, so aren't you dithering to 16 bits before going into Studio? Or are you just recording at 16 bits and letting the files get truncated? It's one thing to prefer 16 bits, but it's another to actually record at 24 bits and drop the last eight. At least it seems it should be!

Bob L
12-22-2005, 09:24 PM
By rights, I should use the 24 bit setting... but in the end, I don't find that much, if any, noticeable difference... and these live gigs can eat up gigabytes of space... there really seems no reason for me to take on all the extra load of the 24 bits and larger files and more cpu load when in the end, I can't really hear any noticeable difference... besides... the live environment is already a mess with the micing on stage and the noise and monitors and room noise and so forth...

Again... I understand all the conversations and have read so much text about it all and the fade out concerns and so forth... and in the end... I simply can not find justification for any of it...

My final CD mixes sound wonderful to my ears and to most anyone who listens... I have gotten rave reviews... you can crank up the volume and listen to my end of song reverb trails and not hear any problems... any more so and in fact less so, in many cases, than many commercial CDs done with the most expensive gear and the biggest names in the industry.

So... for me... if it ain't broke... don't fix it. :D

Bob L

studio-c
12-24-2005, 11:31 AM
No you're not.
I always try to go for the first 90% before trying to fiddle with very tiny details that don't make any large change.
Good idea.
When I was younger, I would agonize over details that would drag the session down. I can only guess it raised blood pressures during 20 piece Union string sessions at national broadcast residual rates. :) When I started working for a jingle company, the cigar chomping sales guy took me aside nicely and said, "They can't see that from the cheap seats." That was some of the best advice I ever got.

Real life is a balance. (Not for you, BobL. We want your stuff to be pristine so we can "abuse the crap out of it"[love that expression] and get away with it :D )

studio-c
12-24-2005, 11:47 AM
Thanks, Cary. I guess what I still don't get is what determines the degree of "mathematical compression" of the bit range as it is applied to the dynamic range? Who "decided" that 16-bit resolution would be in x dB increments, and why did they decide on x? :)

Are you referring to 4, 8, 16, 24? Excuse me if I'm misunderstanding, but if that's what you're asking, it's the amount of "possibilities" in a binary number.
It's 2 to the x power. If you go to the Windows calculator, hit 2*2*2* a bunch of times and watch the display, you'll see your favorite flavors go by:
2,4,8,16,32,64,128,256,512,1024,2048... familiar numbers if you've ever bought RAM back in the old days :)

1 bit= 0 or 1=2 choices (the lightbulb is on or it is off)
2 bit= 00 01 10 11 = off off / off on/ on off / on on = 4 choices = 2 squared (on/off combinations of two lightbulbs)
3 bit= 000 001 010 011 100 101 110 111 = 8 choices = 2 to the third power
4 bit= (all possible combinations) = 16 choices = 2 to the 4th power
8 bit= La La La = 256 = 2 to the 8th power
...
16 bit= 2 to the 16th power = 65,536 choices, or levels of audio "height" when you put them thru an A to D converter and they come out as Voltages (okay techy guys, don't bust on me, I'm keeping this simple).
24 bit= 2 to the 24th = 16 million something "heights" or stairsteps or whatever. (I hope I hit 2*2*2* the right number of times on the Windows Calculator :) )

It's really just based on the way computer chips work. Every time you add another bit ("digit" of 0,1 choice) to the "byte" or word length, you get finer slices and more options.
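If you'd rather not hit 2*2*2* on the calculator, the same counting can be done in a couple of lines of Python. The function name is just for illustration.

```python
def quantization_levels(bits: int) -> int:
    """Number of distinct values an n-bit binary word can hold."""
    return 2 ** bits

for bits in (1, 2, 3, 4, 8, 16, 24):
    print(f"{bits:2d} bit = {quantization_levels(bits):,} choices")
```

Each extra bit doubles the count, so going from 16 to 24 bits multiplies the number of "heights" by 256.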

If that wasn't your question, I'm gonna feel really stupid. But maybe this will help someone.

Carlos Mills
12-24-2005, 12:07 PM
Hi Dave,

This is a very interesting subject to me... since I have some free time now, I studied it a bit (Principles of Digital Audio, by Ken C. Pohlmann) and here is what I found out... I hope it helps you the way it helped me... (entire passages from the book were transcribed below).


If the audio doesn't dip to the bottom end (last few bits) in volume during the playback of a recording, how is it that 24 bits sounds better than 16?

From the theoretical side, a 24 bit soundfile has BOTH a wider dynamic range and a better resolution (smaller intervals in-between values).
- While a 24 bit sound file can achieve 146 dB of dynamic range, a 16 bit one can achieve a dynamic range of about 98 dB (the formula is 6.02n + 1.76 dB, where n is the number of bits).
- While a 16 bit file has 65,536 "steps", a 24 bit one has 16,777,216.
Here is an example to illustrate the numbers above : "If sheets of typing paper were stacked to a height of 22 feet, a single sheet of paper would represent one quantization level in a 16 bit system. In a 24-bit system, the stack would tower 5592 feet in height - over a mile high." So, regarding your question, recording high volumes would be something like measuring an elephant with the two piles of sheets... We can say that both piles would be similar in terms of precision...
But things are not that simple, of course... as stated above the 24 bit system has both a wider dynamic range and a better resolution. So besides being taller, the 24 bit pile would have thinner sheets (smaller quantization intervals)... this way, if one decided to measure an ant with the piles, the "24-bit system stack" would certainly do a better job...


If the answer is that the greater resolution extends the dynamic range AND increases the "grain density" of the "dynamic volume change increments" (sorry for the bad lingo), then how was it decided how many of the added bits go toward the former and to the latter?

As far as I understand, a mathematical formula determines the "Signal-to-Error Ratio", which ultimately tries to define the accuracy achieved by a given X-bit measurement (or recording) system. This usually applies to hardware performance. The Signal-to-Error Ratio of a digital system is closely akin, but not identical, to the S/N (signal-to-noise) ratio of an analog system.
So the formula tells us the ratio of the maximum expressible signal amplitude to the maximum quantization error (the maximum quantization error occurs when the analog value falls exactly in-between two quantization values - in short, the recorded sound sucks). Since an analog waveform has an infinite number of amplitude values, but a quantizer has a finite number of intervals, any choice of scales can never completely encode a continuous analog function. All analog values between two intervals can only be represented by the single number assigned to that interval. Thus, the quantized value is only an approximation of the actual. The magnitude of the error depends on the size of the quantization interval (more bits, smaller quantization intervals, smaller errors) AND the signal level. A very low-level signal, for example, could receive only one-bit quantization, or might not be quantized at all. In other words, as the signal level decreases, the percentage of distortion increases. The higher the resolution the better it will record low level signals.
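To make the signal-to-error idea concrete, here is a rough Python sketch that quantizes a sine wave without dither and compares the measured ratio with the 6.02n + 1.76 formula quoted earlier in this post. The quantizer, the test frequency and all function names are assumptions chosen for illustration, not anything taken from the book.

```python
import math

def quantize(x: float, bits: int) -> float:
    """Round x (nominally in -1..1) to the nearest step of a uniform quantizer."""
    step = 2.0 / (2 ** bits)
    return round(x / step) * step

def measured_ser_db(bits: int, amplitude: float = 1.0, n: int = 50_000) -> float:
    """Signal-to-error ratio of an undithered, quantized sine wave, in dB."""
    signal_power = 0.0
    error_power = 0.0
    for i in range(n):
        # 0.01234567 cycles per sample is an arbitrary test frequency
        x = amplitude * math.sin(2 * math.pi * 0.01234567 * i)
        err = quantize(x, bits) - x
        signal_power += x * x
        error_power += err * err
    return 10 * math.log10(signal_power / error_power)

def formula_ser_db(bits: int) -> float:
    """The 6.02n + 1.76 dB figure for a full-scale sine."""
    return 6.02 * bits + 1.76

for bits in (8, 12, 16):
    print(f"{bits:2d} bits: measured {measured_ser_db(bits):6.2f} dB, "
          f"formula {formula_ser_db(bits):6.2f} dB")

# The same 16-bit quantizer with the sine 60 dB below full scale
print(f"16 bits at -60 dBFS: {measured_ser_db(16, amplitude=0.001):6.2f} dB")
```

The last line shows the "signal level" half of the argument: the step size has not changed, so a quieter signal sits proportionally closer to the error.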

There is no doubt that a 24-bit soundfile will better represent whatever you are recording when comparing to 16-bit ones. BUT the question remains: is this difference clearly noticeable to our ears? Is it worth the use of extra HD space and processing, especially knowing we would have to deliver our audio CD in 16-bit resolution?

Bob L
12-24-2005, 04:42 PM
Don't forget that when converted back to analog... all the missing steps in between on the 16 bit audio curve are filled in by the natural math of the conversion filter circuits and capacitors... thereby effectively producing the same results... the missing values may have no impact at all in the final 'what we listen to' thru our speakers.... of course... if the waveform has erratic very high frequency data involved then the smaller steps in the higher bit res will follow along nicer without losing the subtleties of the rapidly varying waveform... but again... we are talking very high frequencies... above the normal listening range... I know... I know... we are constantly told how we all hear spindown effects in the audio range caused by higher harmonics present in the audio signal... that may be so... and it may not...oh well.... end results... CAN YOU HEAR A DIFFERENCE... then use what sounds best to you and make sure you purchase the rest of the equipment to support your choice.... 96k and 192k digital systems can peak the cost factor much much higher than a good 44.1 or 48k setup... in my opinion, the cost increase is nowhere near linear for any possible noticeable benefits in the final sound.... but that may change in the near future... who knows. :)

Bob L

TotalSonic
12-24-2005, 06:06 PM
Bob -
You make some excellent points. Really with the difference between 24bit and 16bit it is only clearly audible as you get to below around somewhere incredibly low like -70dBfs. So unless someone is into the habit of just playing sustained fades cranked to 11 on their system then this kind of thing might just be really overstated in terms of importance as an issue. Still, for critical recordings of acoustic instruments in real spaces somehow it seems that this low level detail might be one of those clues that the sound is "recorded" rather than "in the room with you." As you said - it is all really subjective and highly debatable as to its true impact unless you have an amazingly accurate monitoring environment. Being paid by people to sweat out the tiny teeny details makes it something I have to pay attention to though.

As far as the other point of high resolution sample rates: Again - it seems that it is indeed the quality of the analog components and reconstruction filters at the front & back ends of AD and DA converters that makes the biggest difference in the sound quality. I've heard some high end converters sound to my ear a little better at 44.1kHz than the same thing recorded with a mid end converter at 96kHz.

A really interesting article by Dan Lavry explaining why he feels there is no need whatsoever for a sample rate above around 60kHz, and that recording at 176.4 & 192kHz is at best a waste of hard drive space and at worst actually gives worse performance than 88.2/96, is at
http://www.lavryengineering.com/documents/Sampling_Theory.pdf

The main reason I give credence to his writings is that his products sound amazing (I've found that the Lavry Blue DAC in "crystal lock" mode truly made a big improvement to my monitoring chain)

Best regards,
Steve Berson

Naturally Digital
12-24-2005, 09:26 PM
The main reason I give credence to his writings is that his products sound amazing (I've found that the Lavry Blue DAC in "crystal lock" mode truly made a big improvement to my monitoring chain)

I'm really hoping Santa will bring me one of those new Lavry Black DAC's... Somehow I doubt it'll happen though.;)

Cary B. Cornett
12-26-2005, 08:07 AM
There is no doubt that a 24-bit soundfile will better represent whatever you are recording when comparing to 16-bit ones. BUT the question remains: is this difference clearly noticeable to our ears? Is it worth the use of extra HD space and processing, especially knowing we would have to deliver our audio CD in 16-bit resolution?

I would tend to answer that with a qualified "yes". First of all, HD space is a LOT cheaper than it once was, and has even become far cheaper than the equivalent amount of the analog tape it "replaces". Back when I was first getting involved in recording (before the qualifier "analog" had to be used in a studio), I was told, "tape is the cheapest thing you have".

I know of two other factors that can make higher-resolution storage worthwhile.

First, there is wide industry agreement that, with proper dithering from a 24 bit source, a 16 bit recording can preserve some of the added sonic detail of the 24 bit original. This may not be "useful" for all kinds of music, but certainly in classical and some jazz recordings, and anything else with real "delicacy of detail".
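A toy Python sketch of why dither can preserve detail below the 16 bit step size. It uses plain TPDF dither (two uniform random values added together), which is a common textbook choice and not necessarily what any particular DAW or plugin does. All names and the test value are made up for illustration.

```python
import random

LSB16_IN_24 = 256  # one 16-bit step, measured in 24-bit units

def truncate_to_16(sample24: int) -> int:
    """Drop the low 8 bits with no dither."""
    return sample24 >> 8

def dither_to_16(sample24: int, rng: random.Random) -> int:
    """Add TPDF dither spanning +/- one 16-bit step, then round to 16 bits."""
    noise = rng.uniform(-128, 128) + rng.uniform(-128, 128)
    value = round((sample24 + noise) / LSB16_IN_24)
    return max(-32768, min(32767, value))

def average_output(sample24: int, count: int = 20_000, seed: int = 1) -> float:
    """Average of many dithered conversions of the same input value."""
    rng = random.Random(seed)
    return sum(dither_to_16(sample24, rng) for _ in range(count)) / count

quiet = 64  # a 24-bit value equal to one quarter of a 16-bit step
print("truncated:", truncate_to_16(quiet))
print("dithered average:", average_output(quiet))
```

Truncation throws the quarter-step value away completely, while the dithered version keeps it as the long-term average of a noisy signal, which is the sense in which low level detail survives in fades and tails.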

Second, we never know when we may wish to apply more processing to what we thought was a "finished" product. Most, if not all, mastering engineers prefer to be given 24 bit source files. This is where "rounding errors" come in. Every calculation you run has its accuracy limited by the precision of the final result, and also by the precision limits of the preceding result. Information "missing" from the result of a prior calculation is an error, meaning a distortion, and with each added stage of "missing detail" these errors accumulate. The "first" error may not be audible, but what about the 3rd? The 12th? or even the 43rd? Keeping intermediate results to a precision of no less than 24 bits (as SAW's mix engine does) helps keep this cumulative error from becoming noticeable.
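The accumulation can be sketched in Python by running a sine wave through a chain of gain changes and rounding the intermediate result after every stage. This is only a toy model: the gain values are arbitrary, the names are hypothetical, and it says nothing about how any real mix engine is written.

```python
import math

def round_to_bits(x: float, bits: int) -> float:
    """Round x (full scale = 1.0) to the nearest value on a signed n-bit grid."""
    scale = 2 ** (bits - 1)
    return round(x * scale) / scale

def process(x: float, gains, bits=None) -> float:
    """Apply each gain in turn, rounding after every stage if bits is given."""
    for g in gains:
        x *= g
        if bits is not None:
            x = round_to_bits(x, bits)
    return x

def rms_error_db(bits: int, stages: int = 12, n: int = 5_000) -> float:
    """RMS error versus an unrounded reference, in dB relative to full scale."""
    # Arbitrary gains close to unity, different at every stage
    gains = [1 + 0.1 * math.sin(k) for k in range(1, stages + 1)]
    total = 0.0
    for i in range(n):
        x = 0.5 * math.sin(2 * math.pi * 0.01234567 * i)
        total += (process(x, gains, bits) - process(x, gains)) ** 2
    return 10 * math.log10(total / n)

for stages in (2, 12, 40):
    print(f"{stages:2d} stages: 16-bit {rms_error_db(16, stages):7.1f} dB, "
          f"24-bit {rms_error_db(24, stages):7.1f} dB")
```

In this model the error grows with the number of stages at either word length, but the 24 bit intermediates start roughly 48 dB lower, which is the headroom the paragraph above is talking about.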

For these reasons, I think it is a good idea to keep all stored files at 24 bit resolution until the final dithering down for the CD master.

AudioAstronomer
12-26-2005, 12:11 PM
But are those errors really a bad thing?

I know all those 'errors' in tape recordings are sure popular these days. Give it a few years, people will be so used to growing up on pos cd players and mp3 players that cheap digital will 'sound right' to them, and 'good digital' will be foreign and unliked.

That's my prediction at least... history repeats itself you know :o

mghtx
12-26-2005, 01:28 PM
I know all those 'errors' in tape recordings are sure popular these days. Give it a few years, people will be so used to growing up on pos cd players and mp3 players that cheap digital will 'sound right' to them, and 'good digital' will be foreign and unliked.

It's already here. EVERY young person I meet these days has an mp3 player and of course the "pos cd player". They DO NOT know of analog LP's and they DO NOT care. If I try to talk to them of "the way it used to be" I just get a "deer in the headlights" look.

"Cheap digital" is here and it aint goin' away....it's just getting started.

olzzon
12-26-2005, 02:41 PM
But are those errors really a bad thing?

I know all those 'errors' in tape recordings are sure popular these days. Give it a few years, people will be so used to growing up on pos cd players and mp3 players that cheap digital will 'sound right' to them, and 'good digital' will be foreign and unliked.

That's my prediction at least... history repeats itself you know :o

It's already there in Akai MPC60 samplers; a lot of people love them for their sound. I think it's 12 bit.

Jay Q
12-27-2005, 02:49 AM
I guess what I still don't get is what determines the degree of "mathematical compression" of the bit range as it is applied to the dynamic range? Who "decided" that 16-bit resolution would be in x dB increments, and why did they decide on x? Why didn't they pick 2x? Or x/5?

Dave, could you paraphrase that first sentence? And when you say "x dB increments", are you talking about quanta (as in "quantum"), i.e. the number of possible values per quantized sample? If that's the case, the quantum is determined by the bit depth, as has been pointed out, so 16 bits = 2^16 = 65,536 = the number of quanta.

If that's not what you're asking about, it occurs to me you're asking about "dB increments" as in, e.g., 16 bits * 6dB = 96dB. If that's the case, I assume (don't know for sure) that 6dB was chosen as the multiplier because each additional 6dB is defined as a doubling of power (volume) and each additional bit doubles the number of possible values (quanta).

If that's not what you're asking about, I have no clue. ;)

Jay

Cary B. Cornett
12-29-2005, 07:15 PM
... I assume (don't know for sure) that 6dB was chosen as the multiplier because each additional 6dB is defined as a doubling of power (volume) ...

A couple of minor technical nitpicks here. 3dB is a doubling of power, 6dB is a doubling of voltage or current, thus a quadrupling of power. Neither of these, however, is a doubling of volume.

As it happens, the decibel is one tenth of an earlier unit called a bel. Early research into hearing included coming up with a unit for a doubling or halving of perceived volume, which was named the bel in honor of AGB. This same research found that doubling the volume requires ten times the power.

So, double the volume is a 10 dB increase in level, which corresponds to 10 times the power.

Jay Q
12-30-2005, 05:00 AM
A couple of minor technical nitpicks here. 3dB is a doubling of power, 6dB is a doubling of voltage or current, thus a quadrupling of power. Neither of these, however, is a doubling of volume.

As it happens, the decibel is one tenth of an earlier unit called a bel. Early research into hearing included coming up with a unit for a doubling or halving of perceived volume, which was named the bel in honor of AGB. This same research found that doubling the volume requires ten times the power.

So, double the volume is a 10 dB increase in level, which corresponds to 10 times the power.

Oops... once again I've slipped up with my nomenclature.

As long as we're being nit picky ;), let's clarify the difference between perceived volume and SPL. Psychoacoustically, yes, 10dB is closer to a doubling of volume (depending on frequency and other factors). However, 6dB (more like 6.021dB, actually) does correspond to a doubling of SPL. I used 6dB since I'm strictly responding to Dave's question in mathematic rather than psychoacoustic terms.

Jay

Cary B. Cornett
12-30-2005, 01:19 PM
Oops... once again I've slipped up with my nomenclature.

As long as we're being nit picky ;), let's clarify the difference between perceived volume and SPL. Psychoacoustically, yes, 10dB is closer to a doubling of volume (depending on frequency and other factors). However, 6dB (more like 6.021dB, actually) does correspond to a doubling of SPL. I used 6dB since I'm strictly responding to Dave's question in mathematic rather than psychoacoustic terms.

Jay

Ok, more on nomenclature then...

Sound Pressure Level would be the acoustic equivalent of Signal Voltage, and as such, yes, 6dB would be a doubling of voltage or of pressure. Unfortunately, many folk equate SPL with perceived loudness in a way that causes them to misunderstand its correct usage. Furthermore, if we are using "dB" as the unit of SPL, one could say that, for example, twice 30 dB spl is 60 dB spl, which is twice as many UNITS but completely irrelevant to what we actually HEAR.

Since we are in the business of creating something that people HEAR, we should frame our explanations as much as possible in terms of hearing rather than some comparatively abstract technical POV that is basically the Technician's Turf. I personally enjoy the somewhat abstract logic of "electron pushing", but I find that most musicians, to say nothing of the general public at large, do not. I consider that a Recording Engineer may be using Technical tools, but he is serving a Musical community, so it is best if his explanations come out in musical terms rather than technical terms as much as possible. If the occasional tech in the audience (like me) cannot manage the translation, then he should study a little more.

Um, if that sounds a little harsh, I don't mean it that way... I'm just lousy at diplomacy. :eek:

Pedro Itriago
12-30-2005, 02:20 PM
Now, when you both say double the power is that RMS or P.M.P.O????

and

Aren't Bel's what you hear when they call to church??? :p :D

Lovingly running for cover;
Pedro.

P.S.: Just in case, these questions were not meant to be answered; they were one of my many attempts at "humor"

Jay Q
12-31-2005, 04:06 AM
Since we are in the business of creating something that people HEAR, we should frame our explanations as much as possible in terms of hearing rather than some comparatively abstract technical POV that is basically the Technician's Turf.

Well... funny, Cary, because it seems to me you've tread on that turf a number of times, but, in any event, I think you're not focusing on the point. Dave asked a technical, math-related question. If I had originally used the word "voltage" instead of "power", I think my response would've been entirely appropriate given the terms the question was posed in -- my response was quite brief... hardly an abstract treatise. In answering Dave, I wasn't trying to address some broad spectrum of reader for education's sake. I found his question interesting, and, in pondering it, made a connection between the 6dB bit-depth multiplier, the doubling of voltage, and the doubling of a binary value that occurs with the addition of each bit. In that instance, I didn't see the relevance of perceived loudness since the concept I was addressing was mathematic and relied entirely upon the 6dB multiplier.

Jay

Carlos Mills
12-31-2005, 06:31 AM
Hi Cary,

This is interesting...



Sound Pressure Level would be the acoustic equivalent of Signal Voltage, and as such, yes, 6dB would be a doubling of voltage or of pressure. Unfortunately, many folk equate SPL with perceived loudness in a way that causes them to misunderstand its correct usage.

"How do we perceive SPL? It turns out that a sound which is 3 dB higher in level than another is barely perceived to be louder; a sound which is 10 dB higher in level is perceived to be about twice as loud. (Loudness, by the way, is a subjective quantity, and is also greatly influenced by frequency and absolute sound level).
Does SPL have an absolute reference value, and therefore do "SPLs" have quantifiable meaning? Yes, generally 0 dB SPL is defined as the threshold of hearing (of a young, undamaged ear) in the ear's most sensitive range, between 1 kHz and 4 kHz. (...) Rather than merely relate various SPLs to various pressures, it is perhaps more meaningful to relate SPLs to common sources of sound...

THRESHOLD OF HEARING ....................... 0 dB SPL
QUIET RECORDING STUDIO .................... around 30 dB SPL
CABIN OF JET AIRCRAFT ....................... around 80 dB SPL
THRESHOLD OF PAIN .......................... above 120 dB SPL"

From: Sound Reinforcement Handbook, by Gary Davis & Ralph Jones

Having said that, I must admit that I tweak levels down to 0.5 dB... ;)

AudioAstronomer
12-31-2005, 07:23 AM
I would note that the threshold of hearing is not 0 dB in practice. Most people with "average" hearing, in a controlled environment, can hear slightly below 0 dB, and those with above-average hearing can hear down to -10 dB. The negative values assume you take the previous standard for the threshold of hearing as the reference (which really should be about 10 dB lower).

Cary B. Cornett
12-31-2005, 01:38 PM
Well... funny, Cary, because it seems to me you've tread on that turf a number of times,

Well, since among other things I am a tech (usta be how I made my living), that is indeed part of my turf. :)


but, in any event, I think you're not focusing on the point. Dave asked a technical, math-related question. ...In answering Dave, I wasn't trying to address some broad spectrum of reader for education's sake.

From what I can tell, the spectrum of this forum is fairly broad in that sense, and without reading and memorizing details of everyone's resume it is difficult to know who should be expected to "already understand" what. All too often I have been guilty of confusing other folk with technical jargon (and I don't think that such would be your intent, either), so experience has taught me that a lot of this stuff is easy to, er, get confused about.

Anyway, my clarifications are not intended as any kind of personal slight or accusation, and if I came across that way, I apologize.


I found his question interesting, and, in pondering it, made a connection between the 6dB bit-depth multiplier, the doubling of voltage, and the doubling of a binary value that occurs with the addition of each bit. In that instance, I didn't see the relevance of perceived loudness since the concept I was addressing was mathematic and relied entirely upon the 6dB multiplier.


I can see the logic in that thought process. OTOH, I know how long it was before I myself properly understood the correct relationship between "decibel" as a technical term and the way we actually hear things, particularly with reference to "10 dB is twice as loud", which I only found out some years after I had been working professionally in the recording field. For some reason, a lot of texts completely fail to explain that last point. :confused:

Dave Labrecque
12-31-2005, 02:09 PM
So, like, can no one 'splain to me why it's been decided that 16 bits covers x dB? It's got to be an arbitrary assignment, right?

Pedro Itriago
12-31-2005, 02:21 PM
Because it has more to do with physiology & psychoacoustics (subjectivity, perception) than with electronics. It happens a lot when two "worlds", as in different and distinct professions, share some degree of knowledge in very dark corners of one or both of them.

Now, it would be nice to create a new profession called Audio Metaphysics, which I think would explain a lot of the current musical and audio engineering "successes" :p


OTOH, I know how long it was before I myself properly understood the correct relationship between "decibel" as a technical term and the way we actually hear things, particularly with reference to "10 dB is twice as loud", which I only found out some years after I had been working professionally in the recording field. For some reason, a lot of texts completely fail to explain that last point. :confused:

Jay Q
12-31-2005, 05:29 PM
So, like, can no one 'splain to me why it's been decided that 16 bits covers x dB? It's got to be an arbitrary assignment, right?

Dave, I'm still not sure I understand your question. Can you paraphrase "covers x dB"?

Jay

Jay Q
12-31-2005, 05:40 PM
Anyway, my clarifications are not intended as any kind of personal slight or accusation, and if I came across that way, I apologize.
I didn't think you were personally attacking me, Cary. And I appreciate the correction on the power vs. voltage thing. But, frankly, the rest of the stuff you said comes off as extraneous pedantry. You come across as the Post Police when you tell people how they should frame their explanations. I think I did just fine in terms of clarifying the technical stuff; I explained quanta, and -- in an attempt to do the thing you accused me of not doing -- I "dumbed down" what voltage implied (of course, I erroneously said "power") by calling it "volume" just to give some context.

Any way, Happy New Year. :)

Jay

AudioAstronomer
12-31-2005, 05:59 PM
So, like, can no one 'splain to me why it's been decided that 16 bits covers x dB? It's got to be an arbitrary assignment, right?

Because they are both exponential in nature. 1 bit more is a doubling of available values since binary to decimal is in powers of two, every added bit gives a doubling of available values to indicate... whatever.

For every bit you have to represent the audio data, you essentially have a doubling of voltage available. A doubling of voltage being very slightly over 6db.

Nothing arcane or mystic about it. :)
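
The arithmetic behind "1 bit is about 6 dB" can be sketched in a few lines. This is only the ratio of the number of available values expressed as a voltage ratio in dB; the function names are made up for illustration.

```python
import math

def db_per_bit():
    """A doubling of voltage, expressed in decibels."""
    return 20 * math.log10(2)

def value_range_db(bits):
    """Ratio of the full scale (2**bits steps) to a single step, in dB."""
    return 20 * math.log10(2 ** bits)

print(f"one extra bit: {db_per_bit():.3f} dB")
for bits in (12, 16, 24):
    print(f"{bits} bits: {2 ** bits:>10,} values, about {value_range_db(bits):.1f} dB")
```

So the 96 dB figure for 16 bits is not chosen by anyone; it falls out of 16 doublings at roughly 6.02 dB each.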

Pedro Itriago
01-01-2006, 12:05 AM
I'm guessing he's trying to know who determined/decided that 16 bits would cover 96 db and not higher/lower despite the lower/higher amplitude resolution if the limit was different than 96 db.

IOW, if I guessed correctly, he still doesn't get it.

Dave Labrecque
01-01-2006, 09:32 PM
I'm guessing he's trying to know who determined/decided that 16 bits would cover 96 db and not higher/lower despite the lower/higher amplitude resolution if the limit was different than 96 db.

IOW, if I guessed correctly, he still doesn't get it.

Yes, Pedro, that's it. Thanks.

So... who? :)

Jay and Pedro... thanks for staying with me, here...

If 16 bits was arbitrarily assigned to cover 100 dB, there would be a different amplitude resolution, as you say. I guess someone's going to tell me it's not arbitrary; that there's a mathematical or physical reason that 16 bits came out to 96 dB. I just don't understand why that would necessarily be a fixed relationship, since 'quanta' don't exist, so far as I know, at this 'level' of reality. :)

More under Robert's kind reply...

Dave Labrecque
01-01-2006, 09:46 PM
Because they are both exponential in nature. 1 bit more is a doubling of available values since binary to decimal is in powers of two, every added bit gives a doubling of available values to indicate... whatever.

For every bit you have to represent the audio data, you essentially have a doubling of voltage available. A doubling of voltage being very slightly over 6db.

Nothing arcane or mystic about it. :)

Now, I'm starting to get it... thanks. It's the whole exponential thing.

But just because it's a doubling of available values, why does that necessarily translate to a doubling of available voltage? Couldn't one assign the available values to available voltage via any arbitrarily chosen scale other than 1:1 to achieve different dynamic range and, therefore, different dynamic resolution for the same 16 bits?

Thanks for playing along despite my naivete. :)

AudioAstronomer
01-01-2006, 10:06 PM
Holy crap I just wrote a freaking huge post and lost it... Let me try again a bit more concise

Anyways, sorry if this seems mind-numbingly simple.

Binary is a set number of value places consisting of 0 or 1. It is a base 2 system in that every place represents 0, or a power of 2 (depending on the place). Here is a simple example:

1010

Reading from the rightmost place:

0 x 2^0 = 0
1 x 2^1 = 2
0 x 2^2 = 0
1 x 2^3 = 8

Thereby, 1010 in binary = 10 in our daily base 10 system (0 + 2 + 0 + 8). For every place we add in binary, we are given the possibility of 2^(place number) more values, counting places from 0. A 4 place binary number can represent values from 0 to 15 (16 values). A 5 place binary number can represent from 0 to 31 (32 values).

So a 16 place (bit!) number can represent values from 0 to 65535 (65,536 values). A 15-bit number only half that.. and a 14-bit number half again!

So basically, for every bit that is added there is a doubling of available values that can be represented.

We take that over to a DAC (and this is all very basic and chopped down): for every bit available, it is able to represent a doubling in voltage (since the ceiling of the data set is doubled). And what is a doubling of voltage but... ~6 dB. So for every doubling of available values, you have an available doubling of voltage. Meaning 1 bit ~= 6 dB.

All in all, when everything is considered, 16-bit yields slightly over 100 dB of theoretical resolution: 16 x 6 and change, plus ~4 for intersample data (which depends on the sampling rate; 4 dB is the figure generally given).

Hope that wasn't too simple, or too complicated, and that New Year's didn't cause any errors ;)
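
For anyone who wants to check the place-value arithmetic, here is a small sketch. The helper names are made up for illustration. The last function is the textbook figure for a full-scale sine against uniform quantization noise (about 6.02 dB per bit plus 1.76 dB), which comes to roughly 98 dB for 16 bits; it is included only as a common reference point and leaves out any allowance for intersample data.

```python
import math

def place_values(bit_string):
    """Return (digit, weight) pairs, least significant place first."""
    return [(int(d), 2 ** i) for i, d in enumerate(reversed(bit_string))]

def to_decimal(bit_string):
    """Sum digit * weight over every place."""
    return sum(d * w for d, w in place_values(bit_string))

def value_count(n_bits):
    """How many distinct values an n-bit number can hold."""
    return 2 ** n_bits

def sine_sqnr_db(n_bits):
    """Textbook signal-to-quantization-noise ratio for a full-scale sine."""
    return 20 * math.log10(2) * n_bits + 10 * math.log10(1.5)

for digit, weight in place_values("1010"):
    print(f"{digit} x {weight} = {digit * weight}")
print("1010 ->", to_decimal("1010"), "| 0101 ->", to_decimal("0101"))
for n in (4, 5, 14, 15, 16):
    print(f"{n} bits: 0 to {value_count(n) - 1} ({value_count(n)} values)")
print(f"16-bit full-scale sine: about {sine_sqnr_db(16):.1f} dB")
```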

Cary B. Cornett
01-02-2006, 07:57 AM
But just because it's a doubling of available values, why does that necessarily translate to a doubling of available voltage? Couldn't one assign the available values to available voltage via any arbitrarily chosen scale other than 1:1 to achieve different dynamic range and, therefore, different dynamic resolution for the same 16 bits?

Picture a staircase. The number of steps on that staircase is determined by the number of bits (per sample). For each added bit, the number of steps on the staircase doubles. That's the "logarithmic" part. Now, every step is the same size as every other step. Period. No exceptions. That's the "linear" part. The value of a sample always lands EXACTLY on a step, never "between" steps. The top step represents the highest possible voltage, the bottom step represents the lowest possible voltage, and the scale of voltage, as we go from step to step, is always LINEAR, which means that we CANNOT arbitrarily assign some oddball curve to it.

Dynamic range is nothing more than the ratio of the height of a single step to the height of the entire staircase. Expressing this on a linear scale uses really large numbers, which are inconvenient, and our hearing perception is not linear anyway, so we describe dynamic range using a logarithmic unit, which both lets us use more convenient numbers AND corresponds better to the way we hear.

I just realized I need to correct myself a bit (pun not intended). There have been attempts made to get more dynamic range out of fewer bits by using non-linear coding, but in so doing we give up accuracy at the "louder" end of the scale, thus ending up with more distortion in our effort to get more dynamic range out of the same number of bits. That compromise did not become popular, because LINEAR PCM simply SOUNDS better because it represents signal values at all levels with equal accuracy.

So, yeah, you COULD apply some arbitrary scale, but we found out the hard way that, at least for music, this was NOT a good idea. Now that the cost of storage has come down so much, we don't NEED to compromise quality to save space, so we use at least 16-bit linear PCM instead of, say, 11 bits on a log amplitude scale.

Ah,... did that explanation work better for ya??? I'm not sure I have any more ways to describe it... :)
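
A rough sketch of the linear versus non-linear trade-off follows. The post above does not name a particular log scheme, so this swaps in mu-law companding (with mu = 255) purely as a familiar example, and uses 8 bits for both codings so the difference is easy to see. Both choices are assumptions for illustration.

```python
import math

MU = 255  # assumption: the mu value commonly used for mu-law companding

def linear_roundtrip(x, bits):
    """Quantize x (within -1.0..1.0) onto equal-sized steps and convert back."""
    scale = 2 ** (bits - 1)
    return round(x * scale) / scale

def mulaw_roundtrip(x, bits):
    """Compress x onto a log-like curve, quantize on equal steps there, then expand."""
    scale = 2 ** (bits - 1)
    sign = -1.0 if x < 0 else 1.0
    y = math.log1p(MU * abs(x)) / math.log1p(MU)       # compress
    y_q = round(y * scale) / scale                      # equal steps in the compressed domain
    return sign * math.expm1(y_q * math.log1p(MU)) / MU  # expand

for level in (0.9, 0.001):
    lin_err = abs(linear_roundtrip(level, 8) - level)
    log_err = abs(mulaw_roundtrip(level, 8) - level)
    print(f"level {level}: linear error {lin_err:.6f}, mu-law error {log_err:.6f}")
```

At the loud level the log-style coding lands further from the true value than linear coding does, while at the quiet level it still registers a signal that 8-bit linear rounds to zero. That is the compromise described above: more range from the same bits, paid for with accuracy at the top.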

Pedro Itriago
01-02-2006, 10:05 AM
The future CEO's of the world (by this I meant the hordes of people preferring/not discerning lamer audio) are disagreeing with all of us by saying that, although storage is cheap, it's too expensive to have the music for free, so less (mp3, oggwhatever) is better.


Now that the cost of storage has come down so much, we don't NEED to compromise quality to save space, so we use at least 16-bit linear PCM instead of, say, 11 bits on a log amplitude scale.

studio-c
01-02-2006, 10:24 AM
The future CEO's of the world (by this I meant the hordes of people preferring/not discerning lamer audio) are disagreeing with all of us by saying that, although storage is cheap, it's too expensive to have the music for free, so less (mp3, oggwhatever) is better.
It's historically been an issue of download times. In dialup days, for you to send me your favorite song as a nice fat wav file would have taken 9 hours. That's why people have been desperately scrambling for smaller and faster.
As broadband becomes superbroadband, and storage gets ridiculously cheap (it's practically there now), we'll evolve a two-tier system.

Example: I used to put a radio spot on a 5" reel of tape. The ad execs would drive up to the studio and pick them up, then drive them back to the station. Then I started making mp3 files, and had to explain FTP to radio ad execs (sigh...). Now that it's commonplace, everyone has broadband and is putting up mp3s of various quality settings, I get restless. I'm always looking for the next big thing to set my quality apart. So I just put the full wav file up, rather than go mp3. I do it because, for the first time, my audience has the knowledge and tools to handle it. The method just naturally evolved.

And yes, I'm just ignoring what is possibly a statement that the music SHOULD be free(?). It wasn't clear. :) Actually it might be good that what people are stealing is dumbed-down versions of songs. Cuz if they want the good stuff, they'll have to pay the artists. ;)

Dave Labrecque
01-03-2006, 12:19 PM
Thanks, Robert. That filled in some gaps for me. Here's the part where I'm still foggy, though:


... for every bit available, it is able to represent a doubling in voltage (since the ceiling of the data set is doubled).

Seems to me that for every bit available, you get a doubling of the number of available voltage values, rather than of actual voltage. So, who decides which range of values to use? Do we go with +/- 15 volts? Or +/- 20 volts? And wouldn't each voltage range correspond to a different dynamic range (and its attendant different resolution)?

Do you understand what I'm asking? Where am I going astray? :confused:

AudioAstronomer
01-03-2006, 12:28 PM
There must be a direct relationship otherwise you lose resolution. If it's not 1:1, then there will be gaps that must be rounded... and what is rounding but a war fought constantly with dither (for some folks).

The range/scale of the signal is determined by the amplifier (which is a multiplier), not by the conversion device. It's quite possible for there to be large gaps between values after amplification, but nonetheless it is still a 1:1 ratio between the original source and the output.

Dave Labrecque
01-03-2006, 12:29 PM
OK, I think I'm getting close to a "breakthrough". ;)


Now, every step is the same size as every other step. Period. No exceptions. That's the "linear" part... we CANNOT arbitrarily assign some oddball curve to it.

So my next question would be, "But who says what the size of each step is?"


Dynamic range is nothing more than the ratio of the height of a single step to the height of the entire staircase.

I think this is the part that I need to understand. It seems that the fact that dynamic range is a ratio is the thing. Lemme think about this some more...


Ah,... did that explanation work better for ya??? I'm not sure I have any more ways to describe it... :)

So close... I can almost hear it. :)

Dave Labrecque
01-03-2006, 12:33 PM
There must be a direct relationship otherwise you lose resolution. If it's not 1:1, then there will be gaps that must be rounded... and what is rounding but a war fought constantly with dither (for some folks).

The range/scale of the signal is determined by the amplifier (which is a multiplier), not by the conversion device. It's quite possible for there to be large gaps between values after amplification, but nonetheless it is still a 1:1 ratio between the original source and the output.

Hmmmmm... gotta go think (for myself -- it's a lost art, you know)

Thanks, Robert. :)

AudioAstronomer
01-03-2006, 12:35 PM
Hmmmmm... gotta go think (for myself -- it's a lost art, you know)

Thanks, Robert. :)


Really easy, the steps are always the same, but you can multiply (amplify) every value by an equal amount to increase the scale. Of course this raises the floor as well as the ceiling!

Cary B. Cornett
01-04-2006, 08:44 AM
So, who decides which range of values to use? Do we go with +/- 15 volts? Or +/- 20 volts?

Ah, now we're getting closer to the root of your confusion, I think. Robert's comment about "floor and ceiling" moving together is a good clue here. What you are talking about is the choice of where the "ceiling" is, IOW, the clipping point. This is determined, on a case-by-case basis, by the manufacturer of the particular converter you are using. There is no agreed set standard for this limit.


And wouldn't each voltage range correspond to a different dynamic range (and its attendant different resolution)?

No. Increasing the maximum voltage swing also increases the voltage increment of each "step" by the same ratio, meaning there is no difference in the resolution of detail as compared to the maximum peak level.
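
A quick numeric check of that point, using the +/- 15 V and +/- 20 V figures from the question (the function names are made up for illustration): the step gets bigger with the voltage swing, but the ratio of full swing to one step, which is the dynamic range, stays put.

```python
import math

def step_size(v_peak, bits):
    """Voltage of one step when 2**bits equal steps span -v_peak..+v_peak."""
    return 2 * v_peak / (2 ** bits)

def range_db(v_peak, bits):
    """Ratio of the full voltage swing to a single step, in dB."""
    return 20 * math.log10(2 * v_peak / step_size(v_peak, bits))

for v_peak in (15, 20):
    print(f"+/- {v_peak} V, 16 bits: step = {step_size(v_peak, 16) * 1e6:.1f} microvolts, "
          f"range = {range_db(v_peak, 16):.2f} dB")
```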

Dave Labrecque
01-04-2006, 12:17 PM
I think I'm gettin' it...

Thanks for your patience. :)

Pedro Itriago
01-04-2006, 12:24 PM
Yeap, just wait a little longer, it's in the mail :p

To make it a little bit more confusing for you, what do you think will happen if you have some DC offset in your A/D converter? What will happen to the scale and to your highest and lowest possible analog values when they get digitized this way?


I think I'm gettin' it...

Cary B. Cornett
01-05-2006, 05:24 AM
... what do you think will happen if you have some DC offset in your A/D converter? What will happen to the scale and to your highest and lowest possible analog values when they get digitized this way?

For *small* amounts of DC offset, dynamic range won't be much affected. If the offset gets fairly bad, the clipping at overload would be asymmetrical. The "lowest possible signal level" represented would not change at all.
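
As a sketch of why the clipping becomes asymmetrical, assume a signed 16-bit converter and express the offset in steps (the 2000-step offset below is a made-up figure). The room left above the offset shrinks while the room below it grows, and the smallest step is still one step either way.

```python
def limits(bits=16):
    """Lowest and highest sample values of a signed converter with `bits` bits."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1

def clip(sample, bits=16):
    """Clamp a sample to what the converter can represent."""
    lo, hi = limits(bits)
    return max(lo, min(hi, sample))

def headroom(offset, bits=16):
    """Largest upward and downward swing around `offset` before clipping, in steps."""
    lo, hi = limits(bits)
    return hi - offset, offset - lo

for offset in (0, 2000):  # 2000 steps of DC offset is a hypothetical figure
    up, down = headroom(offset)
    print(f"offset {offset}: {up} steps up, {down} steps down before clipping")

# A swing of 32000 steps fits with no offset but clips on the positive side with it.
print(clip(0 + 32000), clip(2000 + 32000), clip(2000 - 32000))
```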

Pedro Itriago
01-05-2006, 08:22 AM
Darn it! that was homework for Dave :p

Dave Labrecque
01-05-2006, 01:05 PM
Darn it! that was homework for Dave :p

I'm sorry. I thought it was a rhetorical question. Kinda. Wouldn't the lowest level be altered, too, though? Well, I guess the lowest achievable signal level wouldn't be changed in value, but it would be changed in terms of placement on a given waveform as compared to its non-offset version, if ya know what I mean. Plus, pure recorded silence would be louder than the lowest possible level. Nutty.