Bug Report

**John Ludlow** · 03-23-2021, 05:46 PM

Hi Bob.

Dave was having a problem with files created in SAW being rejected by dbPowerAmp when attempting to convert to mp3, and it mentioned something about "Chunk is out of RIFF area but still inside file". So, I've been attempting to help him run it down. In the process, I've found something that may or may not be related - but is awfully peculiar.

First, let me say that I am not an assembler programmer and rarely deal in hex (all the endians and the bases...). So, it's uncomfortable for me and I have to move real slow to avoid making stupid mistakes. But, I at least don't think I have made a mistake here. Also, though, I've never worked over a wave file before. It's possible that I missed something. But, I don't think so. I think I've found a bug. Plus, it at least looks as if it has solved Dave's dbPowerAmp conversion problem.

Here are the first few bytes in Dave's file from a hex perspective, and the translation.

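| Bytes (hex) | Translation |
|---|---|
| 52 49 46 46 | "RIFF" |
| 98 83 03 00 | 230,296 |
| 57 41 56 45 | "WAVE" |
| 66 6D 74 20 | "fmt " |
| 10 00 00 00 | 16 bytes |
| 01 00 | 1 (PCM) |
| 02 00 | 2 ch |
| 80 BB 00 00 | 48,000 samples/sec |
| 00 EE 02 00 | 192,000 bytes/sec |
| 04 00 | 4 bytes/sample |
| 10 00 | 16 bits/sample |
| 64 61 74 61 | "data" |
| 64 83 03 00 | 230,244 bytes |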
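The header bytes above can be decoded mechanically, which takes the endian juggling out of the picture. This is only a sketch: it assumes the canonical 44-byte layout (RIFF header, a 16-byte fmt sub chunk, then the data sub chunk header), and `decode_canonical_header` and the field names are made up for illustration.

```python
import struct

# The 44 header bytes quoted in the post, copied from the hex dump.
HEADER_HEX = (
    "52 49 46 46 98 83 03 00 57 41 56 45 66 6D 74 20 "
    "10 00 00 00 01 00 02 00 80 BB 00 00 00 EE 02 00 "
    "04 00 10 00 64 61 74 61 64 83 03 00"
)

FIELD_NAMES = (
    "riff_id", "riff_size", "form",
    "fmt_id", "fmt_size", "audio_format", "channels",
    "sample_rate", "byte_rate", "block_align", "bits_per_sample",
    "data_id", "data_size",
)


def decode_canonical_header(raw: bytes) -> dict:
    """Decode a 44-byte PCM WAV header (assumed layout, see lead-in)."""
    # "<" = little-endian, 4s = four raw bytes, I = uint32, H = uint16
    values = struct.unpack("<4sI4s4sIHHIIHH4sI", raw[:44])
    return dict(zip(FIELD_NAMES, values))


header = decode_canonical_header(bytes.fromhex(HEADER_HEX))
for name, value in header.items():
    print(f"{name:16} {value}")
```

The decoded values match the hand translation: a RIFF size of 230,296, a sample rate of 48,000 and a data size of 230,244.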

Here is a table of that data that includes how many bytes each field takes up.

RIFF Header
-------------------
| Bytes | Field | Content |
|---|---|---|
| 4 | Chunk ID | "RIFF" |
| 4 | Chunk Size | 230,296 bytes to end of file, beginning at 'W' (below) |

So, beginning with the word 'WAVE' to the end of the chunk (also file...) takes up 230,296 bytes. The file is altogether 230,304 bytes long according to Windows. 4 + 4 + 230,296 = 230,304. The RIFF header chunk size ties nicely.

| Bytes | Field | Content |
|---|---|---|
| 4 | Format | "WAVE" |

SUB CHUNK #1
-------------------
| Bytes | Field | Content |
|---|---|---|
| 4 | SUB CHUNK ID | "fmt " |
| 4 | SUB CHUNK SIZE | 16 bytes |

Between the first byte of 'audio format' (below) and the rest of the fmt sub chunk, sub chunk size says it should take 16 bytes. And: 2+2+4+4+2+2 = 16. The fmt subchunk size ties nicely.

| Bytes | Field | Content |
|---|---|---|
| 2 | AUDIO FORMAT | 1 (PCM) |
| 2 | NUMBER OF CHANNELS | 2 (stereo) |
| 4 | SAMPLE RATE | 48,000 samples per second |
| 4 | BYTE RATE | 48,000 samples/sec * 2 channels * 2 bytes/sample = 192,000 bytes per second |
| 2 | BLOCK ALIGN | 2 channels * 2 bytes/sample = 4 bytes per stereo sample frame |
| 2 | BITS PER SAMPLE | 16 |

Now the data subchunk.

SUB CHUNK #2
-------------------
| Bytes | Field | Content |
|---|---|---|
| 4 | SUB CHUNK ID | "data" |
| 4 | SUB CHUNK SIZE | 230,244 bytes from the next byte to the end of the data sub chunk |

And so, beginning at the 45th byte, right after the word 'data' are the music samples themselves. And, we know that they should take up 230,244 bytes from the sub chunk #2 size. By extension, if 'data' is the last sub chunk, the size of the entire file must be 44 + 230,244 = 230,288 bytes. But, it isn't. It's actually 230,304: 16 bytes more - as we figured from the chunk size, above, and also as reported by Windows. The data sub chunk size doesn't tie with anything.
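The arithmetic above is small enough to check in a few lines. This is a sketch using only the numbers quoted in the post; the function name and constants are illustrative, and the data offset of 44 assumes no extra sub chunk sits before 'data'.

```python
def bytes_past_declared_data(file_size: int, data_start: int, data_size: int) -> int:
    """Bytes left in the file after the declared end of the data sub chunk."""
    return file_size - (data_start + data_size)


FILE_SIZE = 230_304   # total length as reported by Windows
RIFF_SIZE = 230_296   # chunk size field at offset 4
DATA_START = 44       # first sample byte (assumes nothing sits before 'data')
DATA_SIZE = 230_244   # data sub chunk size field at offset 40

# The RIFF size counts everything after the 8-byte 'RIFF' + size header.
riff_ties = (8 + RIFF_SIZE == FILE_SIZE)
extra = bytes_past_declared_data(FILE_SIZE, DATA_START, DATA_SIZE)

print(f"RIFF chunk size ties with file length: {riff_ties}")
print(f"bytes past the declared end of data:   {extra}")
```

With these numbers the RIFF size ties and 16 bytes are left over after the declared data.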

The RIFF PCM Soundfile Format contains the potential of additional subchunks though. Usually, those are at the beginning of the file, before the fmt subchunk (copyright, publisher, album, etc.). But, I don't see any rule that says one can't come after the data subchunk.

So, could that be it? Is there another subchunk hidden in the last 16 bytes of the file? Let's look:

02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

According to the RIFF rules, the first four bytes would have to be the subchunk id, which is text. Here that would be ASCII ^B plus 3 nulls (not spaces (not text...)). Then the sub chunk size would be the next 4 bytes - and those are all nulls. Therefore, it would be a subchunk with an id of ^B and three nulls, and a size of zero, plus 8 more bytes worth of nulls/zeros for content. So - I don't think so. That's gotta be the last eight two-byte samples of the recording, sitting outside of the declared data sub chunk size.
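Both readings of those 16 bytes can be tried side by side. A minimal sketch: the "four printable ASCII characters" test in `looks_like_chunk_id` is an assumed heuristic for illustration, not a quote from the spec.

```python
import struct

# The last 16 bytes of the file, as quoted in the post.
TAIL = bytes.fromhex("02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00")


def looks_like_chunk_id(chunk_id: bytes) -> bool:
    """Heuristic (assumption): a chunk ID is four printable ASCII characters."""
    return len(chunk_id) == 4 and all(0x20 <= b <= 0x7E for b in chunk_id)


# Reading 1: an 8-byte sub chunk header (4-byte ID + little-endian uint32 size).
chunk_id, chunk_size = struct.unpack("<4sI", TAIL[:8])
print("as chunk header:", chunk_id, chunk_size, looks_like_chunk_id(chunk_id))

# Reading 2: eight little-endian signed 16-bit samples (four stereo frames).
samples = struct.unpack("<8h", TAIL)
print("as 16-bit samples:", samples)
```

Read as a header, the ID is not text and the size is zero. Read as samples, the bytes are one sample of value 2 followed by silence, which is what the tail of a recording could plausibly look like.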

I checked a wave file with an origin from someplace else (ripped from a CD) and it does not have the same issue. The data subchunk size matches the number of bytes in the data sub chunk section.

This particular file of Dave's does not have the problem with dbPowerAmp. I suspect that it is because most of the last 16 bytes are nulls. I'm thinking that dbPowerAmp is ignoring those. But, when it (randomly) gets something that looks like a sub chunk definition - it begins to attempt to operate on it, fails, and puts up that (incorrect and misleading) message (since the problem is not with the chunk, per se, but the subchunk). Two files of Dave's that failed in dbPowerAmp work correctly there when I add 16 to the data subchunk size.
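The manual fix (adding 16 to the data subchunk size) can be scripted so nobody has to hand-edit hex. This is a hedged sketch, not SAW's or dbPowerAmp's logic: `extend_data_size_to_eof` is a hypothetical helper that assumes the canonical 44-byte header and assumes that everything after offset 44 really is sample data.

```python
import struct


def extend_data_size_to_eof(raw: bytes) -> bytes:
    """Return a copy whose data sub chunk size covers everything up to end of file.

    Assumes the canonical layout: 'data' ID at offset 36, its size at offset 40,
    samples from offset 44 onward, and no sub chunk after the samples.
    """
    if raw[0:4] != b"RIFF" or raw[8:12] != b"WAVE" or raw[36:40] != b"data":
        raise ValueError("not a canonical 44-byte-header WAV file")
    new_size = len(raw) - 44
    # Only the 4-byte size field changes; the RIFF size and samples are untouched.
    return raw[:40] + struct.pack("<I", new_size) + raw[44:]
```

It would be used on a copy, for example `Path("fixed.wav").write_bytes(extend_data_size_to_eof(Path("original.wav").read_bytes()))` with `pathlib.Path` imported; both file names are placeholders.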

I also checked four other files produced by Saw Studio - two more 64-bit and two 32-bit. They all have the same issue: data sub chunk size declared 16 bytes shorter than the samples it represents. So - I think I've found a bug. What do you think?

**John Ludlow** · 03-23-2021, 06:10 PM

I just checked a wav I made in 2009 using SAW Studio. It does not have that problem.

**Bob L** · 03-24-2021, 06:45 AM

I'll explore into it... although this is the first reported case of a problem and there have been many thousands of wav files created in SAWStudio64... so I am not sure.

Bob L

**Dave Labrecque** · 03-24-2021, 07:16 AM

> Originally Posted by Bob L
>
> I'll explore into it... although this is the first reported case of a problem and there have been many thousands of wav files created in SAWStudio64... so I am not sure.
>
> Bob L

Thanks, Bob. If you want me to run any tests with dBpoweramp, let me know.

**John Ludlow** · 03-24-2021, 12:34 PM

> Originally Posted by Bob L
>
> I'll explore into it... although this is the first reported case of a problem and there have been many thousands of wav files created in SAWStudio64... so I am not sure.
>
> Bob L

I understand. I was pretty dubious at first too. It took days until I was sure. There are very few people whose software is responsible for the creation of more wave files than you over the last few decades. It seemed very unlikely. But if the data subchunk count + the count of all the bytes used until that point (which is 44 unless there is an extra subchunk before it) isn't equal to the file length, and there isn't an additional subchunk following the samples, then the file must be out of spec, right?
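The check described above does not have to assume the data sub chunk starts at byte 44. A sketch that walks the sub chunks instead, so an extra one before 'data' is handled; `walk_chunks` and `bytes_after_data` are illustrative names, and the one-byte pad after odd-sized chunks follows the RIFF word-alignment rule.

```python
import struct


def walk_chunks(raw: bytes):
    """Yield (chunk_id, payload_offset, declared_size) for each sub chunk."""
    if raw[0:4] != b"RIFF" or raw[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12  # skip 'RIFF', the RIFF size, and 'WAVE'
    while pos + 8 <= len(raw):
        chunk_id, size = struct.unpack_from("<4sI", raw, pos)
        yield chunk_id, pos + 8, size
        pos += 8 + size + (size & 1)  # odd-sized chunks carry one pad byte


def bytes_after_data(raw: bytes) -> int:
    """File bytes remaining after the declared end of the data sub chunk.

    0 means the sizes tie. A positive value means something follows the
    declared data: either a trailing sub chunk or undeclared samples.
    """
    for chunk_id, offset, size in walk_chunks(raw):
        if chunk_id == b"data":
            return len(raw) - (offset + size + (size & 1))
    raise ValueError("no data sub chunk found")
```

For the file dissected earlier this would report 16. A positive result alone does not say whether the leftover bytes are samples or a legitimate trailing sub chunk; that still takes a look at the bytes themselves.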

I'm less positive that fixing this will fix Dave's dbPowerAmp issue - although fixing the data subchunk size did cause two of Dave's files, which previously blew dbPowerAmp up, to work correctly.

My theory is that almost all music programmers presume that there will be nothing following the actual samples - so they ignore the data subchunk size altogether and loop till either EOF - or else to chunk size. But, the dbPowerAmp programmer saw that the spec at least allowed trailing subchunks, even if no one ever does that, and wrote to cover them. And ironically, that's what is breaking. But, that's just a theory. I could be wrong.

Also, it's not just SAW64 - it's at least the recent version of SAW32 too. But - not the version of SAW32 I was using in 2009, if that helps.

**Dave Labrecque** · 03-24-2021, 01:52 PM

> Originally Posted by John Ludlow
>
> Also, it's not just SAW64 - it's at least the recent version of SAW32 too. But - not the version of SAW32 I was using in 2009, if that helps.

Are you sure about this bit, John? I think in my experience the only problem WAVs were the ones coming out of SS64. Haven't had any issues with the WAVs I've made with SS32. The issue arose at the exact time I made the switch to SS64, and SS32 WAVs I've tested since have remained issue-free with dBpoweramp.

**John Ludlow** · 03-24-2021, 04:25 PM

> Originally Posted by Dave Labrecque
>
> Are you sure about this bit, John? I think in my experience the only problem WAVs were the ones coming out of SS64. Haven't had any issues with the WAVs I've made with SS32. The issue arose at the exact time I made the switch to SS64, and SS32 WAVs I've tested since have remained issue-free with dBpoweramp.

I thought I was sure, but because you questioned it, I checked your two files again. The file "test SS32 output.wav" is 16 bytes short, as I thought. But, the file "Dog Bark SS32 output.wav" is not short - so I was wrong about it. I even went back and downloaded it again. Dog Bark SS32 output.wav was recorded at 48/16. test SS32 output.wav was recorded at 44.1/24. So, there is that difference (and it may be important).

Just to thwart Murphy, I went back and checked the SAW64 files too. Dog Bark SS64.wav is recorded at 48/16 and it's off by 16 bytes. test SS64 output.wav was recorded at 48/16 and it's off by 16 bytes.

And, your file VO Demo_Mix.wav is recorded at 44.1/16 and it's off by 16 bytes. I believe that was also a SS64 file.

My file "Pony Tail Girl Third Mix_Normalized.wav" is 44.1/16 and it's off by 16 bytes. It was recorded by SS64 - although at a higher fidelity and mixed down to that.

So that makes test SS32 output.wav the outlier. Are you sure that was recorded in SS32? If so, would you mind recording a couple more in SS32 at 44.1/24, and 48/16 and sending them to me so we have a little larger data set at those sample rates and sizes for SS32? If it turns out that the results are consistent, we should probably test all combinations of sample rates and sizes for both SS32 and SS64 since those variables might be important.
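Checking every combination by hand in a hex editor gets tedious, so a small survey script could tabulate rate, bit depth and leftover bytes for a whole folder. A minimal sketch: `summarize` and `survey` are hypothetical helpers, and each file is read fully into memory, which is fine for short test clips but not for long sessions.

```python
import struct
from pathlib import Path


def summarize(raw: bytes) -> tuple[int, int, int]:
    """Return (sample_rate, bits_per_sample, bytes_after_declared_data)."""
    if raw[0:4] != b"RIFF" or raw[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    sample_rate = bits = trailing = None
    pos = 12  # skip 'RIFF', the RIFF size, and 'WAVE'
    while pos + 8 <= len(raw):
        chunk_id, size = struct.unpack_from("<4sI", raw, pos)
        if chunk_id == b"fmt ":
            # audio format, channels, sample rate, byte rate, block align, bits
            _, _, sample_rate, _, _, bits = struct.unpack_from("<HHIIHH", raw, pos + 8)
        elif chunk_id == b"data":
            trailing = len(raw) - (pos + 8 + size + (size & 1))
            break
        pos += 8 + size + (size & 1)  # odd-sized chunks carry one pad byte
    if sample_rate is None or bits is None or trailing is None:
        raise ValueError("missing fmt or data sub chunk")
    return sample_rate, bits, trailing


def survey(folder: str) -> None:
    """Print one line per .wav file in the folder, e.g. '48/16  trailing: 16'."""
    for path in sorted(Path(folder).glob("*.wav")):
        rate, bits, trailing = summarize(path.read_bytes())
        print(f"{path.name:45} {rate / 1000:g}/{bits:<3} trailing: {trailing}")
```

Calling `survey("wav_samples")` (the folder name is a placeholder) on a folder holding the SS32 and SS64 test files would give one row per file, which makes any pattern by rate or bit depth easier to spot.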

**jmh** · 03-25-2021, 05:07 AM

I'm not breaking out the hex editor as the tools I use (lame, sox, oggenc & opusenc) parse saw wav without issue. Then again, I don't tend to read the gibberish when the file successfully gets parsed. I came across this:

https://en.wikipedia.org/wiki/Resour...ement_problems

...Dave's encoder could just as easily have some confusion about the format and expect a mal-placed header. And remember even though they shouldn't be, specifications are moving targets, riff and wav may have changed somewhat to formalize header positioning, address shortcomings, or accommodate formats that didn't exist back in the day.

**John Ludlow** · 03-25-2021, 08:33 AM

> Originally Posted by jmh
>
> I'm not breaking out the hex editor as the tools I use (lame, sox, oggenc & opusenc) parse saw wav without issue. Then again, I don't tend to read the gibberish when the file successfully gets parsed. I came across this:
>
> https://en.wikipedia.org/wiki/Resour...ement_problems
>
> ...Dave's encoder could just as easily have some confusion about the format and expect a mal-placed header. And remember even though they shouldn't be, specifications are moving targets, riff and wav may have changed somewhat to formalize header positioning, address shortcomings, or accommodate formats that didn't exist back in the day.

That was an interesting read for me because it's only the second document regarding wav format that I've read. Of special interest was that, in this one, they seem to indicate that the only other potential data addition is a LIST chunk, whereas in the other document it left open the potential of additional, previously undefined, subchunks. I came across a LIST subchunk in the commercial song I ripped off a CD - and I say 'subchunk' because it was positioned within the wav chunk and used its chunk size. Their concern seemed to be that in the event that the LIST chunk was added later, it would throw off previously declared lengths in other chunks (which is true). But, none of Saw Studio's versions even use a LIST chunk (subchunk) so I don't think that's a potential concern in this case. They also mention the need for padding. All of the files I looked at padded. That's not a big deal, I don't think. They are fixed-length fields. If you don't use the entire field, you just fill the extra position with a null (00) or a space (20) depending upon the expected data type - and that has been done by Hoyle in every case. So, for instance, the format subchunk id is declared as 4 bytes but only uses 3, so it is written as 'fmt ', with a hex 20 (space) added to the end.

Still, they leave open the possibility that the spec does not allow the addition of a subchunk after the data subchunk. And, in my post-mortem theory, that's what I am presuming is screwing up dbPowerAmp since the programmer was trying to account for the potential of a trailing subchunk. So maybe that's wrong. The fact that when I increased the data subchunk size to reflect the actual length, that caused two of Dave's files to work with dbPowerAmp, when they didn't before, seemed to support the theory - but it is difficult to determine motive by result. Maybe dbPowerAmp uses the data subchunk size to decide when to stop looping for some other (unknown) reason instead.

But, beyond Dave's issue, the larger one is that SS's data subchunk size, in at least some cases, is being declared to be smaller than it actually is. It may or may not have something to do with Dave's problem with dbPowerAmp, but it is an issue nonetheless.

**Bob L** · 03-25-2021, 06:32 PM

Windows will pad files on disk... perhaps this is all that is happening... note that disk size and filesize are reported as two different values on every disk file in properties.
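The padding idea can be tested with a little arithmetic. A sketch under stated assumptions: `size_on_disk` uses a simplified model in which allocation is rounded up to whole clusters, and the 4096-byte cluster size is a guess at a typical NTFS volume, not a value taken from anyone's machine. The "Size" figure in Properties, like `os.path.getsize`, is the logical length, and reading a file returns only those bytes.

```python
import os


def size_on_disk(logical_size: int, cluster_size: int = 4096) -> int:
    """Round a logical file size up to whole clusters (simplified model)."""
    clusters = -(-logical_size // cluster_size)  # ceiling division
    return clusters * cluster_size


def slack_bytes(path: str, cluster_size: int = 4096) -> int:
    """Allocated-but-unused bytes after the end of the file's logical content."""
    logical = os.path.getsize(path)  # logical length, not the allocation
    return size_on_disk(logical, cluster_size) - logical


LOGICAL_SIZE = 230_304  # the file length quoted earlier in the thread
print(size_on_disk(LOGICAL_SIZE))                 # allocation under this model
print(size_on_disk(LOGICAL_SIZE) - LOGICAL_SIZE)  # slack under this model
```

Under these assumptions the slack would be a few thousand bytes rather than 16, and it would sit outside the logical length that a hex editor reads. A different cluster size changes the slack figure but not that second point.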

I have not changed the wav file creation routines since the beginning 25 years ago. So far... I can find nothing wrong in the code.

Bob L
