Audio buffering and resampling rehash

Archived bsnes development news, feature requests and bug reports. Forum is now located at http://board.byuu.org/
blargg
Regular
Posts: 327
Joined: Thu Jun 30, 2005 1:54 pm
Location: USA

Post by blargg »

byuu wrote:I've already tried to use vsync. I can do that just fine, but then I get differing numbers of samples each frame. The difference is so severe that resampling each batch results in audible pitch differences. If I buffer out the differences, that just results in bigger pitch changes, just less frequently.
Sorry to rehash this, but I don't understand. OK, you get a different number of samples each frame. All that's needed is a stable average, like 32040 samples per second. A buffer can even out the flow to a steady 534 per frame (32040 / 60). Even in an extreme case where you get 32000 samples in one frame, then none for 58 frames, then 40 on the 60th frame, the buffer can still hand out 534 every frame.
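A minimal sketch of that kind of buffer, assuming 60 frames per second and using placeholder zero samples. The names `SampleFifo` and `simulate` are made up for illustration and are not from bsnes:

```python
from collections import deque

PER_FRAME = 32040 // 60  # 534 samples handed out per frame


class SampleFifo:
    """Queue that absorbs uneven writes and hands out fixed-size reads."""

    def __init__(self):
        self._q = deque()

    def write(self, samples):
        self._q.extend(samples)

    def available(self):
        return len(self._q)

    def read(self, count):
        """Return exactly `count` samples, or None if too few are queued."""
        if len(self._q) < count:
            return None
        return [self._q.popleft() for _ in range(count)]


def simulate(arrivals, per_frame=PER_FRAME, prefill=0):
    """Write arrivals[i] samples in frame i, then read per_frame samples.

    Returns (number of frames that ran dry, samples left in the queue).
    """
    fifo = SampleFifo()
    fifo.write([0] * prefill)
    underruns = 0
    for n in arrivals:
        fifo.write([0] * n)
        if fifo.read(per_frame) is None:
            underruns += 1
    return underruns, fifo.available()


# The extreme case from the post: one huge burst, 58 empty frames, then 40.
bursty = [32000] + [0] * 58 + [40]
result = simulate(bursty)  # (0, 0): no frame ran dry, nothing left over
```

The reader side never sees the burst; it only sees 534 samples per frame, at the cost of the latency of whatever is sitting in the queue.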

The average number of samples waiting in the buffer determines latency. The number needed depends on how much the number of samples per frame varies. If you get exactly 534 samples every frame, then the buffer doesn't need to have any extra samples after each frame. If you sometimes get 500 samples for one frame and 568 samples on the next, then the buffer must have 34 extra samples on average. Since the buffer size needed depends on other factors too, it's usually best to allow it to be adjusted by the user, or automatically increase it if it's running empty too often. To increase it, you either run the emulator faster for a moment, or stop reading samples from it for a frame or two.
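As a rough illustration of how the reserve relates to the variation, the sketch below computes the smallest head start that keeps a fixed per-frame read from running dry for a given arrival pattern. `min_reserve` is a hypothetical helper, and a real emulator would not know the arrival pattern in advance, which is why a user setting or automatic growth is suggested above:

```python
def min_reserve(arrivals, per_frame):
    """Smallest number of samples that must already be queued so that
    reading per_frame samples after each arrival never runs dry."""
    level = 0   # queue level relative to the starting level
    worst = 0   # deepest dip below the starting level
    for n in arrivals:
        level += n - per_frame
        worst = min(worst, level)
    return -worst


steady = [534] * 60        # exactly 534 every frame
uneven = [500, 568] * 30   # alternating short and long frames

steady_reserve = min_reserve(steady, 534)   # 0
uneven_reserve = min_reserve(uneven, 534)   # 34
```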

That covers getting a consistent number of samples every frame. Then, you want to resample to a different rate. That works independently of the above buffering. Instead of feeding the 534 samples directly to the sound driver with its rate set at 32040 Hz, you'd feed them to the resampler, which then resamples as much as possible and feeds it to the sound card.

The resampler might not be able to resample as much as requested, due to each output sample needing more than one input sample (even when resampling to a higher rate). For example, if you were resampling to 2 times the input rate using linear interpolation, you wouldn't be able to generate 10 output samples when given 5 input samples, because the last output sample would depend on the as yet unknown 6th input sample. The solution is for the resampler to generate as many output samples as it can, then save the rest of the input for next time (or tell the caller how many samples it's completely done with, so the caller can simply keep the last sample in its buffer until next time).

1-2-3-4-5 input samples
123456789 output samples

5-6-7-8-9-0 input samples for next call that supplies 5 new samples
-0123456789 output samples

So after 10 input samples, you have 19 output samples, and the last input sample saved until next time.
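Here is a minimal sketch of a resampler that behaves this way, assuming plain linear interpolation and an exact rational ratio. The class name `LinearResampler` and its fields are invented for this example:

```python
from fractions import Fraction


class LinearResampler:
    """Streaming linear-interpolation resampler with a fixed ratio."""

    def __init__(self, ratio):
        self.step = 1 / Fraction(ratio)  # input samples advanced per output sample
        self.pos = Fraction(0)           # read position relative to pending[0]
        self.pending = []                # input saved from earlier calls

    def process(self, samples):
        """Generate as many output samples as the input so far allows."""
        buf = self.pending + list(samples)
        out = []
        while True:
            i = int(self.pos)            # pos is never negative, so this floors
            frac = self.pos - i
            if frac == 0:
                if i >= len(buf):        # sample i has not arrived yet
                    break
                out.append(float(buf[i]))
            else:
                if i + 1 >= len(buf):    # needs the next, still unknown, sample
                    break
                out.append(float(buf[i] + (buf[i + 1] - buf[i]) * frac))
            self.pos += self.step
        # Drop input that no future output can depend on; keep the rest.
        done = min(int(self.pos), len(buf))
        self.pending = buf[done:]
        self.pos -= done
        return out


r = LinearResampler(2)
first = r.process([1, 2, 3, 4, 5])     # 9 outputs: 1.0, 1.5, ..., 5.0
second = r.process([6, 7, 8, 9, 10])   # 10 outputs: 5.5, 6.0, ..., 10.0
```

After the first call the resampler holds on to input sample 5, so the second call can produce 5.5 as soon as sample 6 arrives. The step never changes between calls.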

The key here is that the resampler's ratio is not affected by how many samples it's resampling in a given call. A flawed approach would be to calculate how many output samples could be generated from a given number of input samples, round that to the nearest integer, then adjust the resample ratio for that call based on this rounded value. In the example above, you would adjust the ratio from 2 to slightly less, so that the 10th output sample doesn't need anything beyond the 5th input sample. This means that the first and last output samples match the first and last input samples, which is wrong.

It gets worse if the ratio is something like 1.5. For 5 input samples, 7.5 should be output. If you rounded that to 7, the ratio would become 1.4. If later you had 10 input samples to generate 15 output samples, the ratio would be 1.5, but you'd still have the issue with the sampling points being slightly off.
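The difference can be sketched by counting samples only, without doing any interpolation. Both function names are hypothetical. The flawed version rounds down, as in the 7.5 to 7 example, and the fixed version assumes linear interpolation, where output k sits at input position k / ratio and can only be produced once that position is covered:

```python
import math
from fractions import Fraction


def flawed_ratios(chunk_sizes, ratio):
    """Effective ratio of each call when the output count is rounded
    per call and the ratio is then bent to fit that count."""
    ratio = Fraction(ratio)
    result = []
    for n_in in chunk_sizes:
        n_out = math.floor(n_in * ratio)   # 7.5 becomes 7
        result.append(Fraction(n_out, n_in))
    return result


def fixed_ratio_counts(chunk_sizes, ratio):
    """Outputs per call when the ratio never changes and leftover
    input is carried over to the next call (chunks must be non-empty)."""
    ratio = Fraction(ratio)
    counts, total_in, total_out = [], 0, 0
    for n_in in chunk_sizes:
        total_in += n_in
        # Output k is computable once k / ratio <= total_in - 1.
        can_do = math.floor((total_in - 1) * ratio) + 1
        counts.append(can_do - total_out)
        total_out = can_do
    return counts


bent = flawed_ratios([5, 10], Fraction(3, 2))            # ratios 7/5 and 3/2
steady = fixed_ratio_counts([5, 5, 5, 5], Fraction(3, 2))  # [7, 7, 8, 7]
```

With the fixed ratio the per-call output count wobbles between 7 and 8, but the spacing of the sampling points stays constant. With the flawed approach the count looks tidy while the ratio itself jumps between 1.4 and 1.5, which is what would be heard as a pitch change.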

I can supply test code that does the buffering and resampling, so we can find out whether this is the problem, or something else.
byuu

Post by byuu »

If you download the bsnes thread archive, found at one of the links in this post: http://board.zsnes.com/phpBB2/viewtopic.php?t=11247

... it has about 30-40 pages worth of discussion on my attempts to implement what everyone keeps saying to do.

To be honest, I no longer remember the details of my last dozen attempts well enough to point out the problems I encountered. Nor do I really understand much of what you're saying.

I appreciate your effort to try and help, but explaining the process isn't helping me implement it at all, and for that I'm very sorry. I don't have the energy to go at this again right now; I'm completely burned out from my last several failed attempts. I'd rather just work on other things that I can actually accomplish.

One of those is a fixed resampler, from 32khz to Nkhz. I don't know why that discussion turned to the video sync issue again.
augnober
Rookie
Posts: 15
Joined: Fri Apr 18, 2008 5:29 am

Post by augnober »

byuu wrote:One of those is a fixed resampler, from 32khz to Nkhz. I don't know why that discussion turned to the video sync issue again.
In my understanding, the resampling would/should always be from 32040Hz to NkHz (this never changes - since the source stream should be taken as 32040Hz, and NkHz is the objective), and this is not related to whether or not the number of samples that you process at a time is fixed. But I could be confused by something specific to SNES emulation or how the emulator works.

I'm not sure I understand where variable vs fixed is an issue either. If processing a fixed number of samples at a time, then in my naive attempt to program it, I suspect I would regularly process whatever number of those batches is necessary to reduce the resampling queue below some arbitrary size (and of course with the additional constraint of not overflowing the outgoing buffer -- but perhaps your resampling queue would be purposefully small enough for this to not be an issue). I would actually consider this to be a somewhat variable approach because the number of batches processed would be variable (and I should say, could be 0 the vast majority of the time). I would be unlikely to want to actually resample sample batches of variable size, out of fear that I may at some time (perhaps even the first time - which would make me doubt my own code) use a resampling algorithm that does not behave the same when working with differently-sized batches. I'd only try the variable thing after getting the fixed-sized batch resampling working. (disclaimer: I wouldn't consider myself a competent audio programmer since I've only ever done the simplest of playing and recording code)
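A rough sketch of that fixed-batch loop, under several assumptions: `drain_in_batches`, `low_water`, `out_room` and `max_out_per_batch` are names invented here, and the resampler is passed in as a plain function that takes one batch and returns its output samples:

```python
from collections import deque


def drain_in_batches(queue, batch_size, low_water, out_room,
                     max_out_per_batch, resample_batch):
    """Resample whole fixed-size batches until the queue falls below
    low_water, no full batch is left, or the output could overflow.

    Returns the output samples produced, which is often an empty list.
    """
    produced = []
    while (len(queue) >= max(low_water, batch_size)
           and len(produced) + max_out_per_batch <= out_room):
        batch = [queue.popleft() for _ in range(batch_size)]
        produced.extend(resample_batch(batch))
    return produced


# Stand-in "resampler" that passes samples through unchanged.
pending = deque(range(10))
out = drain_in_batches(pending, batch_size=4, low_water=3, out_room=100,
                       max_out_per_batch=4, resample_batch=list)
# Two batches of 4 are processed; 2 samples stay queued for next time.
```

The batch size is fixed, but the number of batches per call is not, which is the "somewhat variable" part described above.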

So.. I think I agree with starting with the fixed approach.. but since it's not really "fixed" in every respect, I figured it was worth describing in more detail.
tetsuo55
Regular
Posts: 307
Joined: Sat Mar 04, 2006 3:17 pm

Post by tetsuo55 »

blargg wrote:snip
Sounds perfect to me
henke37
Lurker
Posts: 152
Joined: Tue Apr 10, 2007 4:30 pm
Location: Sweden

Post by henke37 »

We need illustrations here. Bring out...
MS PAINT!