bsnes vsync development thread

Archived bsnes development news, feature requests and bug reports. Forum is now located at http://board.byuu.org/
Dullaron
Lurker
Posts: 199
Joined: Mon Mar 10, 2008 11:36 pm

Post by Dullaron »

I will test it later. I'm so sleepy right now.

Which NTSC games do I need to test?
Window Vista Home Premium 32-bit / Intel Core 2 Quad Q6600 2.40Ghz / 3.00 GB RAM / Nvidia GeForce 8500 GT
DancemasterGlenn
Veteran
Posts: 637
Joined: Sat Apr 21, 2007 8:05 pm

Post by DancemasterGlenn »

FirebrandX wrote:Thanks for the advice.

Sorry that bothered you so much.
No biggie. I didn't want anyone else to flame you and be a dick about it. Thanks again for helping byuu out with testing this.
I bring the trouble.
tetsuo55
Regular
Posts: 307
Joined: Sat Mar 04, 2006 3:17 pm

Post by tetsuo55 »

I was just thinking, if the "crackle" waveform is always the same or similar, maybe bsnes could mute/clip the crackles and we wouldn't have to worry about latency settings?

Not sure if it's even possible, just thinking outside the box
Dullaron
Lurker
Posts: 199
Joined: Mon Mar 10, 2008 11:36 pm

Post by Dullaron »

Yep, I'm getting static sounds here on this game. The rest of the sounds are fine though.

Nice to have that reporting box that you added, it helps. Did pSX Author help you with this part? He has that in pSX as well. First time I've seen this in bsnes. Please keep it showing in bsnes. :D

Whoops, I forgot to add this below.

Image
FirebrandX
Trooper
Posts: 376
Joined: Tue Apr 19, 2005 11:08 pm
Location: DFW area, TX USA

Post by FirebrandX »

Byuu, I noticed a small scrolling glitch on the bottom of the screen. Were you aware of this? Just wanted to make sure it's not a slight error in the code or something.
FitzRoy
Veteran
Posts: 861
Joined: Wed Aug 04, 2004 5:43 pm
Location: Sloop

Post by FitzRoy »

Why is Dullaron's audio update 1ms while mine says either 15 or 16ms?
I noticed a small scrolling glitch on the bottom of the screen.
Can you be more specific? Some games show garbage on the final scanline, it's probably normal.
FirebrandX
Trooper
Posts: 376
Joined: Tue Apr 19, 2005 11:08 pm
Location: DFW area, TX USA

Post by FirebrandX »

No, it's not normal. It's on every game, and it wasn't there in any of the other versions of bsnes. Basically it looks like one or more of the scanlines near the bottom "lag" behind during scrolling.

Also, my updates always show "1ms" as well. It doesn't matter as the important thing is looking for the underflow warnings. I've been pretty much set on my end with flawless audio. I've checked on more intensive games like CT and the audio is fine.
FitzRoy
Veteran
Posts: 861
Joined: Wed Aug 04, 2004 5:43 pm
Location: Sloop

Post by FitzRoy »

FirebrandX wrote:No, it's not normal. It's on every game, and it wasn't there in any of the other versions of bsnes. Basically it looks like one or more of the scanlines near the bottom "lag" behind during scrolling.
Can't reproduce.
Also, my updates always show "1ms" as well. It doesn't matter as the important thing is looking for the underflow warnings. I've been pretty much set on my end with flawless audio. I've checked on more intensive games like CT and the audio is fine.
Also can't reproduce :cry: Kinda wish I had kept my Egosys Juli@, that card had badass drivers where you could set the sample latency as low as 64. Back in the day, I'd do that and get triple buffering working while everyone else complained of crackling. And they were only 200kb. Now I'm using onboard IDT HD audio, its drivers are 20mb and you can't do shit.
byuu

Post by byuu »

Let's try another approach.

http://byuu.cinnamonpirate.com/temp/vsyncdemo.zip

This is a simple Win32 application that uses Direct3D + DirectSound. It draws a large white box scrolling around the screen. Kind of like Pong but with no controllers or physics.
It's also a nice example of a minimal nall+hiro+ruby application ;)

It tries to set the biggest window your current desktop resolution will support.

From here, it generates a sine wave for audio output (think of it as a constant tone). It generates 32000hz / 60fps worth of samples per frame rendered.

It is essentially as close as possible to the model bsnes uses to output video and audio, but without the emulator itself getting in the way of timing issues.
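For anyone reading without the source handy, here is a rough sketch of the per-frame tone generation described above. It is an illustration, not the demo's actual code: `ToneGenerator`, `fill_frame`, the 440hz pitch and the amplitude are all made up for the example, and only the 32000hz / 60fps figures come from the post. The remainder carrying is also an assumption; the demo may simply round.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Only the 32000hz rate and 60fps come from the post; the rest is assumed.
constexpr unsigned SampleRate = 32000;
constexpr unsigned FrameRate  = 60;
constexpr double   ToneHz     = 440.0;   // hypothetical pitch
constexpr double   Amplitude  = 8000.0;  // hypothetical volume, well inside int16_t range

// 32000 / 60 = 533.33..., so a fixed 533 samples per frame would slowly
// starve the sound card. Carry the remainder so every third frame gets 534.
struct ToneGenerator {
  unsigned remainder = 0;  // leftover sample credit carried between frames
  uint64_t position  = 0;  // total samples generated, keeps the phase continuous

  std::vector<int16_t> fill_frame() {
    remainder += SampleRate;
    unsigned count = remainder / FrameRate;
    remainder %= FrameRate;
    assert(count == SampleRate / FrameRate || count == SampleRate / FrameRate + 1);

    const double pi = std::acos(-1.0);
    std::vector<int16_t> samples(count);
    for(unsigned n = 0; n < count; n++) {
      double t = double(position++) / SampleRate;
      samples[n] = int16_t(std::lround(Amplitude * std::sin(2.0 * pi * ToneHz * t)));
    }
    return samples;
  }
};
```

Calling `fill_frame()` once per rendered frame yields 533, 533, 534 samples in rotation, which adds up to exactly 32000 samples for every 60 frames.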

The included EXE enables both video and audio synchronization, so they fight for resources and audio eventually crackles. But it's a one-boolean change at the bottom of sync.cpp to toggle synchronization for one module or the other.

Without video or audio sync, you'll get several hundred FPS (1,122 FPS here). Which is good: you don't need a powerful computer like you do with bsnes.

Without video sync, the left and right edges of the box will tear very badly. Without audio synchronization, the box is perfect and audio crackles really badly.

The idea is that if we can get this demo working perfectly, the same changes [i]should[/i] fix bsnes as well.

Given that the emulator is no longer involved, hopefully more people will be able to take a look at this and find out what's wrong. From there, we can add in more elements: a 32040->32000 resampler, 60.09fps video generation, etc.

Ideally, we want to avoid needing to know the vertical blanking status. This is mostly because there's not a way to get this on Linux. However, I will accept an approach that requires it, if necessary.

I've found through my own testing that it helps latency out tremendously. I'm not going to hold back the Windows port on this issue because Xorg can't implement the most trivial of functions into its userspace library.
byuu

Post by byuu »

Alright, some testing with the standalone, and I seem to have the gist of the audio problem. It seems to be related to the audio output wait loop.

Before, I was waiting for the current audio position to update to a new block, and then I'd write the audio data two blocks after. Without vsync, the only overhead was emulating the base system. So long as that ran faster than audio playback, audio was always smooth. But now I have a chance of video sync deadlocking the app for up to 17ms to contend with. For whatever unknown reason, I noticed that even with 20+ms rings for audio data, sometimes more than two of them would pass in one vsync wait. That didn't seem right, so I upped the latency to 50ms per block. Same thing. What the fuck?

I realized that time.h's clock() positively sucks on Windows, so I replaced it with a QueryPerformanceCounter() wrapper (2.048GHz frequency counter) that divides down to nanoseconds. Much better.
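The "divide down to nanoseconds" step has one trap worth spelling out. The sketch below is not the actual wrapper; `ticks_to_ns` is a hypothetical helper that only does the arithmetic, so it compiles anywhere. On Windows the two inputs would come from QueryPerformanceCounter() and QueryPerformanceFrequency().

```cpp
#include <cassert>
#include <cstdint>

// Convert a raw performance-counter reading to nanoseconds.
// The naive ticks * 1000000000 / frequency overflows 64 bits after roughly
// nine seconds of ticks at a ~2GHz counter, so split the value into whole
// seconds plus a sub-second remainder before scaling.
// (The remainder term is safe for any frequency below about 18GHz.)
uint64_t ticks_to_ns(uint64_t ticks, uint64_t frequency) {
  assert(frequency > 0);
  uint64_t seconds   = ticks / frequency;
  uint64_t remainder = ticks % frequency;
  return seconds * 1000000000ull + remainder * 1000000000ull / frequency;
}
```

Also note that storing the result in a 32-bit `unsigned` wraps about every 4.3 seconds. Differences such as `end - start` still come out right, as long as the measured interval is shorter than that.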

I updated my function to avoid the problem when two rings worth of audio pass at a time, and it's still erroring out somewhere:

Code: Select all

//note: data.ring_size = data.latency * sizeof(uint32_t) (i.e. 4, the size of one sample in bytes)
  enum { Rings = 4 };

  void output(uint16_t l_sample, uint16_t r_sample) {
    data.buffer[data.buffer_pos++] = (l_sample << 0) + (r_sample << 16);
    if(data.buffer_pos < data.latency) return;

    static unsigned ostart, oend;
    oend = hpc_ns();

    static unsigned rdring = 0, wrring = Rings - 1;

    DWORD ring_pos, pos, size;

    unsigned start = hpc_ns();
    while(true) {
      dsb_b->GetCurrentPosition(&pos, 0);
      rdring = pos / data.ring_size;

      unsigned distance = (Rings + wrring - rdring) % Rings;
//      if(distance == 0) { printf("* underflow %dns\n", oend - ostart); wrring += Rings - 1; wrring %= Rings; break; }
      if(distance != 0 && distance != 1) break;
      //Sleep(1);
    }
    unsigned end = hpc_ns();
    printf("%uns\n", end-start);

    wrring = (wrring + 1) % Rings;

    void *output;
    if(dsb_b->Lock(wrring * data.ring_size, data.ring_size,
      &output, &size, 0, 0, 0) == DS_OK) {
      memcpy(output, data.buffer, data.ring_size);
      dsb_b->Unlock(output, size, 0, 0);
    }

    data.buffer_pos = 0;

    ostart = hpc_ns();
  }
The part to be concerned about is the printing of end-start nanoseconds to the console. Every time the audio breaks up, that value is always (obviously) greater than the size of one audio ring.

Even at 50ms, I'm getting values of ~52ms and such. It seems to go up with latency length increases. So what I'm guessing is somehow the loop's logic is screwy and is sleeping longer than one full block in some cases.

To be sure it wasn't dsb_b->GetCurrentPosition() taking up the time, I logged the time of that alone. It never exceeds ~1ms. It's also obviously not the high performance counter queries. Disabling that still causes audio to crackle.

Hoping someone notices the error. We're really close, and I think figuring this out will get us there!!

EDIT 1:

Okay, it seems that the distance is slowly dropping. I set Rings to 8, and I watch it run at 7 for a while, then down to 6, 5, 4, 3, 2, crackle at 1. 1 is where it'll end up running past two buffers (1 and 0) to get back to 7.

Hmmm ...
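One possible reading of that 7, 6, 5 ... 1 countdown, offered as a toy model rather than a diagnosis: it is the pattern you would see if samples were being fed to the card slightly slower than it plays them. The sketch reuses the distance formula from output(); `first_tick_at_distance` and all its numbers are invented for the illustration.

```cpp
#include <cassert>

// Same formula as in output(): how many rings the write cursor leads the
// play cursor by, modulo the ring count.
unsigned ring_distance(unsigned wrring, unsigned rdring, unsigned rings) {
  assert(rings > 0 && wrring < rings && rdring < rings);
  return (rings + wrring - rdring) % rings;
}

// Toy model: every tick the card plays exactly one ring, but the program
// only produces (ring_samples - deficit) samples. Returns the first tick at
// which the distance equals `target`, or -1 if it never does in max_ticks.
long first_tick_at_distance(unsigned rings, unsigned ring_samples,
                            unsigned deficit, unsigned target, long max_ticks) {
  unsigned long long played  = 0;
  unsigned long long written = (unsigned long long)(rings - 1) * ring_samples;  // start with a full lead
  for(long tick = 1; tick <= max_ticks; tick++) {
    played  += ring_samples;
    written += ring_samples - deficit;
    unsigned rdring = unsigned((played  / ring_samples) % rings);
    unsigned wrring = unsigned((written / ring_samples) % rings);
    if(ring_distance(wrring, rdring, rings) == target) return tick;
  }
  return -1;
}
```

With 8 rings of 1000 samples and a shortfall of just 10 samples per ring (1%), the distance reaches 1 at tick 501. With no shortfall it never drops at all.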
AamirM
Regen Developer
Posts: 533
Joined: Sun Feb 17, 2008 8:01 am

Post by AamirM »

Hi,

Here is a slightly improved version (to me at least). It's only the core source and exe (without the nall and ruby sources). It:

1) Doesn't use vblank
2) Audio doesn't crackle
3) Can probably be done on Linux too (I don't know much about that, but I'm 75% sure it can be)
4) Slight one-line tearing happens every 5-7 seconds or so. But it's far less than without using vblank.
5) Can be used for PAL games as well

I noticed that the fourth point always happens when using vblank and when the box reaches the top of the screen. I didn't look at your audio updating function for any problem though. It's probably not what you want, but I hope it's of some help.

stay safe,

AamirM
byuu

Post by byuu »

Success! I have bsnes running all of the ToP + Estpolis intros with zero audio buffer underflows (crackling), 125ms latency (used to be 75ms, sadly; but the wait for vsync can add to overhead of emulating one frame, which can cause an audio buffer overflow.)

It pretty much becomes the preference of the user to use a slightly too low audio skew, which will cause a slight audio crackle every few minutes, or a slightly too high one, which will duplicate one video frame every few minutes. The latter is preferable to me, personally.

Best of all, I did not have to disable the S-CPU <> S-SMP desync optimization! The only thing I wasn't able to eliminate was the need to run at a scaler that is one less than the max your display can support.

I also have some ideas in the future to try dynamically adjusting the skew and watching as the buffer distance grows and shrinks, trying to pigeon hole it between two points. If successful, that will essentially mean truly perfect video and audio.
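To make the idea concrete, here is a minimal sketch of what such an adjustment could look like. Nothing like this exists in bsnes yet; `SkewController`, the marks and the step size are all hypothetical. The direction of the nudge follows the rule of thumb that a higher skew makes audio breakup less likely and a lower one duplicates fewer video frames.

```cpp
#include <cassert>

// Hypothetical controller: keep the buffer distance pigeonholed between
// low_mark and high_mark by nudging the SNES input frequency.
struct SkewController {
  unsigned frequency;      // current SNES input frequency, hz
  unsigned min_frequency;  // lowest value the controller may choose
  unsigned max_frequency;  // highest value the controller may choose
  unsigned low_mark;       // distance below this means underflow is near
  unsigned high_mark;      // distance above this means there is slack to spare
  unsigned step;           // hz changed per adjustment

  // Call once per audio block with the measured buffer distance.
  unsigned update(unsigned distance) {
    assert(min_frequency <= frequency && frequency <= max_frequency);
    if(distance < low_mark) {
      // buffer is draining: raise the skew, but stay within the allowed range
      if(frequency + step <= max_frequency) frequency += step;
    } else if(distance > high_mark) {
      // plenty of buffer: lower the skew so fewer frames get duplicated
      if(frequency >= min_frequency + step) frequency -= step;
    }
    return frequency;
  }
};
```

A real version would probably want some smoothing so it does not chase every single measurement, but the basic shape is just a clamp plus two thresholds.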

Do note that I intend to use 100% CPU time with vsync enabled. While we technically can get away with a small sleep, Windows has the annoying habit of sleeping much more than 1ms, which causes us to miss vsync entirely. That would mean lots more duplicated frames for no real benefit.

If you thank anyone for this, thank blargg for being unbelievably patient with me on this. Pretty much everything was his idea: create the linear scaler using a fixed input to output ratio, allow both video+audio to sync/block at the same time (I thought that guaranteed that both would fail miserably), create a stand-alone executable to help debug problems without the emulator (okay, I did think of this one, but I was being lazy) ...

I'm really thankful that he's so polite and helpful to me all of the time. Especially since I'm of so little help to him. He's really a great guy.

Oh, and I fixed the topic title ;)
Here is a slightly improved version (to me at least).
Ah, a very clever idea! Block the entire emulator to try and align the video to the monitor's refresh rate! I think the only big issue with that idea is that the closer you get to being exactly right, the more you'll notice a "line crawl" where the video is tearing, and it'll slowly raise either up or down the screen.

But it is a really neat idea. This would probably be good for video APIs that are unable to sync to vblank at all. I'll definitely keep this trick in mind, thank you very much!
FirebrandX
Trooper
Posts: 376
Joined: Tue Apr 19, 2005 11:08 pm
Location: DFW area, TX USA

Post by FirebrandX »

Congrats!! New wip coming soon by chance? :D
DancemasterGlenn
Veteran
Posts: 637
Joined: Sat Apr 21, 2007 8:05 pm

Post by DancemasterGlenn »

Congrats on the progress, byuu! Did this end up being a Windows-only fix? I finished reading your last post and wasn't quite clear if it came to that.

EDIT: Even if it is Windows-only, you may want to add the option of using a short sleep in the advanced options panel. People could always use it if they want to take up less cpu resources, and if they start seeing tearing then they'll know the culprit option anyway.
Last edited by DancemasterGlenn on Sun Aug 17, 2008 5:33 pm, edited 1 time in total.
Verdauga Greeneyes
Regular
Posts: 347
Joined: Tue Mar 07, 2006 10:32 am
Location: The Netherlands

Post by Verdauga Greeneyes »

Great job, byuu! I'm way too far behind you guys to help on this, considering I'm still working on my first directsound program :P
byuu wrote:Do note that I intend to use 100% CPU time with vsync enabled. While we technically can get away with a small sleep, Windows has the annoying habit of sleeping much more than 1ms, which causes us to miss vsync entirely. That would mean lots more duplicated frames for no real benefit.
When I was working on VSync stuff I found that using timeBeginPeriod to set user mode timing to 1ms, then sleeping for (1000 / refreshRate - 1) ms (rounded down) before starting the wait loop was enough to get me to always catch VBlank. Of course, I was running this loop in its own thread, so I don't know how it would perform on a single-core system (but then, with bsnes' system requirements..) and my test program was hardly as demanding as bsnes.
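Roughly, the sleep-then-spin approach looks like the sketch below. It is a portable approximation, not the original test program: `presleep_ms` and `wait_for_vblank` are hypothetical names, std::this_thread::sleep_for stands in for Sleep(), and the timeBeginPeriod(1) call mentioned above is Windows-only and is not made here.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// (1000 / refreshRate - 1) ms, rounded down, and never negative.
unsigned presleep_ms(unsigned refresh_rate) {
  assert(refresh_rate > 0);
  unsigned frame_ms = 1000 / refresh_rate;
  return frame_ms > 0 ? frame_ms - 1 : 0;
}

// Sleep through most of the frame, then busy-wait on the caller's vblank
// test. in_vblank is whatever query the video API offers, wrapped in a
// callable that returns true while the display is in vertical blank.
template<typename InVBlank>
void wait_for_vblank(unsigned refresh_rate, InVBlank in_vblank) {
  std::this_thread::sleep_for(std::chrono::milliseconds(presleep_ms(refresh_rate)));
  while(!in_vblank()) {}
}
```

At 60hz this sleeps for 15ms and at 50hz for 19ms, leaving the spin loop only the last stretch of each frame. Whether the sleep really returns on time depends on the system timer resolution, which is the point of raising it first.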
byuu

Post by byuu »

14 hours of straight programming brings you this:

Code: Select all

http://byuu.cinnamonpirate.com/temp/bsnes_v034_wip04.zip
Windows binary and source included, binary does not have ZIP+JMA support enabled, as it's a WIP release.

Yes, vsync works both on Windows and Linux. In fact, it actually seems to work better on Linux, in that it requires lower audio latencies and has no troubles at full 5x scale on my 1920x1200 monitor.

Overview of new features:

Most importantly, I've added a new menu group to the settings menu group, "Synchronize", containing "Synchronize Video" and "Synchronize Audio" checkboxes. You can have neither, one or both checked. Up to you. That made the "Uncapped" speed setting redundant, so that was removed.

Next, there's a new audio configuration panel with lots of new goodies.

Volume lets you scale audio from 10% to 200%. Note that going over 100% will obviously cause aliasing. It's a much better idea to turn up your speakers first. But who knows, it could come in handy. On one machine with OSS4, I couldn't adjust volume in Audacious, and it always bothered me that it was so much louder than bsnes, so I saw no reason to cap the volume to 100% here.

Latency lets you control the number of milliseconds between adding data to the sound buffer and it being played. Note that this is not the absolute latency. Any sound servers and resamplers will obviously add to this. It increments in steps of 5ms, because I don't want people wasting their time trying to get it absolutely perfect. 5ms is a small enough increment that no human being will notice. I also have to re-create all the buffers and/or device itself when that changes, so I want to keep it from changing too frequently. Not that there's a memory / resource leak, but just in case.
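As a rough guide to what those numbers mean in buffer terms, here is the arithmetic as a small sketch. The helper names are made up, and whether bsnes applies the latency to the whole buffer or to each ring is not something this example settles. The 4 bytes per sample matches the one-uint32_t-per-stereo-sample layout from the DirectSound code posted earlier.

```cpp
#include <cassert>
#include <cstdint>

// Snap a requested latency to the slider's 5ms steps (nearest step).
unsigned snap_latency_ms(unsigned requested_ms) {
  return (requested_ms + 2) / 5 * 5;
}

// Number of stereo samples that cover latency_ms at frequency_hz.
unsigned latency_samples(unsigned latency_ms, unsigned frequency_hz) {
  assert(frequency_hz > 0);
  return unsigned(uint64_t(latency_ms) * frequency_hz / 1000);
}

// Same thing in bytes, at one uint32_t (two 16-bit channels) per sample.
unsigned latency_bytes(unsigned latency_ms, unsigned frequency_hz) {
  return latency_samples(latency_ms, frequency_hz) * unsigned(sizeof(uint32_t));
}
```

For example, 120ms at 48000hz is 5760 samples, or 23040 bytes, and a single 5ms step at that rate is 240 samples.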

PC output frequency lets you control the master frequency for the sound card output. You can set this to 22050hz (not a good idea, loses precision, there as a last resort), 32000hz (for purists), 44100hz (for most cards), 48000hz (for higher end cards -- set as default because it's a nicer multiple of 32000 than 44100 is) and, yes, 96000hz. And I'm sure all the audiophiles will remark how much better it sounds, right?

Believe it or not, there's actually some value to higher frequencies for the vsync. Higher rates lower the rounding errors with interpolation and such, so you can use lower SNES input rates. And speaking of which ...

SNES input frequency is what the base SNES input is skewed to. The basic idea is that you want to get the value as low as possible without sound crackling. The lower it is, the fewer video frames are duplicated and the less jerky the video is. The higher it is, the less likely an audio breakup is.

Once again, Linux seems to come out on top here. Because of its non-ring buffer approach to audio, both ALSA and OpenAL can insert blank samples in a way that DirectSound simply cannot. Whatever it does to BS underflows, it works really well, because you can barely even notice it.

The default is a tad on the dangerous side. If anything, you may need to increase it.

Get the right values for everything, and you can easily play games and never notice any video tearing or audio crackling whatsoever.

Lastly, I removed the "Show Statusbar" option from the misc menu, per FitzRoy.

Oh, also note that with Linux (both for OpenGL and Xv) and Win/OpenGL, you have to toggle the vsync enable in your video driver's control panel. Pain in the ass, that. Linux/SDL and Win/GDI do not vsync. No, I'm not even going to bother trying to add that to them.

My settings:

Hardware:
nVidia 8800 GTS 320, Intel HDA audio, 24" LG @ 1920x1200x24bpp@60hz

Windows:
Direct3D, DirectSound, Latency = 120ms, PC freq = 48000hz, SNES freq = 32050hz; 4x scale always works, 5x scale misses vblank every few seconds

Linux:
OpenGL, ALSA, Latency = 60ms, PC freq = 48000hz, SNES freq = 32050hz; 4x and 5x scale always works

I'd be interested in hearing what works best for you guys. I'm especially interested in how PAL works on a monitor running at 50hz. I don't have any that can handle that resolution, nor 100hz. I don't expect scrolling to look great at 100/120hz, as I have no special handling for it.
Even if it is Windows-only, you may want to add the option of using a short sleep in the advanced options panel.
No, I really can't :P
I tried just to see what would happen, calling Sleep(1) a single time is enough to jump over the entire vblank period. In the worst case scenario, you get stuck in a loop, never hitting vblank, and the framerate drops to 1fps. Trust me, you don't want a sleep in there.

Now, I know you're thinking, "why not let the video card do the sync for you?" -- well, one, some drivers still eat up all the CPU time in their loops, and two, by polling the vblank status repeatedly, I actually get better results with 5x scale in D3D on my system. And I don't have to destroy the video device to toggle the video sync enable.
mozz
Hazed
Posts: 56
Joined: Mon Oct 10, 2005 3:12 pm
Location: Montreal, QC

Post by mozz »

Regarding windows sleeping for much more than 1 ms... You may want to comb through this thread for some ideas:
http://mollyrocket.com/forums/viewtopic ... 09818667c2

He was trying to divide CPU time between a main thread and a background thread, such that on a single-core machine, the main thread got as much CPU time as it wanted, and the background thread got whatever was left over. (Sounds easy, right?) Well, windows likes to muck around with your thread priorities to try and prevent starvation, so getting two threads to obey these simple-sounding rules turned out to be nontrivial.
King Of Chaos
Trooper
Posts: 394
Joined: Mon Feb 20, 2006 3:11 am
Location: Space

Post by King Of Chaos »

Right now I'm doing pretty good (Sync Audio only) using Direct3D, DirectSound, Latency = 90ms, PC freq = 48000hz, SNES freq = 32030hz; 2x scale.
[url=http://www.eidolons-inn.net/tiki-index.php?page=Kega]Kega Fusion Supporter[/url] | [url=http://byuu.cinnamonpirate.com/]bsnes Supporter[/url] | [url=http://aamirm.hacking-cult.org/]Regen Supporter[/url]
FitzRoy
Veteran
Posts: 861
Joined: Wed Aug 04, 2004 5:43 pm
Location: Sloop

Post by FitzRoy »

Man, this one is way better! Thanks so much for getting vsync worked out. Tear-free video is a real treat.

After doing some testing, I think it might be a good idea to simplify the new audio sliders a little with more manageable ranges and less scattered defaults. I don't understand the 100+ volume, the 10 cutoff, and allowing less than 32000 frequency for either the computer or SNES.

My suggestion:

0-100 volume (100 default)
25-150 latency (150 default)
44100,48000,96000 pc frequency (48000 default)
32000-32100 SNES frequency (32050 default)

That puts the top two slider ticks at the end and the last two in the middle, and I think it's just easier to understand and remember the defaults without a button. While your latency works at 120, I think 150 is a safer default and sufficient maximum. Allowing 0-10 volume is not a problem. If someone consciously sets their volume to 0, they're going to remember what they did, and even if they didn't they'd know to check this area. The reason I took 32000 and below out is because I could NOT get them to work crackle-free with any SNES frequency setting. Maybe it's just me, but it could be confusing to leave something in which breaks vsync regardless.

Lastly, I'd recommend renaming the menu entry to:

Emulation Sync >
Sync to Audio
Sync to Video
tetsuo55
Regular
Posts: 307
Joined: Sat Mar 04, 2006 3:17 pm

Post by tetsuo55 »

FitzRoy wrote: I don't understand the 100+ volume

My suggestion:

0-100 volume (100 default)
Byuu explained that values between 100 and 200 are sometimes needed in Linux environments.

Also it could be handy in some cases, like weak laptop speakers
FirebrandX
Trooper
Posts: 376
Joined: Tue Apr 19, 2005 11:08 pm
Location: DFW area, TX USA

Post by FirebrandX »

FitzRoy wrote:
My suggestion:

0-100 volume (100 default)
25-150 latency (150 default)
44100,48000,96000 pc frequency (48000 default)
32000-32100 SNES frequency (32050 default)
On the volume: I'd agree only if 100 means non-scaled max audio (meaning we're not going louder than normal output). I guess the way byuu has it now is best here.

On the latency: I think the 125 default is fine.

On the SNES frequency: Again byuu correctly has max go up to 32200. My value of 32130 works best for my system, which would be out of range on your slider suggestion.

To byuu, blargg, and others who worked on this: Thank you so much! You've made bsnes now THE choice emulator to use for SNES emulation in my opinion!
King Of Chaos
Trooper
Posts: 394
Joined: Mon Feb 20, 2006 3:11 am
Location: Space

Post by King Of Chaos »

Question, are you guys using audio sync only, or both audio and video sync?
FitzRoy
Veteran
Posts: 861
Joined: Wed Aug 04, 2004 5:43 pm
Location: Sloop

Post by FitzRoy »

King Of Chaos wrote:Question, are you guys using audio sync only, or both audio and video sync?
Both, you pretty much have to to eliminate both tearing and crackling.

Firebrand, I was editing my post to fix that typo when you responded, your worries about 44100 were thus edited out. And yeah, if you can't get crackling below 32100, then my suggestion must change to 32000-32200, with 32100 as default. But that's confusing, are you saying this new build didn't benefit you at all? Because 32130 was what you were having to use the last time.

Maybe one day, byuu and co will figure out a way to get vsync working at a flat setting close to 32040 and this extra setting can be wiped, it's kind of confusing.
King Of Chaos
Trooper
Posts: 394
Joined: Mon Feb 20, 2006 3:11 am
Location: Space

Post by King Of Chaos »

Yeah, in my case I can only use audio sync, using both (or video alone) causes the FPS to drop down somewhat... then again, I'm stupid enough to have the Scale2x plugin on still... :roll:

EDIT: Yep, that did it. Turned off Scale2x, turned on both sync settings, and all seems well now.
FitzRoy
Veteran
Posts: 861
Joined: Wed Aug 04, 2004 5:43 pm
Location: Sloop

Post by FitzRoy »

tetsuo55 wrote:Byuu explained that values between 100 and 200 are sometimes needed in Linux environments.
Yeah, I didn't read his post very well. He posted the WIP in another thread and I didn't read his comments here. I still think it's pointless. Amplifying past max output to compensate for shitty speakers is a dangerous allowance. And it incurs quality hits anyhow.

He says that purists want 32000, so I guess 32000,44100,48000 would also work, as 96000 is kind of pointless. Just trying to get the tickers lined up is all. Defaults are ridiculously easy to remember when they're on the end or middle of a slider, and it looks better that way, too.