bsnes v0.034 released

Archived bsnes development news, feature requests and bug reports. Forum is now located at http://board.byuu.org/
FitzRoy
Veteran
Posts: 861
Joined: Wed Aug 04, 2004 5:43 pm
Location: Sloop

Post by FitzRoy »

I'd love to see a larger sample size of people's lowest working latencies. So far there are 75 (byuu), 75 (me), and 80 (firebrand). If we don't get anything above 125, I think the range needs to be changed to 25-125, a standard 100 interval. That puts 25 at the low end, 75 in the middle and 125 at the high end, all multiples of 25. That's more sensible than 20 at the low end, 110 in the middle, and 200 at the high end, with a 180 interval.

Diminish, your issues are not to do with the latency since you get crackling regardless. Perhaps the new cpu priority changes are conflicting with your laptop's power savings features.
Verdauga Greeneyes
Regular
Posts: 347
Joined: Tue Mar 07, 2006 10:32 am
Location: The Netherlands

Post by Verdauga Greeneyes »

I'll see if I can get some testing in later; can you recommend any games for me to try?
FitzRoy
Veteran
Posts: 861
Joined: Wed Aug 04, 2004 5:43 pm
Location: Sloop

Post by FitzRoy »

I always just play two levels worth of Super Mario World when I test.
diminish

Post by diminish »

FitzRoy wrote:Diminish, your issues are not to do with the latency since you get crackling regardless. Perhaps the new cpu priority changes are conflicting with your laptop's power savings features.
I get more crackling when latency is lower. About power saving features, I have PowerMizer set properly and I verified with CPU-Z that the CPU is running at full speed with bsnes (2000-2200 MHz; there is some 'temporal o/c' function on mobile C2D that gives my model an additional 11x multiplier, standard is ~2000 MHz). The different C2D sleep states that are configurable in RMClock shouldn't be responsible, because they really only affect idle. I am not aware of any other power-saving features on WinXP. My hypothesis is that Realtek simply sucks :P. On the Realtek AC'97 of my desktop it's waaay worse in Super Metroid on SNESGT (Venice 3000+).
Last edited by diminish on Tue Aug 19, 2008 6:29 pm, edited 1 time in total.
byuu

Post by byuu »

Neat, not sure how fixing the inverse math caused the minimum latencies to drop, but I'll take it.

FitzRoy, what SNES input adjust did you need in the new WIP? I recall you needed a really different number than the rest of us.
FirebrandX
Trooper
Posts: 376
Joined: Tue Apr 19, 2005 11:08 pm
Location: DFW area, TX USA

Post by FirebrandX »

The best way to test is to find a game with a smooth-scrolling horizontal screen. A saved spot in Super Metroid works great, and I often also use the first stage in Super Turrican. I basically kill off any enemies in my way and then run back and forth across the level to watch for jumps.

For me, latency controls "crackling" while input frequency controls "jumps".

I wouldn't recommend testing with games like Mario Kart, because it's harder to tell if there was a jump. Using a side-scroller reveals jumps more noticeably.

Also byuu, my latency (and my input frequency) values are identical from the previous wip, so I did not gain any latency advantage from this version.
King Of Chaos
Trooper
Posts: 394
Joined: Mon Feb 20, 2006 3:11 am
Location: Space

Post by King Of Chaos »

I can go 100ms without any issues at all.
[url=http://www.eidolons-inn.net/tiki-index.php?page=Kega]Kega Fusion Supporter[/url] | [url=http://byuu.cinnamonpirate.com/]bsnes Supporter[/url] | [url=http://aamirm.hacking-cult.org/]Regen Supporter[/url]
Rhapsody
Rookie
Posts: 23
Joined: Wed Jul 02, 2008 3:35 pm

Post by Rhapsody »

I was maddened by the latency on 0.034 WIP 5 (I jumped straight from 0.032) until I began messing with the latency. I've currently got it set at the minimum possible (20 ms) and I still seem to have no crackling. I can up that a fair bit with no appreciable latency, so I'm pretty happy with this.
FitzRoy
Veteran
Posts: 861
Joined: Wed Aug 04, 2004 5:43 pm
Location: Sloop

Post by FitzRoy »

byuu wrote:FitzRoy, what SNES input adjust did you need in the new WIP? I recall you needed a really different number than the rest of us.
I can do -70 in the new terms. Not much different from before, I'm not sure what you're thinking of. I think it was the debate of choosing the max for default so that people wouldn't (a) get crackling and (b) then be faced with two directions and no idea which would improve their crackling. Whereas you were contending that the added dupe frames, which I could barely see, would hurt impressions more.

I will just update this post with test results as they come:

me: 75ms, -70 frequency
byuu: 70ms, -50 frequency
FirebrandX: 80ms, -130 frequency
Fes: 75ms, -170 frequency

Currently thinking that 125 is too conservative of a max/default. Changing my recommendation to 25-100ms range with 100ms as default.
Last edited by FitzRoy on Wed Aug 20, 2008 5:45 am, edited 4 times in total.
Fes
Rookie
Posts: 11
Joined: Thu Apr 10, 2008 12:19 am

Post by Fes »

Been away for a few days and I come back to find this awesome breakthrough. :)

Here are my settings using WIP 05 with Super Mario All-Stars + World:

Output freq: 48000
Input skew: -168
Latency: 75

Works fantastically now, great job!

A quick hunch about lowering latency... This build still uses maximum drift, right? If so, I wonder if tightening that up a little might have an observable impact on minimum latency. Not necessarily cycle-for-cycle sync, but maybe having the sound core catch up at the end of every frame or so. As it stands now though, latency of less than a tenth of a second is already pretty good and barely noticeable, but if there's an easy change that can push it down slightly, it might help, especially for those who can't quite crack the 100ms barrier at present.

EDIT: Another idea to improve sync. It just struck me that now that there's an easily configurable input rate, and the ability to detect both underflows and duped frames, I think the makings for "perfect" sync are already in place. You could have an "adaptive sync" toggle that would wait for duped frames or underflows. Once either happens, the skew is nudged either left or right as appropriate.

The nudging could be fine tuned to be proportional to the amount of time since the last "incident" so that as it converges to an ideal value, the changes are less pronounced. The resulting input skew should then be saved as the user's current setting, so when they start again they retain the benefits of previous runs, but will still dynamically adapt if they got a new monitor or something.
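
A rough sketch of the adaptive idea above, in Python purely for illustration (it is not bsnes code). Every name here (`AdaptiveSkew`, `on_underflow`, `on_duped_frame`, the step sizes and the -200..200 limits) is hypothetical. Two assumptions are baked in: the nudge direction follows the rule of thumb reported in this thread, that a lower input frequency means less crackle but more duplicate frames, and "proportional" is read as "inversely proportional", since the stated goal is smaller changes once incidents become rare.

```python
from dataclasses import dataclass


@dataclass
class AdaptiveSkew:
    """Hypothetical adaptive input-skew controller (not bsnes code)."""
    skew: float = 0.0            # current input frequency adjustment
    max_step: float = 16.0       # largest nudge, used right after an incident
    min_step: float = 0.25       # smallest nudge, so it can always still adapt
    lo: float = -200.0           # assumed slider limits
    hi: float = 200.0
    last_incident: float = 0.0   # time of the previous incident, in seconds

    def _step(self, now: float) -> float:
        # Inversely proportional to the quiet time since the last incident:
        # frequent incidents give big steps, rare incidents give small ones.
        quiet = max(now - self.last_incident, 1.0)
        self.last_incident = now
        return max(self.min_step, self.max_step / quiet)

    def _clamp(self) -> None:
        self.skew = min(self.hi, max(self.lo, self.skew))

    def on_underflow(self, now: float) -> float:
        """Audio buffer ran dry (crackle): nudge the skew down."""
        self.skew -= self._step(now)
        self._clamp()
        return self.skew

    def on_duped_frame(self, now: float) -> float:
        """A video frame was shown twice: nudge the skew up."""
        self.skew += self._step(now)
        self._clamp()
        return self.skew
```

The value left in `skew` at exit is what would be written back to the config, so the next run starts from the converged setting.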
Rhapsody
Rookie
Posts: 23
Joined: Wed Jul 02, 2008 3:35 pm

Post by Rhapsody »

FitzRoy wrote:rhapsody: 20ms, ?
I actually still don't know what the input frequency setting does. It defaulted to +175, I set it to 0, spent 20 minutes going sideways as the Princess in SMK, and have no sound problems to report. What am I looking for?
Fes
Rookie
Posts: 11
Joined: Thu Apr 10, 2008 12:19 am

Post by Fes »

Speaking of Mario Kart, it appears as if the horizon on the lower screen is off by a few scanlines now. I checked with earlier versions, and the change seems to have been between 033 and 034. Has anyone else noticed this, or can anyone confirm?
FitzRoy
Veteran
Posts: 861
Joined: Wed Aug 04, 2004 5:43 pm
Location: Sloop

Post by FitzRoy »

Yeah, it's from the timing changes in base emulation. There was a flickering line before, now it's mutated into something solid. Read the FAQ concerning bugs in special chip games.

On a side note, when I went back to use v034, vsync was enabled since it shares the cfg, and of course it doesn't work in that version. Kind of annoying for backtesters. Damn cfg separation, I wanna kill it.
Rhapsody wrote:I actually still don't know what the input frequency setting does. It defaulted to +175, I set it to 0, spent 20 minutes going sideways as the Princess in SMK, and have no sound problems to report. What am I looking for?
Be sure to enable both syncs in Emulation Speed in the menu. After that, please give updated results.
byuu

Post by byuu »

FitzRoy wrote:On a side note, when I went back to use v034, vsync was enabled since it shares the cfg, and of course it doesn't work in that version. Kind of annoying for backtesters. Damn cfg separation, I wanna kill it.
Keep your retro versions of bsnes in separate folders. Now run one, then copy docs&settings\yourname\appdata\.bsnes\bsnes.cfg to your separated bsnes folder. The emulator will now use that config file instead. Do it for both, and no issues with different versions screwing with config file settings.
FirebrandX wrote:Also byuu, my latency (and my input frequency) values are identical from the previous wip, so I did not gain any latency advantage from this version.
Your skew was > 32040 in the last version, and my changes inverted the meaning of that value. I'm not sure how, because the old code was basically black magic; whereas I understand how the new values are derived quite clearly. But it shouldn't be possible to get perfection with the same exact settings :/
Fes wrote:It just struck me that now that there's an easily configurable input rate, and the ability to detect both underflows and duped frames, I think the makings for "perfect" sync are already in place.
That is correct, but there are a lot of tough issues with such a setup.

One, the CPU<>SMP drifting would have to go, at a 15% speed penalty. With it, the variances would keep making the skew adjust itself, thinking big things are happening. I'll try lowering it, btw, to see how low I can get it without taking more than a 1-2% speed loss.

Two, maintaining the value would be hard. What if a virus scanner / background task suddenly spikes the CPU? Vista on its own seems to incur random 1-5 second slowdowns every few minutes. It would throw off the averager big-time. And getting it perfect requires bounces in both directions for a really long time. They may not see perfection for several more seconds / minutes after a CPU spike. Averaging over more time could create similar problems, too.

Three, even if imperceptible, it really does cause pitch changes. The higher sampling rate at a fixed ratio means no signal is lost. If we start adapting it in real-time, the signal will change.

All that said, this is what I initially envisioned, so I will work on this anyway. Just not now, I'm really burned out on all this again. If sinamas and blargg found this to be difficult, I don't have much hope for me, when I could barely get standard sync working. I need to take a break, but I still have to get those mul / div logs first.
FirebrandX
Trooper
Posts: 376
Joined: Tue Apr 19, 2005 11:08 pm
Location: DFW area, TX USA

Post by FirebrandX »

byuu wrote:

Also byuu, my latency (and my input frequency) values are identical from the previous wip, so I did not gain any latency advantage from this version.
Your skew was > 32040 in the last version, and my changes inverted the meaning of that value. I'm not sure how, because the old code was basically black magic; whereas I understand how the new values are derived quite clearly. But it shouldn't be possible to get perfection with the same exact settings :/
I know my skew was >32040 on the previous version, I was talking about the value being the same amount since all that has changed is reversing the calculation.

At any rate, the same values work best on my system. That being 80ms latency and 130 skew (positive on previous version, negative on current version). I don't know why it stayed the same values, but I've confirmed they are the "sweet spot" for my system.
byuu

Post by byuu »

Here's before and after data for dropping the maximum skew.

Code:

119.5 to 118.5 / Zelda 3
 93.5 to  93.0 / Star Ocean

250k * 20m : 5000000000000 : 203196 : 264
 20k * 24m :  480000000000 :  19506 :  25
Or in English, the skew allowed the sample count to be off by up to 264 samples in either direction at any point in time. The worst case scenario would be the CPU running the full gamut ahead, and then the SMP doing the same in return as it gains control, forcing the DSP to dump 264*2 samples out immediately during the same video frame (there are roughly 533 samples in an emulated frame). Now it can only vary by 25/second in either direction, with a worst case of 50 samples in one frame. The precision increased by over an order of magnitude for a speed penalty of roughly one half of one percent. I suppose that's a decent enough trade-off. The speed hit for lowering the skew is exponential, so we shouldn't push it further.
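
The figures in the code box can be reproduced with a little integer arithmetic. The sketch below is a reconstruction, not something stated in the post: it assumes the third column is the product divided by an SMP clock of 24,606,720 Hz, and the fourth column is that result divided by 768 SMP clocks per output sample (24,606,720 / 768 = 32,040 samples per second, in line with the 32040 figure mentioned elsewhere in this thread). Under those assumptions both rows come out as posted. Python is used for illustration only.

```python
# Assumed constants (inferred from the posted numbers, not given in the post).
SMP_CLOCK_HZ = 24_606_720      # assumed SMP clock rate
CLOCKS_PER_SAMPLE = 768        # assumed SMP clocks per DSP output sample


def skew_in_samples(threshold: int, scale: int) -> tuple:
    """Return (product, SMP clocks, samples) for one row of the table."""
    product = threshold * scale
    smp_clocks = product // SMP_CLOCK_HZ
    samples = smp_clocks // CLOCKS_PER_SAMPLE
    return product, smp_clocks, samples


old = skew_in_samples(250_000, 20_000_000)   # the "250k * 20m" row
new = skew_in_samples(20_000, 24_000_000)    # the "20k * 24m" row

print(old)  # (5000000000000, 203196, 264)
print(new)  # (480000000000, 19506, 25)

# Worst case within one frame is twice the one-directional drift.
print(old[2] * 2, new[2] * 2)  # 528 50
```

So the old worst case of 528 samples was nearly a whole frame's worth (roughly 533), while the new one is 50.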

For those thinking I should just sync the CPU and SMP upon the edge of a video frame, it's not that easy. They run until one is ahead of the other, and only the CPU's enslaved PPU can signal a frame generation. I would have to add a new flag to the SMP to allow it to run until the skew was equal to support this. And trust me, the fewer checks inside the raw timing control mechanisms, the better. You incur massive speed penalties even from simple boolean variables when it's code that's hit ~10-20 million times a second.

Again, note that this has absolutely nothing to do with accuracy. They sync up whenever they try and talk to each other. This is only when they're off doing their own thing. If they don't look at the other chip, they can't possibly know they aren't running at the same point in time.
byuu

Post by byuu »

This will probably be the last public WIP, so get it now if you want it.

Code:

http://byuu.cinnamonpirate.com/temp/bsnes_v034_wip06.zip
I used the same "create a child window inside the output window" trick for Xv that I used for OpenGL, so Xv will now work even with a compositor enabled.

I also added Video::Synchronize support to OpenGL for Windows. My card seems to force it on regardless of my driver settings, but maybe you'll have better luck. That driver had the same issue with allocating 16MB of memory instead of 4MB (that was due to copy and pasting of code), so that's fixed too.

This version lowers the CPU<>SMP drifting by an order of magnitude. You shouldn't notice the speed hit. I can't really get any lower latency with that, though.

I also restricted the latency range to 25 - 175, with the default being in the center, 100ms. Quite conservative, given the average we see is 70-80ms. But you won't notice the difference, and this way we ensure no popping even in exceptional circumstances by default. 25ms is doable without video sync and with OSS4+cooked mode, but I seriously doubt any Windows user will get lower without something crazy going on with the sound card drivers.

Lastly, I've replaced the 2-tap linear resampler with a 4-tap hermite resampler. You won't be able to tell the difference, but it's quite pronounced if you use a waveform analyzer on much higher output frequencies:

Linear:
Image

Hermite:
Image

Hermite is essentially better than cubic (of which cubic spline is an optimized version), as it is better at not going too far away from the points, so you get a bit less clamping in the extreme cases. But the difference isn't audible to humans anyway. It's still clearly inferior to band-limited interpolation, as it will still have noticeable aliasing of things like square waves and such, but it's orders of magnitude less complex to implement.

Keep in mind that nobody could tell the difference even with linear interpolation from the last few WIPs.
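
For anyone curious what the two resamplers compute, here is a minimal sketch in Python (illustration only; it is not the bsnes source). `linear` is the 2-tap version and `hermite` is a common 4-tap cubic Hermite in its Catmull-Rom form; whether bsnes uses exactly these coefficients is an assumption. `mu` is the fractional position between the two middle samples, and `resample` is a hypothetical helper that repeats the edge samples as padding.

```python
def linear(y1: float, y2: float, mu: float) -> float:
    """2-tap linear interpolation between y1 and y2, 0 <= mu <= 1."""
    return y1 + (y2 - y1) * mu


def hermite(y0: float, y1: float, y2: float, y3: float, mu: float) -> float:
    """4-tap cubic Hermite (Catmull-Rom form) between y1 and y2."""
    c0 = y1
    c1 = 0.5 * (y2 - y0)
    c2 = y0 - 2.5 * y1 + 2.0 * y2 - 0.5 * y3
    c3 = 0.5 * (y3 - y0) + 1.5 * (y1 - y2)
    return ((c3 * mu + c2) * mu + c1) * mu + c0


def resample(samples: list, in_rate: float, out_rate: float) -> list:
    """Resample with the Hermite kernel; edge samples are repeated as padding."""
    if not samples:
        return []
    padded = [samples[0]] + list(samples) + [samples[-1], samples[-1]]
    step = in_rate / out_rate          # input samples advanced per output sample
    out = []
    pos = 0.0
    while pos <= len(samples) - 1:
        i = int(pos)                   # index of y1 in the original list
        mu = pos - i
        y0, y1, y2, y3 = padded[i:i + 4]
        out.append(hermite(y0, y1, y2, y3, mu))
        pos += step
    return out
```

Both kernels pass exactly through the input samples (at `mu` of 0 or 1); the difference is only in the curve drawn between them, which is why it shows up on an analyzer rather than by ear.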

----------

Aside from that, I'm pretty much ready to release a new version. If anyone has any show stoppers, now is the time to say something. Otherwise I'll probably post something tomorrow or Friday.
creaothceann
Seen it all
Posts: 2302
Joined: Mon Jan 03, 2005 5:04 pm
Location: Germany

Post by creaothceann »

Not related to the next release or the near future, but how about these features...

1. AVI support? SNES9x' code seems to be ok, and it'd help with bug reports.

2. A filter that takes the current frame, combines it with the last frame (with adjustable ratio) and displays the result? Would help with flickering effects (player gets hit etc.)

EDIT:

3. Hiding the mouse cursor when it's in the window / the emulation area?
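
Item 2 boils down to a weighted average of two frames. A minimal sketch in Python, for illustration only: frames are assumed to be flat lists of 8-bit channel values, and `blend_frames` and `ratio` are hypothetical names, not anything that exists in bsnes.

```python
def blend_frames(current: list, previous: list, ratio: float = 0.5) -> list:
    """Mix two equally sized frames of 8-bit values.

    ratio is the weight of the current frame: 1.0 shows only the current
    frame, 0.0 only the previous one.
    """
    if len(current) != len(previous):
        raise ValueError("frames must have the same size")
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    return [
        min(255, max(0, round(c * ratio + p * (1.0 - ratio))))
        for c, p in zip(current, previous)
    ]
```

A sprite that is drawn only on every other frame would then show at partial brightness on every frame instead of blinking, with `ratio` controlling how strongly the older frame lingers.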
vSNES | Delphi 10 BPLs
bsnes launcher with recent files list
byuu

Post by byuu »

creaothceann wrote:1. AVI support? SNES9x' code seems to be ok, and it'd help with bug reports.
Something akin to ZSNES and mencoder hooking would be doable. I wouldn't implement anything that is platform-specific, eg DirectShow or somesuch, to record AVIs. And writing my own AVI compressor, yeah, not happening.
creaothceann wrote:2. A filter that takes the current frame, combines it with the last frame (with adjustable ratio) and displays the result? Would help with flickering effects (player gets hit etc.)
Dynamic frame selection added to the new version takes care of flickering already. Funny story though, while adding vsync to D3D, I tried to make it perform the Begin+EndScene calls at refresh time, and only call Present when drawing. This created a really neat effect where it'd show one frame, then a frame two before it, then two ahead, then back to normal.

Too much hassle to make it an emulator option, but if I did, I'd call it the Michael J Fox filter.
creaothceann wrote:3. Hiding the mouse cursor when it's in the window / the emulation area?
In the future, sure.
I.S.T.
Zealot
Posts: 1325
Joined: Tue Nov 27, 2007 7:03 am

Post by I.S.T. »

byuu wrote:Too much hassle to make it an emulator option, but if I did, I'd call it the Michael J Fox filter.
ROFL
Verdauga Greeneyes
Regular
Posts: 347
Joined: Tue Mar 07, 2006 10:32 am
Location: The Netherlands

Post by Verdauga Greeneyes »

creaothceann wrote:2. A filter that takes the current frame, combines it with the last frame (with adjustable ratio) and displays the result? Would help with flickering effects (player gets hit etc.)
This can be done using shaders. If I get the chance I'll see if I can whip up an example of this (if I remember how it works).. but damn, I have a lot of stuff piling up (holidays make me lazy as crap).
creaothceann
Seen it all
Posts: 2302
Joined: Mon Jan 03, 2005 5:04 pm
Location: Germany

Post by creaothceann »

byuu wrote:Dynamic frame selection added to the new version takes care of flickering already.
I had something in mind that emulates the afterglow (w?) of the TV's phosphor layer.
tetsuo55
Regular
Posts: 307
Joined: Sat Mar 04, 2006 3:17 pm

Post by tetsuo55 »

creaothceann wrote:
byuu wrote:Dynamic frame selection added to the new version takes care of flickering already.
I had something in mind that emulates the afterglow (w?) of the TV's phosphor layer.
On that note, would it not be easy to implement my TV-Simulation theory through shaders?
Dullaron
Lurker
Posts: 199
Joined: Mon Mar 10, 2008 11:36 pm

Post by Dullaron »

byuu wrote:Something akin to ZSNES and mencoder hooking would be doable. I wouldn't implement anything that is platform-specific, eg DirectShow or somesuch, to record AVIs. And writing my own AVI compressor, yeah, not happening.
Why not let the ZSNES team do it for you? They did a good job putting one in ZSNES.
Window Vista Home Premium 32-bit / Intel Core 2 Quad Q6600 2.40Ghz / 3.00 GB RAM / Nvidia GeForce 8500 GT
FitzRoy
Veteran
Posts: 861
Joined: Wed Aug 04, 2004 5:43 pm
Location: Sloop

Post by FitzRoy »

Inconsistency:
-Video Driver, Audio Driver, Input Driver = Video driver, Audio driver, Input driver

Audio section:

-add 192000. Might as well if 96000 gets in there, and you will never get any frequency requests again

-"PC" and "SNES" are not necessary additions, they're both technically happening on the PC.

-results show that nobody has come close to needing a positive Input frequency adjust. Why not make the range -200 to 0?

-put a note below the last slider that says this:

Code:

Note: When emulation speed is synced to both audio and video, a lower input frequency decreases the chance of audio crackle, but increases the chance of duplicate frames.
Lastly, let's create a hypothetical scenario using current defaults. A user turns on sync to video and gets audio crackle. We know 100ms for latency won't be the cause of that, it will likely be the -50 input setting. He doesn't know that. Since latency is capable of being set higher and is a more familiar term, any rational person, even myself, would first assume that perhaps latency at 100ms is too aggressive and test it higher. And higher. And higher. This is all a waste of time, and not something likely to get reported even though we know it's probably going to happen. Meanwhile, present indications are that allowing 100-175ms helps perhaps no one. Of the costs and benefits incurred, the costs win out. The note helps it, but it doesn't eliminate it. And no, this doesn't mean bringing latency into the note is a better solution. This one unfamiliar variable responsible for all default crackling should be the only thing they're overtly directed towards.

This will be my last explanation of the matter.

EDIT: clarified note