bsnes v0.037a released


Post by byuu »

grinvader wrote:What do you intend to do about the 2 ppus ?
A very good question. I'm not 100% sure just yet.

All of our current scanline renderers have no problems interleaving their function, so I'd like to keep both of them inside the same class. They may be two chips, but unlike the CPU and SMP/DSP ... they really cannot do anything useful alone.

I will separate all of their variables into separate blocks, and try and comment on what chip is doing what to make their separation clear.

The biggest problem I foresee is if there are hardware communication delays between the two chips. Eg let's say a write is recognized by PPU1 at time=n, and by PPU2 at time=n+4. I'd have to keep a 4-cycle history for that register.
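To make the "4-cycle history" idea concrete, here is a minimal sketch of a register that keeps its last few values in a small ring. Everything in it (`DelayedRegister`, `read_ppu1`, `read_ppu2`, the one-tick-per-cycle model) is a hypothetical illustration, not bsnes code, and the 4-cycle delay is just the made-up figure from the example above.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper, not bsnes code: a register whose writes are seen by
// one chip immediately and by a second chip a fixed number of cycles later.
template<unsigned Delay>
struct DelayedRegister {
  static constexpr unsigned Size = Delay + 1;  // current value + Delay old ones

  std::array<std::uint8_t, Size> history{};  // ring of recent values
  unsigned head = 0;                         // slot holding the current cycle

  // Advance one cycle; the current value carries forward until overwritten.
  void tick() {
    const unsigned next = (head + 1) % Size;
    history[next] = history[head];
    head = next;
  }

  // A write lands in the current cycle's slot.
  void write(std::uint8_t value) { history[head] = value; }

  // Value the register held `cycles` ticks ago (0 = right now).
  std::uint8_t read_delayed(unsigned cycles) const {
    assert(cycles <= Delay);
    return history[(head + Size - cycles) % Size];
  }

  std::uint8_t read_ppu1() const { return read_delayed(0); }      // sees writes at time=n
  std::uint8_t read_ppu2() const { return read_delayed(Delay); }  // sees them at time=n+Delay
};
```

With `DelayedRegister<4>`, a value written now shows up through `read_ppu1()` right away and through `read_ppu2()` only after four calls to `tick()`. The cost is one extra copy per cycle per delayed register, which is why it only makes sense if few registers need it.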

But I think most of the communication will be internal and minimal, so it should be okay.

If it turns out I absolutely have to split them into two separate threads ... that should be doable with a similar model to this. I'd just need two ring buffers. The context switches for each PPU1/PPU2 tick would be brutal. I'd be absolutely forced to use state machines at that point.

One of my biggest fears at the moment is whether or not the BG mode register can be changed in the middle of a scanline. If not, it should be easy to streamline and cache a lot of these variables (eg have the MMIO register write functions update the temp variables inside the renderer as well.) But if so, we'll be doing a ridiculous number of register checks for every single pixel.
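If the mode does turn out to be fixed for a whole scanline, the caching idea could look roughly like the sketch below: the MMIO write handler refreshes derived values once, and the per-pixel code reads the cached copy instead of decoding the register. The struct and function names are hypothetical, not the bsnes interface. The bit-depth table reflects the standard SNES BG modes, with Mode 7's EXTBG layer left out for brevity.

```cpp
#include <cassert>
#include <cstdint>

// Bits per pixel for BG1-BG4 in each of the eight BG modes; 0 = layer unused.
constexpr unsigned BgDepth[8][4] = {
  {2, 2, 2, 2},  // mode 0
  {4, 4, 2, 0},  // mode 1
  {4, 4, 0, 0},  // mode 2
  {8, 4, 0, 0},  // mode 3
  {8, 2, 0, 0},  // mode 4
  {4, 2, 0, 0},  // mode 5
  {4, 0, 0, 0},  // mode 6
  {8, 0, 0, 0},  // mode 7 (EXTBG ignored in this sketch)
};

// Hypothetical PPU fragment, not bsnes code.
struct CachedPpu {
  std::uint8_t reg_bgmode = 0;       // raw $2105 value
  unsigned depth[4] = {2, 2, 2, 2};  // cached per-layer depth (mode 0 at reset)

  // Register write: update the raw value and the renderer's cached copy.
  void mmio_write_bgmode(std::uint8_t data) {
    reg_bgmode = data;
    const unsigned mode = data & 7;  // low three bits select the BG mode
    for(unsigned bg = 0; bg < 4; bg++) depth[bg] = BgDepth[mode][bg];
  }

  // What the inner pixel loop would read: no register decoding needed.
  unsigned layer_depth(unsigned bg) const {
    assert(bg < 4);
    return depth[bg];
  }
};
```

The trade-off is the one described above: this moves the work from every pixel to every register write, which only pays off if writes are rare compared to pixels.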

Geez ... can you even imagine having to switch from mode0 to mode7 right in the middle of a scanline ...?

----------

All in all, I don't expect to be able to get this cycle renderer anywhere near perfect. I'd just like the mid-scanline effects to at least be seen. It'd be very helpful for the hobbyist coders.

And if we're lucky, we'll figure out how to back-port fixes for games like Mega Lo Mania, Adv. of Dr. Franken and Uniracers to the scanline renderers, which are otherwise more than adequate.

Post by creaothceann »

byuu wrote:One of my biggest fears at the moment is whether or not the BG mode register can be changed in the middle of a scanline. If not, it should be easy to streamline and cache a lot of these variables (eg have the MMIO register write functions update the temp variables inside the renderer as well.) But if so, we'll be doing a ridiculous number of register checks for every single pixel.

Geez ... can you even imagine having to switch from mode0 to mode7 right in the middle of a scanline ...?
The Atari 2600 has no problems with that kind of programming. :)

Doesn't the SNES load all the 8x8 tiles of a line before the line is drawn? If that's the case it'd be foolish to allow changes mid-scanline, but you never know until you try...
Post by Gil_Hamilton »

creaothceann wrote:
byuu wrote:One of my biggest fears at the moment is whether or not the BG mode register can be changed in the middle of a scanline. If not, it should be easy to streamline and cache a lot of these variables (eg have the MMIO register write functions update the temp variables inside the renderer as well.) But if so, we'll be doing a ridiculous number of register checks for every single pixel.

Geez ... can you even imagine having to switch from mode0 to mode7 right in the middle of a scanline ...?
The Atari 2600 has no problems with that kind of programming. :)

Doesn't the SNES load all the 8x8 tiles of a line before the line is drawn? If that's the case it'd be foolish to allow changes mid-scanline, but you never know until you try...
And a programmer from the era described writing for the 2600 as throwing away every good programming practice you ever learned. :P

Post by byuu »

New WIP.

The biggest news is that I've implemented what I was discussing earlier, and it worked perfectly. The S-PPU enslavement to the S-CPU is no more.

As of this point, all four processor cores, and all three of their shared relationships, run completely independently of one another.

This required moving the inline timing code from the absolute most timing-sensitive section of the emulator, to an entirely new external class. It also required logging more state data, adding ~100k/second more context switches, etc. It was unavoidable that the new approach would be slower, but I was able to greatly mitigate the speed loss. Right now, it stands at a ~6-8% speed loss from the previous release.

But there is good news:
1) aside from SuperFX / SA-1 support, which will require additional processing inside the emulator core, no other changes should slow down the emulator again. It can only get faster from here. Most importantly, a range-based IRQ tester would offer a major speedup.
2) this approach will allow both a scanline-based and cycle-based S-PPU core to work with only one S-CPU core. No need to subclass and duplicate the timing code + scheduler as I was planning to before.
3) with this change, I was finally able to convert the scanline-based S-PPU renderer to a hybrid that I've talked about with FitzRoy in the past: this allowed me to finally cache OBSEL writes at (roughly) the appropriate position, while still rendering the screen at a different point. I render the screen at H=512, and cache OBSEL at H=1152. May not be hardware accurate, but it allows Adv. of Dr. Franken + Winter Olympics + Mega Lo Mania to all work as expected, all at the same time.
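As a rough illustration of the hybrid in point 3, the sketch below drives two per-line events from a horizontal counter: the line is rendered at H=512 and OBSEL is latched at H=1152, so a write between those points only affects the following line. The two positions come from the post. The struct, its member names, and the 1364-clock line length are assumptions made for the example, not the actual bsnes code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch of a scanline renderer with separate event positions.
struct HybridPpu {
  static constexpr unsigned RenderPosition = 512;       // from the post
  static constexpr unsigned ObselCachePosition = 1152;  // from the post
  static constexpr unsigned ClocksPerLine = 1364;       // assumed typical line length

  unsigned hcounter = 0;
  std::uint8_t reg_obsel = 0;     // live $2101 value, updated by MMIO writes
  std::uint8_t cached_obsel = 0;  // value the sprite logic actually uses

  unsigned lines_rendered = 0;
  std::uint8_t obsel_used_for_last_render = 0;  // stands in for real output

  void mmio_write_obsel(std::uint8_t data) { reg_obsel = data; }

  // Placeholder for the real scanline renderer.
  void render_line() {
    obsel_used_for_last_render = cached_obsel;
    lines_rendered++;
  }

  // Step the counter one clock at a time and fire events as they are crossed.
  void add_clocks(unsigned clocks) {
    while(clocks--) {
      if(++hcounter == ClocksPerLine) hcounter = 0;
      assert(hcounter < ClocksPerLine);
      if(hcounter == RenderPosition) render_line();
      if(hcounter == ObselCachePosition) cached_obsel = reg_obsel;
    }
  }
};
```

A write at H=600, for example, misses that line's render at 512, is latched at 1152, and is first used by the next line's render. Stepping one clock per loop iteration keeps the sketch short; a real core would more likely jump straight to the next event position.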

It wasn't 100% exactly how I wanted to do things ... but I'm really happy about this de-coupling. I've always been a purist when it comes to implementing processor cores independently of one another, and it's always bothered me greatly the way the CPU controlled the PPU and its counters.

With the above changes, I've eliminated the four ppu.hack config settings. I don't see much of a need for them.

I've also embedded the readme and license text files. FitzRoy, I haven't had a chance to revise the readme as you were suggesting yet. Not ignoring you there, it's just low on my priority list right now.

Lastly, I took FitzRoy's advice, and removed the WAV logger entirely. I'm also going to leave the screenshot capture out. At least for now ... the UI is starting to get a bit too bloated for my tastes.

This is also the first uploaded WIP with the new debugging key-bindings (tracing and memory export.) I don't expect anyone here to have much use for them.

Anyway, testing would be appreciated. It's very likely that the OBSEL cache position needs to be tweaked further. I recall LotR or something also had issues with caching in the past ... but I couldn't find the game at ::ahem:: the used game shop ... to test it.

I think there were other games that had different behavior based on the old obsel_cache setting, too. Would be good to make sure they all work as expected.

EDIT: Ah, "JRR Tolkien's LotR", bah. Yeah, no sprite flickering with the new WIP. Also, the speed hit only seems to affect Core 2s. No frame drop on my Athlon. Probably something to do with locality of reference or somesuch. Modern processors are too damned complicated :P

So then, assuming nobody spots any bugs ... how about a new release tomorrow?
Post by Dullaron »

Go ahead and release it if that's what you want. No need to ask, byuu.
Post by FitzRoy »

I can't remember the LoTR thing, but I do remember that Ninjawarriors had all kinds of sprite flickering issues when line caching was enabled in old versions. I've checked the new version and it looks fine.

Also, this seems to fix the scrambled horizontal lines that appeared in the Jurassic Park intro and the Firepower 2000 title screen. These two were scanline position sensitive and casualties of the default value. You could "fix" them with other values, but it would cause different games to break.

The games unaffected by scanline position were Mega lo Mania, Adventures of Dr Franken, and Winter Olympics. Line caching fixed their line issues at the expense of causing sprite flickering in Ninjawarriors and untold others (we were well into testing when the option was introduced). I'm just explaining the history for others, you know all this.

Anyway, I'm glad if this has finally fixed all that in the scanline renderer. Well worth the small speed hit.

Post by byuu »

FitzRoy wrote:I can't remember the LoTR thing
One of the bottom right sprites flickered quite a bit with it on.

FitzRoy wrote:Also, this seems to fix the scrambled horizontal lines that appeared in the Jurassic Park intro and the Firepower 2000 title screen.
Wow ... that's weird (and cool) with JP. That's a mode7 screen, so I expected it to be a scanline positioning problem. It's probably more likely that the render_position = 512 either got more or less precise with the change ... so some other stuff will probably be off a bit now.

FitzRoy wrote:Anyway, I'm glad if this has finally fixed all that in the scanline renderer. Well worth the small speed hit.
I'm crossing my fingers, but not very hopeful. If we have at least three games fixed just by partially correcting $2101 OBSEL timing ... then there's bound to be a ton of other titles that need other registers properly timed. There are 63 other registers that are currently not timed at all. Not to mention, OAM data is still technically pulled a full scanline too late. We're just lucky that writes to it during active display are so unreliable (with only Uniracers crazy enough to try it.)

Anyway, I'm still not going for perfection with this renderer: I don't think it's achievable. This is more like playing whack-a-mole ... the same thing we were doing with opcode-based CPU timing adjustments before.

It's good justification that we really do need a cycle-based PPU renderer for the SNES, at least. And I at least finally have the scheduling system necessary to do that. The main hurdle now will be properly assessing just how crazy that stuff gets. Most importantly, I need to verify if the BGMODE register ($2105) can be changed mid-scanline. That will drastically change how the new PPU core will need to be designed.

Many thanks for testing Ninjawarriors et al.
Post by henke37 »

It's fun to play whack-a-bug, you get lots of nice test cases for the rewrite.

Post by byuu »

Okay, so ... polishing stuff left to do:

- sCPU::add_clocks(n) -- which should come first? scheduler.addclocks_cpu(n) or tick(n)? Probably the latter ...
- either implement PPUcounter::tick(), or explain the workaround to it not being there in use by the current renderer
- add default descriptions to path window (<default directory> + <same as game ROM path>)
- add bsnes icon to readme/license window
- tweak readme text
- add hide rules for input.debugger options in the advanced panel

I know there's lots of other minor things ... anything major that would be upsetting to not have in the next release?

henke37 wrote:It's fun to play whack-a-bug, you get lots of nice test cases for the rewrite.
Meh, I don't believe the time expenditure will be very useful.