bsnes v0.038 released

Archived bsnes development news, feature requests and bug reports. Forum is now located at http://board.byuu.org/
byuu

bsnes v0.038 released

Post by byuu »

byuu.org wrote:2008-12-15 - bsnes v0.038 released

The main change for this release is what I talked about in my last post. Because of this, I can finally time exact cycle positions for writes to take effect within the S-PPU core. So far, I've only added OAM reset at Y=240,H=10 and OBSEL fetching at H=1152. The latter hasn't been verified on hardware, but it does fix the single black scanlines evident in the intros to Mega Lo Mania and Winter Olympics. Previously, I had a setting named ppu.hack.obj_cache ... this essentially cached OBSEL at H=512, and without it, at H=512+1364. The setting was needed to fix these games, but would then cause sprite flickering in other games, such as Ninja Warriors and Lord of the Rings. With the new timing, all of these games work correctly with the same timing.

I should note, this is still a scanline-based renderer at its core. I'm not aiming for, nor do I believe it is possible to, obtain 100% perfect rendering compatibility with this approach. But I will continue to expand its cycle-level capabilities, as it will no doubt be much faster than a true, fully cycle-based S-PPU renderer.

All of that said, the extra state logging, decoupling of timing code in the most critical section of the emulator, etc means that a small speed hit was inevitable. I mitigated it as much as I could, but it appears that Core processors suffer a ~6% speed hit from the previous version. Oddly enough, AMD processors seem to be largely unaffected by the change.

I know these speed hits continue to stack, but that's the nature of the beast. I've added a link to tukuyomi's SNES emulator archive. If you're not able to get full speed, I'd strongly recommend using an older version of the emulator. v017 in particular is nearly twice as fast as the current version, while still being very close to bug-free.

On the bright side, the new synchronization model is 100% compatible between both the scanline renderer and a future cycle renderer. That will allow me to avoid a lot of timing code duplication, and it will also allow me to continue to offer a scanline renderer in future builds. And once I get the cycle renderer perfected, I would like to team up with some other people and work on a fast and accurate emulator.

Lastly, I'm changing up how the emulator is distributed. The readme and license files are now embedded inside the executable, accessible from the help menu. As it is no longer required to include these text files, I can distribute the executable itself directly, à la uTorrent. For sites that mirror bsnes, but do not want to host it as a direct EXE file, feel free to put it inside a ZIP archive (along with a language locale file, if you wish.)

</wall of text>

Changelog:
* eliminated S-DD1 DMA enslavement to the S-CPU; this allows the S-DD1 to behave more like the real chip, and it also simplifies the S-CPU DMA module
* eliminated S-PPU enslavement to the S-CPU; all processor cores now run independently of each other
* added cycle-level S-PPU timing for OAM address reset and OBSEL; fixes scanline glitches in Mega Lo Mania and Winter Olympics
* removed ppu.hack.* settings, as they are no longer needed due to the above changes
* corrected VRAM tiledata cache bug; fixes Super Buster Bros v1.0 reset glitch
* added memory export and trace logging key bindings to user interface
* removed WAV logging (to trim the emulation core)
* embedded readme and license texts inside executable
* simplified S-CPU, S-SMP flag register handling
* source code cleanup for S-CPU timing module
* GUI-Linux: added style improvements to the listbox and combo box controls
* GUI-Linux: finally added filetype filter support to the file open dialog
* GUI-all: shrunk configuration panel [FitzRoy]
* GUI-all: modified paths panel descriptions for clarity [FitzRoy]
Regressions, bug reports, opinions on the new direct-executable distribution method, etc welcome.

Also, please be mindful of the rule[s] before commenting.
Snark
Trooper
Posts: 376
Joined: Tue Oct 31, 2006 7:17 pm

Re: bsnes v0.038 released

Post by Snark »

Looking mighty promising for the future renderer
I want to fry~~ Sky Hiiiiiiiiigh~
Let's go-o-o-O~ togeda~
byuu

Post by byuu »

So, I noticed this comment from v037 shortly before release:

Code:

  //note: this should actually occur at V=225,HC=10.
  //this is a limitation of the scanline-based renderer.
  //... OAM reset stuff
Easy enough now. Got that in right before release. For those curious how the new PPU scheduling works:

Code:

void bPPU::enter() {
  loop:
  //H =    0 (initialize)
  scanline();
  if(ppucounter.ppuvcounter() == 0) frame();
  add_clocks(10);

  //H =   10 (OAM address reset)
  if(ppucounter.ppuvcounter() == (!overscan() ? 225 : 240)) {
    if(regs.display_disabled == false) {
      regs.oam_addr = regs.oam_baseaddr << 1;
      regs.oam_firstsprite = (regs.oam_priority == false) ? 0 : (regs.oam_addr >> 2) & 127;
    }
  }
  add_clocks(502);

  //H =  512 (render)
  render_scanline();
  add_clocks(640);

  //H = 1152 (cache OBSEL)
  cache.oam_basesize   = regs.oam_basesize;
  cache.oam_nameselect = regs.oam_nameselect;
  cache.oam_tdaddr     = regs.oam_tdaddr;
  add_clocks(ppucounter.ppulineclocks() - 1152);  //seek to start of next scanline

  goto loop;
}
Bla bla "goto is evil", whatever. Replace it with while(true) {} if it lets you sleep better.
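
For anyone who would rather see the schedule as data than as a goto loop, here is a minimal standalone sketch. It is not bsnes code: `Event`, `kEvents`, `clocks_after` and `run_scanline` are made-up names for illustration. All it shows is that the waits between the four H positions above (10, 502, 640, and the remainder) always add up to the length of the line.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; these names are not from bsnes.
struct Event { unsigned h; const char *name; };

static const Event kEvents[] = {
  {    0, "initialize"        },
  {   10, "OAM address reset" },
  {  512, "render"            },
  { 1152, "cache OBSEL"       },
};
static const unsigned kEventCount = sizeof(kEvents) / sizeof(kEvents[0]);

// Clocks to wait after event i before the next one is due. The last event
// waits until the end of the line, like add_clocks(lineclocks - 1152).
unsigned clocks_after(unsigned i, unsigned lineclocks) {
  unsigned next = (i + 1 < kEventCount) ? kEvents[i + 1].h : lineclocks;
  return next - kEvents[i].h;
}

// Walks one scanline and returns the total clocks consumed.
unsigned run_scanline(unsigned lineclocks) {
  unsigned elapsed = 0;
  for(unsigned i = 0; i < kEventCount; i++) {
    // a real core would do the work for kEvents[i] here
    elapsed += clocks_after(i, lineclocks);
  }
  return elapsed;
}
```

An outer `while(true)` that calls `run_scanline()` once per line would then play the role of `goto loop`.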

Also, here is the big timing change that causes the large speed hit:

Code:

  alwaysinline void tick() {
    history.ppudiff += 2;  //this is new
    status.hcounter += 2;

    if(status.hcounter >= 1360 && status.hcounter == lineclocks()) {
      //this part is hit one in 680 calls, so optimizing it won't help much
      status.hcounter = 0;
      status.vcounter++;
      if((region() == 0 && interlace() == false && status.vcounter == 262)
      || (region() == 0 && interlace() == true  && status.vcounter == 263)
      || (region() == 0 && interlace() == true  && status.vcounter == 262 && status.field == 1)
      || (region() == 1 && interlace() == false && status.vcounter == 312)
      || (region() == 1 && interlace() == true  && status.vcounter == 313)
      || (region() == 1 && interlace() == true  && status.vcounter == 312 && status.field == 1)
      ) {
        status.vcounter = 0;
        status.field = !status.field;
      }

      scanline();
    }

    history.index = (history.index + 1) & 2047;
    history.field   [history.index] = status.field;  //this is new
    history.vcounter[history.index] = status.vcounter;
    history.hcounter[history.index] = status.hcounter;
  }
Lines not marked "new" were in the old code.

There may be some minor optimizations possible ... but such simplistic code really shouldn't be affecting speed much at all. It's called roughly 10.7 million times a second (the ~21.48MHz master clock divided by the two clocks per tick), and those two lines eat up the 6-10% of time lost from the last release.
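
The history buffer at the end of tick() is a plain power-of-two ring buffer. A minimal sketch of how such a buffer can answer "what were the counters N ticks ago" follows. The struct layout and the `*_ago` helpers are assumptions made for illustration, not the actual bsnes implementation; only the 2048-entry size and the `& 2047` wrap come from the code above.

```cpp
#include <cassert>
#include <cstdint>

// Sketch only: field names mirror the post, but the layout and the lookup
// helpers are assumptions, not bsnes source.
struct CounterHistory {
  static const unsigned kSize = 2048;  // power of two, so "& (kSize - 1)" wraps
  uint16_t vcounter[kSize];
  uint16_t hcounter[kSize];
  unsigned index;

  CounterHistory() : vcounter(), hcounter(), index(0) {}

  // Called once per tick(), i.e. once per two master clocks.
  void record(uint16_t v, uint16_t h) {
    index = (index + 1) & (kSize - 1);
    vcounter[index] = v;
    hcounter[index] = h;
  }

  // Counter values as they were `ticks` calls ago (0 = most recent).
  // Unsigned subtraction wraps, and the mask folds it back into range.
  uint16_t vcounter_ago(unsigned ticks) const {
    return vcounter[(index - ticks) & (kSize - 1)];
  }
  uint16_t hcounter_ago(unsigned ticks) const {
    return hcounter[(index - ticks) & (kSize - 1)];
  }
};
```

Lookups further back than 2047 ticks would silently alias newer entries, so a caller has to stay within that window.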
Panzer88
Inmate
Posts: 1485
Joined: Thu Jan 11, 2007 4:28 am
Location: Salem, Oregon
Contact:

Post by Panzer88 »

It's great to see such progress towards the cycle-based renderer, and I didn't even notice the speed hit on my system. Congrats on another release, byuu. It seems like this is a breakthrough of sorts to allow some further tweaks in the weeks to follow. Great stuff.
byuu wrote:Seriously, what kind of asshole makes an old-school 2D emulator that requires a Core 2 to get full speed? >:(
creaothceann
Seen it all
Posts: 2302
Joined: Mon Jan 03, 2005 5:04 pm
Location: Germany
Contact:

Post by creaothceann »

Question: Did you ever find a game/module that uses the "FirstSprite+Y priority" feature?

Nice release btw. :P I'll post a localized locale.cfg later.
vSNES | Delphi 10 BPLs
bsnes launcher with recent files list
ShadowFX
Regular
Posts: 265
Joined: Thu Jul 29, 2004 8:55 am
Location: The Netherlands

Post by ShadowFX »

A bug in (probably) the GUI prevents me from using the diaeresis (e+¨).

"Export data:" = "Geëxporteerde gegevens:"

Just a heads up before I post the updated locale.cfg for Dutch.
"Change is inevitable; progress is optional"
creaothceann
Seen it all
Posts: 2302
Joined: Mon Jan 03, 2005 5:04 pm
Location: Germany
Contact:

Post by creaothceann »

http://rapidshare.com/files/173544573/b ... locale.rar

Some other things:
- Long key names are cut off by the joypad image (pic). How about doubling the space for the text, and centering it there vertically?
- The notes in the advanced section of the settings window are not translated. Add them to locale.cfg?
- Real boolean values ("bool") in the advanced section? This would allow the user to toggle them with a doubleclick onto the list item.
- Disable the "Set" button in the advanced section if the value in the edit control is equal to the currently set value? (This is only a cosmetic issue.)
Verdauga Greeneyes
Regular
Posts: 347
Joined: Tue Mar 07, 2006 10:32 am
Location: The Netherlands

Post by Verdauga Greeneyes »

By the way, now that bsnes includes the readme in the executable, is there a call for this to be translated as well?
FitzRoy
Veteran
Posts: 861
Joined: Wed Aug 04, 2004 5:43 pm
Location: Sloop

Post by FitzRoy »

Verdauga Greeneyes wrote:By the way, now that bsnes includes the readme in the executable, is there a call for this to be translated as well?
That might be asking too much of translators to do the readme and license and advanced text. But at least this text can be easily copied and pasted.
DataPath
Lurker
Posts: 128
Joined: Wed Jul 28, 2004 1:35 am
Contact:

Post by DataPath »

The readme ought to be completely reasonable to localize.

The license, for some very good reasons, probably should not be localized.
byuu

Post by byuu »

Panzer88 wrote:I didn't even notice the speed hit on my system
And this is why I hate modern processors so much :P
Pentium IV 1.7GHz takes a ~20% speed hit from the last version (51->41)
E8400 takes a 7% speed hit (152->142)
E4600 takes a 4% speed hit (118->113)
Athlon 3500+ takes no speed hit at all

That kind of difference for two new addition statements ... it's very annoying.
Panzer88 wrote:it seems like this is a breakthrough of sorts to allow some further tweaks in the weeks to follow
Easily the biggest since the S-DSP by blargg :D
Someone recently mentioned he couldn't understand bsnes' scheduling system ... it's getting hard even for me to follow. Really cool the way it all comes together and works as expected.
ShadowFX wrote:A bug in (probably) the GUI prevents me from using the diaeresis (e+¨).
I was able to get this to work. Make sure the file format is UTF-8, as diaeresis letters are > U+007F.
creaothceann wrote:Question: Did you ever find a game/module that uses the "FirstSprite+Y priority" feature?
Nope, I've never seen a game use it, and anomie's description was too vague. Hoping it will be supported transparently when re-writing the PPU.
creaothceann wrote:Some other things:
Thanks for the suggestions. I can add most of them.
I'd recommend shortening the joypad description for the time being, at least.
FitzRoy wrote:That might be asking too much of translators to do the readme and license and advanced text. But at least this text can be easily copied and pasted.
Yeah, I didn't want to bug the translators with giant walls of text. Really, nobody should even need to use the advanced panel, and the readme is too big (plus we're still planning to re-do it or whatever.)

I'm sure it's already annoying getting blank locale files for each new version. Hopefully people are mostly just adding the missing strings to v037a's locale, rather than starting over each time :/
ShadowFX
Regular
Posts: 265
Joined: Thu Jul 29, 2004 8:55 am
Location: The Netherlands

Post by ShadowFX »

byuu wrote:
ShadowFX wrote:A bug in (probably) the GUI prevents me from using the diaeresis (e+¨).
I was able to get this to work. Make sure the file format is UTF-8, as diaeresis letters are > U+007F.
Thank you, it is working properly now.
Verdauga Greeneyes
Regular
Posts: 347
Joined: Tue Mar 07, 2006 10:32 am
Location: The Netherlands

Post by Verdauga Greeneyes »

byuu wrote:Hopefully people are mostly just adding the missing strings to v037a's locale, rather than starting over each time :/
You could consider making a diff for the translators. But as long as new strings can be added to the end of the locale file it shouldn't matter much.
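
A diff of that kind does not need much. The sketch below lists the keys that appear in a new locale file but not in the previous one. It assumes every entry sits on a single line of the form "English text" = "translation", which matches the locale excerpts posted in this thread, and it does not handle escaped quotes. `locale_key` and `new_keys` are hypothetical helper names, not part of bsnes.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Extracts the key from a line of the form "key" = "value". Returns an empty
// string if the line does not start with a quoted key.
std::string locale_key(const std::string &line) {
  if(line.empty() || line[0] != '"') return "";
  std::string::size_type end = line.find("\" = \"", 1);
  if(end == std::string::npos) return "";
  return line.substr(1, end - 1);
}

// Returns keys present in `current` but missing from `previous`, in file order.
std::vector<std::string> new_keys(const std::string &previous, const std::string &current) {
  std::set<std::string> known;
  std::istringstream prev(previous);
  std::string line;
  while(std::getline(prev, line)) {
    std::string key = locale_key(line);
    if(!key.empty()) known.insert(key);
  }

  std::vector<std::string> added;
  std::istringstream cur(current);
  while(std::getline(cur, line)) {
    std::string key = locale_key(line);
    if(!key.empty() && known.count(key) == 0) added.push_back(key);
  }
  return added;
}
```

Feeding it the contents of the old and new locale.cfg would give translators just the strings that still need work.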

Regarding the readme, it should be easier to translate than the shorter strings in many ways even if it is longer, but I agree we should at least wait until we're happy with the English version.
creaothceann
Seen it all
Posts: 2302
Joined: Mon Jan 03, 2005 5:04 pm
Location: Germany
Contact:

Post by creaothceann »

byuu wrote:I'd recommend shortening the joypad description for the time being, at least.
Done. :D

http://rapidshare.com/files/173687599/b ... le__v2.rar
(result)

EDIT: fixed download
byuu wrote:I'm sure it's already annoying getting blank locale files for each new version. Hopefully people are mostly just adding the missing strings to v037a's locale, rather than starting over each time :/
I just open the previous locale.cfg and the new one in Notepad++ and switch between them. Copying lines or even entire sections is easy.
Last edited by creaothceann on Tue Dec 16, 2008 2:06 am, edited 1 time in total.
ShadowFX
Regular
Posts: 265
Joined: Thu Jul 29, 2004 8:55 am
Location: The Netherlands

Post by ShadowFX »

Dutch translation for v0.038 ready:

DOWNLOAD!
FitzRoy
Veteran
Posts: 861
Joined: Wed Aug 04, 2004 5:43 pm
Location: Sloop

Post by FitzRoy »

byuu wrote: Pentium IV 1.7GHz takes a ~20% speed hit from the last version (51->41)
I wouldn't worry about ancient processor classes that couldn't get full speed to begin with. In two years, even the cheapest netbook will probably get full speed. That tells me that a massive writedown (beyond the IRQ trick) for PC architecture probably wouldn't be worth the trouble. Popular ultra-portables like the PSP are where a higher caliber SNES emulator is really needed. I'm not really sure when the next iteration of handhelds is.
byuu

Post by byuu »

Anyone want to help come up with some better names for the new PPUcounter class?

Code:

class PPUcounter {
  //I like these two names ... they convey that S-CPU runs ahead of S-PPU
  void tick();  //advance S-CPU
  void tock();  //advance S-PPU

  bool field();  //S-CPU current field value (0 = even, 1 = odd)
  uint16 vcounter();  //S-CPU current vertical counter
  uint16 hcounter();  //S-CPU current horizontal counter
  uint16 hdot();  //S-CPU current horizontal dot (pixel) position
  uint16 lineclocks();  //S-CPU number of clocks on this scanline

  bool field(unsigned n);  //S-CPU field value 'n' clocks ago [history buffer]
  uint16 vcounter(unsigned n);
  uint16 hcounter(unsigned n);

  bool ppufield();  //S-PPU current field value
  uint16 ppuvcounter();
  uint16 ppuhcounter();
  uint16 ppulineclocks();
} ppucounter;
Because yeah, ppucounter.ppuvcounter() looks like ass :(

A shame, I could easily do something like this:

Code:

uint16 PPUcounter::vcounter(unsigned n = 0) {
  return co_active() == thread_cpu ? cpu_vcounter(n) : ppu_vcounter(n);
}
I.e., automatically detect the active thread and dynamically change what vcounter() returns transparently. Kind of like thread local storage for cooperative threads.

But the code is so sensitive that a simple comparison would cause a speed hit.

Anyway ... need something minimalist, clean, and with the least amount of repetition. The counters are part of the PPU, and I can technically subclass this inside the main PPU class, so that ppucounter.* becomes ppu.*.
FitzRoy wrote:Popular ultra-portables like the PSP are where a higher caliber SNES emulator is really needed. I'm not really sure when the next iteration of handhelds is.
Yeah, we really don't have a sweet spot between portability, compatibility and speed right now. 9x uses an older opcode-based model, ZSNES uses x86 assembler and SNESGT is closed source.

I'd really like to team up with some other people (maybe AamirM? ;) and work on something like that once I finish the cycle renderer. Would be fun working in a group via SVN or something.
Last edited by byuu on Mon Dec 15, 2008 8:12 pm, edited 1 time in total.
grinvader
ZSNES Shake Shake Prinny
Posts: 5632
Joined: Wed Jul 28, 2004 4:15 pm
Location: PAL50, dood !

Post by grinvader »

byuu wrote:Bla bla "goto is evil", whatever. Replace it with while(true) {} if it lets you sleep better.
I don't understand why you don't use it, really. Syntactically identical, takes less space to boot, and not using goto: win-win-win. ^^
Goto is not evil, since it's only a way to write a jump.
It usually indicates either a really special event where it's the only right thing to do, or the classic case that gave it a bad reputation (silly code). Typically you can write code that's 'aware' of the unavoidable jump and end up removing it altogether, so eventually gotos became synonymous with lazy/bad code.
And there's no shortage of lazy/bad coders, so the picture stuck. Every once in a while a skilled coder will use it, but that's not enough to bring it back from shameland where the morons made it drift.

Code:

big OR block
Interesting. Probably can condense it a fair bit with some effort. 1/600 is still a lot with millions of calls per second, nay ?
Everyone shut up and follow me!!

Code:

<jmr> bsnes has the most accurate wiki page but it takes forever to load (or something)
Pantheon: Gideon Zhi | CaitSith2 | Nach | kode54
byuu

Post by byuu »

grinvader wrote:I don't understand why you don't use it, really. Syntactically identical, takes less space to boot, and not using goto: win-win-win. ^^
The extra indentation in the main loop for absolutely no reason is just annoying, really. Especially in bigger modules like the S-DSP.

Used to have it where the cothread system would automatically re-enter the thread entry point upon return from it (rather than showing undefined behavior as now.) But yeah, that would be even more confusing to outsiders looking at the code.
grinvader wrote:Interesting. Probably can condense it a fair bit with some effort. 1/600 is still a lot with millions of calls per second, nay ?
An extra two or three compares ~16,000x a second (roughly 262 scanlines x 60 frames). But if it helps, I'll be happy to optimize it.

Code:

if((region() == 0 && interlace() == false && status.vcounter == 262)
|| (region() == 0 && interlace() == true  && status.vcounter == 263)
|| (region() == 0 && interlace() == true  && status.vcounter == 262 && status.field == 1)
|| (region() == 1 && interlace() == false && status.vcounter == 312)
|| (region() == 1 && interlace() == true  && status.vcounter == 313)
|| (region() == 1 && interlace() == true  && status.vcounter == 312 && status.field == 1)
)

vs:

if(status.vcounter == (313 - !region() * 50) - (!interlace() | (interlace() & status.field)))
Blech, absolutely evil x.x

I'd have to use more grouping layers in the if statement. It'd be harder to read, but should be doable. I'd honestly hope the compiler would optimize the redundant checks out on its own, but it probably doesn't.
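
For what it's worth, the two forms can be checked against each other mechanically. The sketch below lifts both conditions into free functions (the parameter passing and the names `wraps_verbose`, `wraps_condensed` and `lines_in_field` are assumptions for illustration) and counts upward from 0 the way tick() does. The two tests are not identical as pure predicates: with interlace on and field == 1, the verbose form also fires at 263 (or 313), while the condensed one does not. The counter never gets there, though, because both forms already wrap one line earlier.

```cpp
#include <cassert>

// The six-way OR from the post, with emulator state passed in as parameters.
bool wraps_verbose(int region, bool interlace, bool field, int vcounter) {
  return (region == 0 && interlace == false && vcounter == 262)
      || (region == 0 && interlace == true  && vcounter == 263)
      || (region == 0 && interlace == true  && vcounter == 262 && field == 1)
      || (region == 1 && interlace == false && vcounter == 312)
      || (region == 1 && interlace == true  && vcounter == 313)
      || (region == 1 && interlace == true  && vcounter == 312 && field == 1);
}

// The condensed comparison from the post.
bool wraps_condensed(int region, bool interlace, bool field, int vcounter) {
  return vcounter == (313 - !region * 50) - (!interlace | (interlace & field));
}

typedef bool (*WrapTest)(int, bool, bool, int);

// Counts up from 0 and returns the vcounter value at which the given test
// first fires, i.e. the number of scanlines in that field.
int lines_in_field(WrapTest wraps, int region, bool interlace, bool field) {
  int v = 0;
  do { v++; } while(!wraps(region, interlace, field, v));
  return v;
}
```

Running `lines_in_field()` with both tests over all eight region/interlace/field combinations gives the same line counts.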

I really want to understand why a simple math op, even 10 million times a second, is causing such a tremendous speed loss. You wouldn't think that'd cause a modern processor to even blink.

Seriously, this is half of bsnes' speed problem in 2kb of code. I can accept that I'm a bad programmer, sure. If someone can find a way to get it running faster, I'll happily merge their changes. I'm at a loss.
Last edited by byuu on Mon Dec 15, 2008 8:53 pm, edited 1 time in total.
creaothceann
Seen it all
Posts: 2302
Joined: Mon Jan 03, 2005 5:04 pm
Location: Germany
Contact:

Post by creaothceann »

byuu wrote:The extra indentation in the main loop for absolutely no reason
It's a visual cue. I'd prefer "while" for that reason.


EDIT: How about this code?

Code:

DestV := 262;
if region then Inc(DestV, 50);
if (not interlace) then begin
        if (status.vcounter = DestV)                            then goto proceed;
end else begin
        if (status.vcounter = DestV + 1)                        then goto proceed;
        if (status.vcounter = DestV    ) and (status.field = 1) then goto proceed;
end;
goto skip;
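
For readers who don't speak Pascal, a rough C++ rendering of the snippet above might look like this. The function name and parameters are made up for illustration: `pal` stands for region() == 1, and the `proceed` / `skip` labels become a boolean return value.

```cpp
#include <cassert>

// C++ rendering of the Pascal snippet. Returns true where the original would
// "goto proceed" and false where it would "goto skip".
bool frame_ends(bool pal, bool interlace, bool field, unsigned vcounter) {
  unsigned dest = 262;
  if(pal) dest += 50;                      // 312 for PAL
  if(!interlace) return vcounter == dest;  // progressive: fixed frame height
  if(vcounter == dest + 1) return true;    // interlace: one extra scanline
  return vcounter == dest && field;        // interlace, field 1: no extra line
}
```

The region is looked at once up front, so each call makes at most three comparisons against vcounter.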
Last edited by creaothceann on Mon Dec 15, 2008 9:15 pm, edited 2 times in total.
ZH/Franky

Post by ZH/Franky »

Again byuu, you are great. I now, yet again, have a new version of bsnes to (finish) Shin Megami Tensei on. I'm almost at the end.
Very soon, your wonderful emulator will be running Shin Megami Tensei 2.
Yes, SMT is my favourite SNES game.

A great reason to compile a new version of bsnes on my linux box too (on Windows I have .037a, but on Linux I'm still using .034; that's about to change).
sinamas
Gambatte Developer
Gambatte Developer
Posts: 157
Joined: Fri Oct 21, 2005 4:03 pm
Location: Norway

Post by sinamas »

Try putting this part:

Code:

      //this part is hit one in 680 calls, so optimizing it won't help much 
      status.hcounter = 0; 
      status.vcounter++; 
      if((region() == 0 && interlace() == false && status.vcounter == 262) 
      || (region() == 0 && interlace() == true  && status.vcounter == 263) 
      || (region() == 0 && interlace() == true  && status.vcounter == 262 && status.field == 1) 
      || (region() == 1 && interlace() == false && status.vcounter == 312) 
      || (region() == 1 && interlace() == true  && status.vcounter == 313) 
      || (region() == 1 && interlace() == true  && status.vcounter == 312 && status.field == 1) 
      ) { 
        status.vcounter = 0; 
        status.field = !status.field; 
      } 
 
      scanline();
in a separate non-inline function.

Consider merging counters into a single counter and using offsets from this counter for particular counters if the offsets don't need to be calculated every tick.

If possible, use a single counter and a single "next necessary update count" variable to batch updates. A hierarchical event system may be even better.
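
To make the "next necessary update count" idea concrete, here is a minimal sketch under simplifying assumptions: a fixed line length and one kind of event. Neither assumption holds for the real S-PPU counters, and all names are hypothetical.

```cpp
#include <cassert>

// Sketch of a single master counter plus a "next event" threshold.
struct BatchedCounter {
  unsigned long long clock;       // single free-running master counter
  unsigned long long next_event;  // absolute clock at which slow-path work is due
  unsigned period;                // clocks between events (one scanline here)
  unsigned events_run;            // how many times the slow path has fired

  explicit BatchedCounter(unsigned line_clocks)
  : clock(0), next_event(line_clocks), period(line_clocks), events_run(0) {}

  // Hot path: one add and one compare, however many clocks are added.
  void advance(unsigned clocks) {
    clock += clocks;
    while(clock >= next_event) {  // slow path, taken once per period
      events_run++;
      next_event += period;
    }
  }

  // Position within the current line, derived on demand instead of stored.
  unsigned hcounter() const {
    return (unsigned)(clock - (next_event - period));
  }
};
```

The per-line values become offsets from the master counter, so nothing but `clock` has to change on the common path.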

I'm guessing these things may be impractical with the current architecture or that they would require compromising clarity and maintainability. I'm just airing my thoughts here in case you'd find them helpful.

I don't have time to get detailed at the minute (or look at lots of bsnes code), but, if there's anything, feel free to ask and I'll try to answer as best as I can when I have time.
Last edited by sinamas on Mon Dec 15, 2008 9:41 pm, edited 1 time in total.
tukuyomi
Rookie
Posts: 39
Joined: Mon Aug 02, 2004 5:14 am
Contact:

Post by tukuyomi »

http://kuro-hitsuji.net/~tukuyomi/stuff ... french.zip
French locale for bsnes v0.038.
I left untranslated strings because of my lack of programming experience. If someone has the knowledge, feel free to modify my locale file.
Lines 175-179:

Code:

"Export memory" = "Export memory"
"Toggle S-CPU tracing" = "Toggle S-CPU tracing"
"Toggle S-CPU trace mask" = "Toggle S-CPU trace mask"
"Toggle S-SMP tracing" = "Toggle S-SMP tracing"
"Toggle S-SMP trace mask" = "Toggle S-SMP trace mask"
Line 226:

Code:

"Export data:" = "Export data:"
creaothceann
Seen it all
Posts: 2302
Joined: Mon Jan 03, 2005 5:04 pm
Location: Germany
Contact:

Post by creaothceann »

<offtopic>
Version 2 of the locale.cfg I posted was the wrong one; if you got that one please clear your caches and download again.
</offtopic>
byuu

Post by byuu »

creaothceann wrote:EDIT: How about this code?
Haven't learned Moonspeak yet, sorry :P j/k
ZH/Franky wrote:Very soon, your wonderful emulator will be running Shin Megami Tensei 2.
Just gotta work out the bugs :P
ZH/Franky wrote:Yes, SMT is my favourite SNES game.
Ugh, those games have horrible Huffman tables.

My favorites are SMT: DC Black+White, too bad they weren't for the SNES, or compatible with the SGB.
sinamas wrote:Try putting this part in a separate non-inline function.
tick() is only called in one place, but okay. I'll post results in a bit.

EDIT: no luck :(
With the E4600, I set frameskip to 9 to rule out S-PPU overhead. I get 160.5fps either way. The same thing with the above black magic vcounter position testing.
sinamas wrote:Consider merging counters into a single counter and using offsets from this counter for particular counters if the offsets don't need to be calculated every tick.
I wanted to do that, but unfortunately both the hcounter clocks per scanline, and vcounter scanlines per frame, change based on field and interlace settings.

I tried a trick where I treated all scanlines as 1364, rather than 1360 and 1364, and skipped an extra four on said scanline ... it helped speed up IRQ testing, but I couldn't get all of my edge case test ROMs to pass :/
sinamas wrote:If possible, use a single counter and a single "next necessary update count" variable to batch updates.
Batching could certainly work. I still need to write out each counter value in the history buffer due to aforementioned complexity in calculating the counter positions, so I'm not sure how much of a performance advantage that would offer ...

But batching is pretty easy by just detecting when the clocks to add will wrap past the end of the scanline, and then splitting that case into two batches. That may be worth it. It would also require range testing the IRQ latch positions as well (I test against the counter positions after each tick() call) ... which is what's given me nightmares in the past.
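
The splitting itself could look roughly like the sketch below. It is only an illustration: `LineCounter` is a made-up type, which line is the short 1360-clock one is left as a parameter, and frame wrap, the history buffer and the IRQ range tests (the hard part) are all omitted.

```cpp
#include <cassert>

// Hypothetical sketch: counters advance in batches, and a batch that would
// run past the end of the current scanline is split at the boundary.
struct LineCounter {
  unsigned hcounter;
  unsigned vcounter;
  unsigned short_line;  // the one line assumed to be 1360 clocks long

  explicit LineCounter(unsigned short_line_)
  : hcounter(0), vcounter(0), short_line(short_line_) {}

  unsigned lineclocks() const { return vcounter == short_line ? 1360 : 1364; }

  void add_clocks(unsigned clocks) {
    while(hcounter + clocks >= lineclocks()) {
      // first batch: consume what is left of this line, then start the next
      clocks -= lineclocks() - hcounter;
      hcounter = 0;
      vcounter++;
    }
    // second batch: the remainder lands inside the current line
    hcounter += clocks;
  }
};
```

Because the line length is re-read after every boundary, a mix of 1360 and 1364 clock lines does not need the "treat everything as 1364" trick.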
sinamas wrote:I'm guessing these things may be impractical with the current architecture or that they would require compromising clarity and maintainability.
It's more a problem of difficulty with the batch / range testing stuff. There were a good 50 or so pages in the old bsnes thread where I was trying to get that working and failed.

Given the relative obscurity of the code, and the tremendous impact to speed, if we could get it working I'd go with the speed over clarity in this particular case. I'm reasonably confident I have the IRQ timing perfected at this point.
sinamas wrote:I don't have time to get detailed at the minute (or look at lots of bsnes code), but, if there's anything, feel free to ask and I'll try to answer as best as I can when I have time.
Thanks, I appreciate the examples above.

I was hoping there'd be a way to use the same approach and still speed it up, but I realize that's fairly unlikely. Going to require a radically different approach to accelerate.
Last edited by byuu on Mon Dec 15, 2008 10:23 pm, edited 2 times in total.