SNES emulation speed

Exophase
Hazed
Posts: 77
Joined: Mon Apr 28, 2008 10:54 pm

Post by Exophase »

byuu: By the way, one other thing I noticed with your code. Are you using GCC to compile this? If so, it can't do much cross-module optimization, and even marking things like get_palette as inline won't get them inlined if they're defined in another module. You'd want to move inline functions that are used by multiple modules into header files instead.

Since you're calling this (or the direct color one) every pixel it could make a huge difference if it's turning out to be a function call (if you are indeed using GCC).
blargg
Regular
Posts: 327
Joined: Thu Jun 30, 2005 1:54 pm
Location: USA

Post by blargg »

byuu doesn't need no silly cross-module optimization when everything is compiled as a single translation unit (courtesy of #include)! Or at least it was that way last time I checked.
Exophase
Hazed
Posts: 77
Joined: Mon Apr 28, 2008 10:54 pm

Post by Exophase »

blargg wrote:byuu doesn't need no silly cross-module optimization when everything is compiled as a single translation unit (courtesy of #include)! Or at least it was that way last time I checked.
Oh, is that so.. never mind then.

Well, I'm sure an optimization for little endian platforms to do that 16-bit palette load in one go instead of two 8-bit ones wouldn't hurt. I know it's obvious, but it's such a small amount of conditional code (and almost everything is little endian these days).
byuu

Post by byuu »

blargg, I really don't understand the fetish C/C++ developers have for making object files out of every last source code file. And since I have files named memory.cpp in at least three different folders, yet put all my object files inside src/obj/, I'd have to come up with even more "clever" Makefile transformations to pull that off.

The S-PPU is a logical unit. It stands to reason the code is never going to grow 10x larger than it is, and the entire unit compiles in ~50ms, tops. And of course, as Exophase pointed out, GCC still can't handle C++ link-time code generation for cross-module inlining.

The only two objects that are painful to compile at 2-4 seconds each are scpu.o and ui_qt.o. And that's mostly because GCC is slow as hell; Visual C++ is 3-5x faster. If anything, building this way is otherwise faster: I don't have to spawn additional compilation processes, GCC doesn't have more files to link together, and header files don't have to be redundantly parsed again and again. And yes, aside from the below, I've been moving to avoid including any unnecessary header files.

Exophase, that's a real problem I have already. I inline every really critical function inside header files. This results in all kinds of interesting requirements for specific ordering of header files. E.g. sBus::read() calls Cheat::read(), so cheat.hpp has to be included first. Cheat calls something else, another calls something else after that, and you end up with PPUCounter that has to implement 'shadow' functions that do nothing but pull in variables from other classes so that I can inline the hottest part of the code in the header file.

If I had LTCG with GCC, I'd just remove it all and only release profiled LTCG builds.
Exophase wrote:Well, I'm sure an optimization for little endian platforms to do that 16-bit palette load in one go instead of two 8-bit ones wouldn't hurt.
I had that initially, specifically for palette reading, as a read16() macro that was created based off ARCH_LSB's define setting. Then I got rid of the global 'misc functions go here' bbase.h file in favor of nall (a template library, think Loki or boost), and ended up dropping the requisite functions. I recall experiencing not even a 1fps drop when I removed the macro, so I didn't worry about it.

Yes, I'm sure a dozen of these "imperceptible" changes would add up to a nice small speed boost. Something I should do for the sake of it, and in this case it'd help code readability:

Code:

uint16_t color = memory::vram[addr] + (memory::vram[addr + 1] << 8);
vs.

Code:

uint16_t color = read16LE(&memory::vram[addr]);
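
For reference, a minimal sketch of what a read16LE helper along these lines might look like. The read16LE and ARCH_LSB names come from the post above; the body is an assumption for illustration, not the actual bsnes or nall code.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: reads a 16-bit little-endian value from a byte buffer.
// ARCH_LSB is assumed to be defined by the build on little-endian hosts.
inline uint16_t read16LE(const uint8_t *p) {
#if defined(ARCH_LSB)
  // Little-endian host: a single 16-bit load. memcpy sidesteps alignment
  // and aliasing issues, and compilers typically reduce it to one load.
  uint16_t value;
  std::memcpy(&value, p, sizeof value);
  return value;
#else
  // Portable fallback: assemble the value from two 8-bit reads.
  return static_cast<uint16_t>(p[0] | (p[1] << 8));
#endif
}
```

Both branches return the same value on a little-endian host, so the macro only selects how the load is done, never what it yields.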
sinamas
Gambatte Developer
Posts: 157
Joined: Fri Oct 21, 2005 4:03 pm
Location: Norway

Post by sinamas »

Exophase wrote:That's not generating it automatically, you're still explicitly instantiating all the combinations in the table declaration.

However, with some additional setup you might be able to nest the instantiations at compile time and make the growth linear instead of exponential. If you can't do it cleanly with templates then you can probably do it with macros over templates.
This should work:

Code:

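// One specialization per combination of the three flags.
template<bool a, bool b, bool c> void func() {}
enum { N = 2 * 2 * 2 };
void (*funcs[N])();

// Fills funcs[n], then recurses to n + 1 at compile time.
template<int n>
void genFuncs() {
	// Explicit != 0: C++11 and later may reject an implicit int -> bool
	// conversion in a template argument.
	funcs[n] = func<(n & 4) != 0, (n & 2) != 0, (n & 1) != 0>;
	genFuncs<n+1>();
}

// Terminates the recursion.
template<>
void genFuncs<N>() {}

void initFunc() { genFuncs<0>(); }
// Runtime dispatch: the three flags form the table index.
void func(bool a, bool b, bool c) { funcs[a * 4 + b * 2 + c](); }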
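
To see the table in action, here is a small variant of the same idea in which each specialization records the flags it was compiled with. The lastRun variable is a hypothetical addition for illustration only. The explicit != 0 comparisons are there because compilers in C++11 mode or later may reject an implicit int-to-bool conversion in a template argument.

```cpp
// Hypothetical: remembers which specialization ran last, as an index.
int lastRun = -1;

// One specialization per flag combination; each knows its flags at compile time.
template<bool a, bool b, bool c>
void func() { lastRun = a * 4 + b * 2 + c; }

enum { N = 2 * 2 * 2 };
void (*funcs[N])();

// Fills funcs[n] and recurses, so the instantiations grow linearly in source.
template<int n>
void genFuncs() {
  funcs[n] = func<(n & 4) != 0, (n & 2) != 0, (n & 1) != 0>;
  genFuncs<n + 1>();
}

// End of the recursion.
template<>
void genFuncs<N>() {}

void initFunc() { genFuncs<0>(); }

// Runtime entry point: picks the matching specialization from the table.
void func(bool a, bool b, bool c) { funcs[a * 4 + b * 2 + c](); }
```

After a single call to initFunc(), a call such as func(true, false, true) lands in func<true, false, true> through funcs[5].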
Exophase
Hazed
Posts: 77
Joined: Mon Apr 28, 2008 10:54 pm

Post by Exophase »

Ha, nice one. One of those things I wouldn't be able to do with macros (I don't use C++ usually)
spiller
JSNES Developer
Posts: 43
Joined: Sun Mar 15, 2009 11:09 pm
Location: Ireland

Post by spiller »

Hi ZSNES folks. Thought I'd post back here and say hello and post a collage-like screenshot of something vaguely resembling progress that I achieved on my portable Java emulator in the last three weeks. It has no controllers, no HDMA, and half the SPC (I hate that bloody chip) is still unimplemented, but I have some vague semblance of a PPU now, though it has no sprites, no hi-res, no interlace, no overscan, no mode 7, no direct color mode, no extra tilemaps vertically, no color arithmetic, no clipping, and various glitches. But the program looks quite nice, right? I have big plans for the debugger, though most of that's still greyed-out/unimplemented.

Image
http://img171.imageshack.us/my.php?image=jsnes4.png
grinvader
ZSNES Shake Shake Prinny
Posts: 5632
Joined: Wed Jul 28, 2004 4:15 pm
Location: PAL50, dood !

Post by grinvader »

spiller wrote:looks quite nice, right?
hhahahahahaha

i like your style
皆黙って俺について来い!!

Code:

<jmr> bsnes has the most accurate wiki page but it takes forever to load (or something)
Pantheon: Gideon Zhi | CaitSith2 | Nach | kode54
spiller
JSNES Developer
Posts: 43
Joined: Sun Mar 15, 2009 11:09 pm
Location: Ireland

Post by spiller »

Boom yada yada! Boom de yada! :D :D :D

Image

Took me three days to pass that, but it fixed oodles of bugs.

BTW it turns out that a SNES emulator in Java is quite feasible, and *just about* fast enough!
Nach
ZSNES Developer
Posts: 3904
Joined: Tue Jul 27, 2004 10:54 pm
Location: Solar powered park bench

Post by Nach »

spiller wrote: BTW it turns out that a SNES emulator in Java is quite feasible, and *just about* fast enough!
Compile it using GCJ instead of Sun's JavaC, and use -O3 -march=native, and yeah it should be pretty fast if you didn't make huge programming blunders.
May 9 2007 - NSRT 3.4, now with lots of hashing and even more accurate information! Go download it.
_____________
Insane Coding
byuu

Post by byuu »

spiller wrote:Took me three days to pass that, but it fixed oodles of bugs.
Congrats! Geez, it took me nine months to get that thing passing.

Here's hoping you'll keep at it and make it into a really competitive offering :D
spiller wrote:BTW it turns out that a SNES emulator in Java is quite feasible, and *just about* fast enough!
Hmm, what processor speed and how many frames per second? Loose SA-1 / SuperFX support should be ~40% more demanding, would be interesting to know if those are possible on a top-end E8x00 with the JVM.
Nach wrote:Compile it using GCJ instead of Sun's JavaC, and use -O3 -march=native, and yeah it should be pretty fast if you didn't make huge programming blunders.
That's definitely cool. If it weren't for the pointer hate and lack of fibers, I'd really find Java compelling.

Speaking of which, anyone worried now that Oracle owns Java? IBM was damn foolish to let that happen.
Rashidi
Trooper
Posts: 515
Joined: Fri Aug 18, 2006 2:45 pm

Post by Rashidi »

byuu wrote:Speaking of which, anyone worried now that Oracle owns Java? IBM was damn foolish to let that happen.
Oracle finally buys Sun, huh.
Since Oracle is well known for its bloatware products, Java will now get even more bloated than it already is.
  • Oh no, it's... it's... it's the bloat-power!!
spiller
JSNES Developer
Posts: 43
Joined: Sun Mar 15, 2009 11:09 pm
Location: Ireland

Post by spiller »

Nach wrote:Compile it using GCJ instead of Sun's JavaC, and use -O3 -march=native, and yeah it should be pretty fast if you didn't make huge programming blunders.
I'm sure it would. But I'm drawn by the portability of compiled class files. That's why I've chosen to learn Java at all. And seeing the animated Rareware logo and Donkey Kong Country title screen appear in Mozilla Firefox makes it all worth while!
byuu wrote:Congrats! Geez, it took me nine months to get that thing passing.
Looks like we got stuck on the same tests. :) $73 was awful. I never expected games to rely on the range over / time over stuff so I hadn't planned for it, then I tacked it on afterwards and spent hours trying to figure out why it wasn't working (there was the tiniest of errors in the tile range checking logic).

It was a huge relief when I finally successfully reached the end of the ~800 MB trace logs from the tests from each of Snes9x and my emulator. (It was a while before I even got as far as the "FAILED" screen. At first it just threw garbage onto the screen and got stuck. But those tests really were incredibly helpful.)
byuu wrote:Hmm, what processor speed and how many frames per second?
Like I say, *just about* fast enough. It's very tight. Currently, CPU emulation per frame is about 2.5 milliseconds. APU emulation (no DSP!) is about 0.3 milliseconds, and the PPU is about 2 to 6 milliseconds depending on visible background layers.

Blitting the generated bitmap to the screen is taking a thoroughly disappointing 4.5 milliseconds, even though it's just a simple 2x pixel resize. Turn on linear filtering and it jumps to 60 ms. Try to maximize the window and it can no longer keep up. I've done (almost) everything I can think of here but Java's drawImage method just seems to like to be deliberately slow. Ultimately, I may have to skip rendering some frames. :(

Anyway, for NTSC games at 60.09 fps that's 16.64 ms per frame leaving a 3 to 8 ms pause. I expect I'm going to need all of that spare time for the still unimplemented PPU effects (windows, color arithmetic, transparency, mode 7, etc.) and the DSP (currently, no sound is generated, and from what I've heard Java's audio pipeline is not well optimized).
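
Spelling out that arithmetic as a rough sketch: the figures are the ones quoted in this post, while the FrameBudget struct and its member names are hypothetical. With the quoted timings the slack works out to roughly 3.3 to 7.3 ms, in line with the "3 to 8 ms" estimate.

```cpp
// Per-frame timings in milliseconds, taken from the figures quoted above.
struct FrameBudget {
  double cpu = 2.5;     // CPU emulation
  double apu = 0.3;     // APU emulation, no DSP
  double ppuMin = 2.0;  // PPU, few visible background layers
  double ppuMax = 6.0;  // PPU, many visible background layers
  double blit = 4.5;    // 2x resize and blit to the screen
  double fps = 60.09;   // NTSC frame rate

  // Time available per frame: 1000 / 60.09, about 16.64 ms.
  double frameTime() const { return 1000.0 / fps; }
  double busyMin() const { return cpu + apu + ppuMin + blit; }
  double busyMax() const { return cpu + apu + ppuMax + blit; }
  // Idle time left over in the worst and best case.
  double slackMin() const { return frameTime() - busyMax(); }
  double slackMax() const { return frameTime() - busyMin(); }
};
```

Anything added later (DSP, color arithmetic, mode 7) has to fit inside slackMin() to keep full speed in the heaviest scenes.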

I still have a few little optimization tricks left to try though.

I'm on an AMD Sempron 2600+ clocked at 1.75 GHz. This should serve as a good low-end baseline that will force me to get this emulator well optimized. On a newer PC it'll run great.
byuu wrote:Loose SA-1 / SuperFX support should be ~40% more demanding, would be interesting to know if those are possible on a top-end E8x00 with the JVM.
I'm sure that would be possible, but you mentioned before that SA-1 support would require a cycle-stepped CPU. I don't want to worry about that yet, so I'm ignoring the SA-1. I do want to get the SuperFX working though, to run Star Wing/Fox.
byuu wrote:If it weren't for the pointer hate and lack of fibers, I'd really find Java compelling.
I don't miss pointers. What I miss are unsigned datatypes and structs (value types). Objects are fricking slow!!
Nach
ZSNES Developer
Posts: 3904
Joined: Tue Jul 27, 2004 10:54 pm
Location: Solar powered park bench

Post by Nach »

byuu wrote: Speaking of which, anyone worried now that Oracle owns Java?
Not in the slightest. So what if they own Java? Who cares who owns the language if there's plenty of compilers for it out there?

I'm more worried about Solaris, ZFS, DTrace, Open Office, and MySQL.
spiller wrote:
Nach wrote:Compile it using GCJ instead of Sun's JavaC, and use -O3 -march=native, and yeah it should be pretty fast if you didn't make huge programming blunders.
I'm sure it would. But I'm drawn by the portability of compiled class files. That's why I've chosen to learn Java at all. And seeing the animated Rareware logo and Donkey Kong Country title screen appear in Mozilla Firefox makes it all worth while!
You can still have a bytecode version for playing in your browser or portability (as much as the VM is portable), but it's nice to have a fast version too. It's not like you have to modify the source code.
byuu

Post by byuu »

Yeah, seriously.

Drop Swing for SWT so your UI doesn't look so alien, and offer some native compiled Win32 binaries, and this could become quite competitive :D

3x speed on a Sempron 2600+ in bytecode form is great. Should be even better compiled.
spiller wrote:I'm sure that would be possible, but you mentioned before that SA-1 support would require a cycle-stepped CPU.
Did I? I actually don't notice any stability issues even when I only sync per opcode. But even at the cycle level, it seems to take ~40% more CPU power than regular games on an in-order Atom, and ~80% more on a Core 2. That's because cache thrashing from synchronization is much worse on the Core 2.
spiller wrote:I do want to get the SuperFX working though, to run Star Wing/Fox.
That chip looks like a much bigger nightmare than the SA-1. Have to write and debug an entirely new opcode core, in addition to all the chip's extras.
spiller wrote:$73 was awful. I never expected games to rely on the range over / time over stuff so I hadn't planned for it, then I tacked it on afterwards and spent hours trying to figure out why it wasn't working
And when you implement frame-skipping, remember that not computing RTO will cause the electronics test to fail. Yet RTO computation eats up half the speed-up from frame-skipping in the first place.
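
A rough sketch of that point, under heavy simplification: the range over / time over check runs on every scanline, and only the drawing is skipped. The limits used are the SNES per-scanline sprite limits (32 sprites, 34 tiles of 8x8); the Sprite, LineFlags, evaluateLine and renderLine names are hypothetical, and details such as off-screen tiles and OAM priority are ignored.

```cpp
#include <vector>

// Hypothetical, simplified sprite record; a real PPU derives these from OAM.
struct Sprite {
  int y;           // top scanline
  int height;      // height in pixels
  int widthTiles;  // width in 8x8 tiles
};

struct LineFlags {
  bool rangeOver = false;  // more than 32 sprites on the line
  bool timeOver = false;   // more than 34 sprite tiles on the line
};

// Counts the sprites and sprite tiles that fall on one scanline.
LineFlags evaluateLine(const std::vector<Sprite> &sprites, int line) {
  LineFlags flags;
  int spriteCount = 0;
  int tileCount = 0;
  for (const Sprite &s : sprites) {
    if (line < s.y || line >= s.y + s.height) continue;  // not on this line
    if (spriteCount == 32) { flags.rangeOver = true; break; }
    ++spriteCount;
    tileCount += s.widthTiles;
  }
  flags.timeOver = tileCount > 34;
  return flags;
}

// Returns true if the line was drawn. The flags are computed either way,
// which is the cost that remains when a frame is skipped.
bool renderLine(const std::vector<Sprite> &sprites, int line,
                bool skipFrame, LineFlags &flags) {
  flags = evaluateLine(sprites, line);
  if (skipFrame) return false;
  // ... pixel output would go here ...
  return true;
}
```

The walk over the sprite list is what stays in the skipped-frame path, which matches the observation that RTO eats into the frame-skipping gain.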
Nach wrote:Not in the slightest. So what if they own Java? Who cares who owns the language if there's plenty of compilers for it out there?
They could pull a Microsoft.NET and make radical changes to it constantly until it bores them, and drop it like FoxPro, or J++, or VB6, or ...
Nach
ZSNES Developer
Posts: 3904
Joined: Tue Jul 27, 2004 10:54 pm
Location: Solar powered park bench

Post by Nach »

byuu wrote: They could pull a Microsoft.NET and make radical changes to it constantly until it bores them, and drop it like FoxPro, or J++, or VB6, or ...
So?

90% of the people out there are still programming according to Java 1.3, and if they make radical changes to it, who's going to bother with them?
Really, so what if they fork it?

As for J++ MS had to drop it because of Sun's lawsuit.

Look at C or C++, are we programming with them according to AT&T's initial implementations? Or do we follow the standards committee?

Look at ECMAScript/JavaScript, MS made their own "JScript" and hoped everyone would use it. People out there don't bother with their nonsense, because they'd rather follow standards and have a wider program, or they're copying examples from other web sites which don't bother with any JScript nonsense.

Since Java is primarily used in a VM across a bunch of platforms, if Oracle decides to go change things, which VMs are going to follow suit? If they don't follow, Oracle could say whatever it wants, but they're a fish out of water. Remember Sun's is not the only VM, MS and IBM have a VM too.


Really the big thing to worry about here is MySQL, why would Oracle want to allow that to live? If not for anything else, buying MySQL just to kill it off would probably be worth $7B, everything else is just icing on the cake.

We'll also probably be seeing an Oracle OS soon which is super-optimized for their DB from the ground up. Their stuff always ran best on Solaris, but now they can really tweak all the trade-offs that Sun worried about for other applications, which Oracle doesn't.
AamirM
Regen Developer
Posts: 533
Joined: Sun Feb 17, 2008 8:01 am

Post by AamirM »

Nach wrote:buying MySQL just to kill it off would probably be worth $7B, everything else is just icing on the cake.
It's GPL'ed, it can't be killed. Someone (some company maybe) will probably fork it.

@spiller:
Congrats on your achievement. :)
Nach
ZSNES Developer
Posts: 3904
Joined: Tue Jul 27, 2004 10:54 pm
Location: Solar powered park bench

Post by Nach »

AamirM wrote:
Nach wrote:buying MySQL just to kill it off would probably be worth $7B, everything else is just icing on the cake.
It's GPL'ed, it can't be killed. Someone (some company maybe) will probably fork it.
It can be killed in the sense they can pollute the market with flaky newer servers that people will attempt to upgrade to, and then lose data.

Doing so will make people want to switch to another SQL entirely after being burned by MySQL instead of looking for patches or forks.

Also, if it's done under the guise of good new versions, which slowly get worse and worse, the market as a whole will keep up with the versions until it just gets too bad.
Gil_Hamilton
Buzzkill Gil
Posts: 4294
Joined: Wed Jan 12, 2005 7:14 pm

Post by Gil_Hamilton »

AamirM wrote:
Nach wrote:buying MySQL just to kill it off would probably be worth $7B, everything else is just icing on the cake.
It's GPL'ed, it can't be killed. Someone (some company maybe) will probably fork it.
I've seen GPL projects die.

MySQL probably wouldn't, though.
pagefault
ZSNES Developer
Posts: 812
Joined: Tue Aug 17, 2004 5:24 am
Location: In your garden

Post by pagefault »

Is this project still being worked on? I ported ZSNES to java but I haven't gotten that far yet, we could collaborate.
Watering ur plants.
creaothceann
Seen it all
Posts: 2302
Joined: Mon Jan 03, 2005 5:04 pm
Location: Germany

Post by creaothceann »

vSNES | Delphi 10 BPLs
bsnes launcher with recent files list