"The more accurate comparison would have been no$GBA"
I always wonder how you guys gauge accuracy in closed source emulators. For all you know, there could be game-specific hacks for every single title. How would you know? You trust an author who already has something to hide?
Taking dynarecs one step further: Just-in-time assembly?
I said more accurate, not totally accurate...
>.>
Edit: Given his insanely detailed GBA documentation (http://nocash.emubase.de/gbatek.htm), I doubt he half-assed the emu completely...
Of course, I could be wrong.
-
- ZSNES Shake Shake Prinny
- Posts: 5632
- Joined: Wed Jul 28, 2004 4:15 pm
- Location: PAL50, dood !
byuu wrote:You said: "(observe for example, GBA emulator compatibility is far higher compared to the SNES)."
Not: "(observe for example, VBA emulator compatibility is far higher compared to ZSNES)."
Hence, I wasn't mincing anything with my words. If the latter was what you meant, how was I to predict your misspoken statement?
Hmm, now you're thinking with port... err, with a simili-Nach wyriwym sentence parsing. ;)
Everyone shut up and follow me!!
Pantheon: Gideon Zhi | CaitSith2 | Nach | kode54
Code: Select all
<jmr> bsnes has the most accurate wiki page but it takes forever to load (or something)
"I said more accurate, not totally accurate..."
That's still in line with what I was saying. How do you know it's more accurate, then?
"Hmm, now you're thinking with port... err, with a simili-Nach wyriwym sentence parsing. ;)"
Oh god, dear god no ... I've been talking to Nach so much that ... no! Lies! I will hear no more of this! >_<
byuu wrote:"I said more accurate, not totally accurate..." That's still in line with what I was saying. How do you know it's more accurate, then?
NOTE: THE FOLLOWING POST MAY CONTAIN FRANPA LEVELS OF BULLSHIT
Wouldn't hacks for every single GBA game take up a huge amount of space in the exe? I compared the size of no$GBA's exe (the latest free version, 2.6) to zsnes' exe.
Now, when you consider the fact that it also runs DS games with fairly high compatibility... It doesn't add up to me.
Of course, as mentioned in the disclaimer, I could very well be talking out my ass.
-
- ZSNES Shake Shake Prinny
- Posts: 5632
- Joined: Wed Jul 28, 2004 4:15 pm
- Location: PAL50, dood !
Hacks are usually small. Very small - it's just a matter of adding/skipping checks to do something different in specific cases, so it's somewhere in the line of 20 bytes of binary per hack tops, before upx'ing the hell out of it.
Hackish solutions are horribly inaccurate yet very simple, which is why they're used - if it was harder to hack through than do stuff right, no one would do it.
@byuu: consider it a good thing ? ;)
Even so, hacks for every single game would add up to over 150 KB, given that the GBA has over 1000 (possibly up to 2000) unique games, and that no$GBA also supports at least 100 DS games.
Edit: Might be a bit off here... Now, given 20 bytes per hack, and 2803 GBA dumps so far, that adds up to 56,060 bytes. However, considering that there are over 2000 DS games...
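For anyone who wants to redo the estimate, here is a quick sketch of the arithmetic. The 20 bytes per hack and the game counts are the figures quoted in this thread, not measurements, and the function name is made up for illustration.

```python
# Back-of-the-envelope size of a per-game hack table.
# The 20-byte figure and the game counts are the thread's own estimates.
BYTES_PER_HACK = 20


def hack_table_bytes(game_count, bytes_per_hack=BYTES_PER_HACK):
    """Total size in bytes if every game needed exactly one hack."""
    return game_count * bytes_per_hack


gba_bytes = hack_table_bytes(2803)  # 2803 GBA dumps
ds_bytes = hack_table_bytes(2000)   # "over 2000" DS games, taken as 2000
total_kib = (gba_bytes + ds_bytes) / 1024

print(gba_bytes, ds_bytes, round(total_kib, 1))
```

Under those assumptions the total stays under 100 KiB before any compression, so exe size alone says little either way.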
-
- Seen it all
- Posts: 2302
- Joined: Mon Jan 03, 2005 5:04 pm
- Location: Germany
- Contact:
NO$GMB was written in Assembler, so I'd be wary of size comparisons.
vSNES | Delphi 10 BPLs
bsnes launcher with recent files list
-
- ZSNES Developer
- Posts: 6747
- Joined: Tue Dec 28, 2004 6:47 am
-
- ZSNES Shake Shake Prinny
- Posts: 5632
- Joined: Wed Jul 28, 2004 4:15 pm
- Location: PAL50, dood !
Gonna insist on the fact you probably overestimate what a hack is.
Hi,
I wrote this on another forum some time ago and now copy pasting it here:
"...I tried to write a M68000 dynamic recompiler for my Nugen (Neo Geo, CPS1/2) and soon-to-be-released Regen (Sega Genesis) emus but soon realized that it was not worth it. The increased speed is there, but not by much (after an 800 MHz Pentium) over the fastest interpreted 68k emu, A68K, and its implementation is much harder than an interpreter. If I remember correctly, the author of the Generator emulator got similar results after writing a 68k recompiler for ARM processors. You can read his report at www.squish.net."
I am sure you can write a much faster emulator, in less time, by optimizing the video and sound hardware than by writing a CPU JIT compiler or dynarec. In Regen, most of the time (24% reported by gprof) is taken not even by the VDP but by the YM2612 emulation. But if you still want to go on, I would recommend that you try libjit (google it) to implement it, as it will save you much time.
stay safe,
AamirM
-
- Hazed
- Posts: 77
- Joined: Fri Mar 21, 2008 12:52 am
I perceive that my aims are being misinterpreted by most as dynarec as defined by past efforts. Let me be clear about what I'm trying to accomplish: I'm aiming for dynamic translation. You know how Babelfish translates text from one spoken language to another? That's what I'm aiming for, but in terms of computer programming. A faithful translation, I theorize, would incur minimal speed loss between platforms. Indeed, the more features are common between platforms, the greater the performance of the translated code.
But I would like to see how you implemented your dynarec, AamirM.
tcaudilllg2 wrote:I'm aiming for dynamic translation. You know how Babelfish translates text from one spoken language to another? That's what I'm aiming for, but in terms of computer programming.
Are you trying to do something like this ?
"move.l x, y" (M68000)
gets translated to:
"mov y, x" (x86)
So you are doing text processing?
Sorry, if I misinterpreted.
Hi,
Here is what I did in my dynarec. It's just an overview. My terminology may be different:
There were two parts in it, the frontend and the backend. The frontend mainly contained a basic interpreter (not using a jumptable), code to detect a code block (a code block can start at any instruction but will end when a PC-modifying opcode is encountered) and some other things. The backend is the compiler. It will compile the block (not individual instructions) to the native CPU's instructions to, in theory, perform the same operation. The native code was created on the stack on x86.
The interpreter was there so that if the dynarec was running on a CPU for which there was no backend, it would still run. Secondly, it was also used to run a block a specific number of times before that block was recompiled (for self-modifying code). Lastly, it was also used to verify and validate that the compiled and interpreted code were indeed doing the same thing.
Code block information was kept in a list which specified their range and some other things which were used mainly for optimizations. All memory accesses would go through a special function to see if a block was being modified, in which case that block was invalidated and its code flushed. This is a pretty brute-force method of handling self-modifying code.
The optimizations part was there to do optimizations such as dead flag calculation removal, but I miserably failed at doing it.
stay safe,
AamirM
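To make the overview above a bit more concrete, here is a rough toy sketch of the same scheme: blocks end at a PC-modifying opcode, a block is interpreted a few times before being "compiled", and every write goes through one function that flushes any block it touches. This is not AamirM's code. The two-opcode instruction set, the `Machine` class, the threshold value and the use of a Python closure as the stand-in for native code are all assumptions made up for illustration.

```python
# Toy sketch only: the instruction set, names and threshold are invented.
COMPILE_THRESHOLD = 2  # interpret a block this many times before compiling


class Machine:
    def __init__(self, program):
        self.mem = list(program)  # one (op, arg) tuple per address
        self.acc = 0
        self.pc = 0
        self.blocks = {}          # start pc -> {"end", "runs", "code"}

    def find_block_end(self, start):
        """A block ends at the first PC-modifying opcode (here: jmp)."""
        pc = start
        while self.mem[pc][0] != "jmp":
            pc += 1
        return pc

    def interpret_block(self, start, end):
        """Frontend: run the block one instruction at a time."""
        for pc in range(start, end + 1):
            op, arg = self.mem[pc]
            if op == "add":
                self.acc += arg
            elif op == "jmp":
                self.pc = arg

    def compile_block(self, start, end):
        """Stand-in for the backend: fold the whole block into one closure."""
        total = sum(arg for op, arg in self.mem[start:end] if op == "add")
        next_pc = self.mem[end][1]

        def native():
            self.acc += total
            self.pc = next_pc

        return native

    def write(self, addr, value):
        """All writes go through here so overlapping blocks get flushed."""
        self.mem[addr] = value
        stale = [s for s, b in self.blocks.items() if s <= addr <= b["end"]]
        for start in stale:
            del self.blocks[start]

    def step_block(self):
        start = self.pc
        if start not in self.blocks:
            end = self.find_block_end(start)
            self.blocks[start] = {"end": end, "runs": 0, "code": None}
        block = self.blocks[start]
        if block["code"] is not None:
            block["code"]()  # run the compiled version
            return
        self.interpret_block(start, block["end"])
        block["runs"] += 1
        if block["runs"] >= COMPILE_THRESHOLD:
            block["code"] = self.compile_block(start, block["end"])


m = Machine([("add", 1), ("add", 2), ("jmp", 0)])
for _ in range(3):
    m.step_block()           # two interpreted runs, then one compiled run
m.write(1, ("add", 10))      # self-modifying write flushes the block
m.step_block()               # block is rediscovered and interpreted again
print(m.acc)
```

The write check on every store is the brute-force part: it is simple and safe, but it is also where a lot of the speed gain can leak away.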
-
- Hazed
- Posts: 77
- Joined: Fri Mar 21, 2008 12:52 am
AamirM wrote:[tcaudilllg2 wrote: "I'm aiming for dynamic translation. You know how Babelfish translates text from one spoken language to another? That's what I'm aiming for, but in terms of computer programming."] Are you trying to do something like this ?
"move.l x, y" (M68000)
gets translated to:
"mov y, x" (x86)
So you are doing text processing?
Sorry, if I misinterpreted.
That, and adding in hardware simulation code in between the instructions where it would be invoked. (if the source platform and the target were markedly dissimilar.)
The idea is this:
- identify instruction, translate it to its equivalent on the target platform
- code the target platform to mimic the behavior of the source platform's hardware in response to the instruction. For example, a sprite flip bit on a console would need to be reproduced on a PC the same way it is on an emulator: by rearranging the sprite's data.
I wonder if your code would perform better without the self-modification checks. Absolutely the quest for accuracy would kill any performance gains otherwise obtained. (or so I suspect) That, and if you aren't translating as much of the hardware functionality as you can, you're not going to get good gains.
-
- ZSNES Shake Shake Prinny
- Posts: 5632
- Joined: Wed Jul 28, 2004 4:15 pm
- Location: PAL50, dood !
tcaudilllg2 wrote:For example, a sprite flip bit on a console would need to be reproduced on a PC the same way it is on an emulator: by rearranging the sprite's data.
... flipping doesn't rearrange anything. That's THE POINT. You just read the same data, but in a different direction.
If flipping tiles rearranged their data, you'd lose all the advantage of reusing the same data over and over (and you'd fill the OAM in no time).
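A minimal sketch of what "read the same data in a different direction" can look like when drawing one row of a tile. The function name, the flat list standing in for a scanline and the 8-pixel row are assumptions for illustration, not how any particular emulator does it.

```python
def draw_tile_row(row_pixels, dest, dest_x, hflip=False):
    """Copy one tile row into dest, reading right-to-left when hflip is set.

    The source row is never modified, so the same tile data can be reused
    by any number of flipped and unflipped sprites.
    """
    width = len(row_pixels)
    for x in range(width):
        src_x = width - 1 - x if hflip else x
        dest[dest_x + x] = row_pixels[src_x]


tile_row = [1, 2, 3, 4, 5, 6, 7, 8]   # one 8-pixel row of palette indices
scanline = [0] * 16

draw_tile_row(tile_row, scanline, 0)              # normal
draw_tile_row(tile_row, scanline, 8, hflip=True)  # flipped, same source data
print(scanline)
```

Only the read index changes; there is no temporary buffer and no second copy of the tile.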
-
- Hazed
- Posts: 77
- Joined: Fri Mar 21, 2008 12:52 am
grinvader wrote:[tcaudilllg2 wrote: "For example, a sprite flip bit on a console would need to be reproduced on a PC the same way it is on an emulator: by rearranging the sprite's data."] ... flipping doesn't rearrange anything. That's THE POINT. You just read the same data, but in a different direction. If flipping tiles rearranged their data, you'd lose all the advantage of reusing the same data over and over (and you'd fill the OAM in no time).
Indeed. But if you were going to port a SNES game to PC, would you still read the tile data from a different direction? Not likely, you'd probably flip it around in a buffer, and then replace the tile with its flipped version before blasting it to the display. How do you keep a record of the flip? By setting a flag which you correspond to the flip bit on the original hardware, and have the port refer to the bit by that.
-
- ZSNES Shake Shake Prinny
- Posts: 5632
- Joined: Wed Jul 28, 2004 4:15 pm
- Location: PAL50, dood !
tcaudilllg2 wrote:But if you were going to port a SNES game to PC, would you still read the tile data from a different direction? Not likely, you'd probably flip it around in a buffer, and then replace the tile with its flipped version before blasting it to the display.
Hmm, no.
I'd do exactly as it's done originally - read the tile data from the other way into the output display. No temporary buffer (waste of bytes, waste of time).
tcaudilllg2 wrote:How do you keep a record of the flip? By setting a flag which you correspond to the flip bit on the original hardware, and have the port refer to the bit by that.
That's about right.
-
- Hazed
- Posts: 77
- Joined: Fri Mar 21, 2008 12:52 am
On the matter of the PPU:
- if the memory address is immediate, in-code switches are not necessary to determine the PPU function to process
- if the memory address is accessed from a register, then full emulation (not simulation) of the entire external system is probably required, because we don't know what the contents of that register are.
Example from x86 ASM:
Code: Select all
mov ax, 0003h ; AH = 00h (set video mode), AL = 03h (mode number)
int 10h
In this case, we could get a pretty even translation. INT 10h takes its parameters in AX, so we know to get whatever is in AX as the parameter for the video mode change. ...Hmm, when I had conceived of this project, I had been thinking in terms of immediate addressing. I hadn't ever actually considered the use of indirect addressing, or what problems it would present.
The question is one of how much can be determined. To make the determinations, rules are necessary:
- the registers must be considered as variables in the produced source (that has already been established). If the code says 0x03 goes in AX, then the register most often used for purposes of interrupt addressing gets the value 0x03.
- the functions of the targeted machine must only use register variables to the extent that the source machine does.
The tricky part is that 0x03 may not be the text mode interrupt on the host machine, in which case you need a correspondence table to know what 0x03 corresponds to. When the interrupt is actually called, you leave 0x03 as it is... but you call instead with the value which corresponds to 0x03 on the target device.
Interrupts (and in/out operations) are simple... memory is more complicated.
When we write to, say, $2103, we are doing with memory what would on the PC be done with output operations. This brings with it a problem, in that we must treat RAM both as memory and hardware output. The solution is to equate the hardware RAM regions with the hardware itself. [more later]
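A rough sketch of the two ideas in this post: deciding at translation time whether an address is a known immediate or only known at run time, and keeping the guest register untouched while calling the host with a mapped value. The mode names in the table, both function names and the operand tuple format are hypothetical, invented for illustration.

```python
# Hypothetical guest-to-host mapping of video mode numbers (values invented).
VIDEO_MODE_MAP = {0x03: "host_text_80x25", 0x13: "host_gfx_320x200"}


def translate_store(address_operand):
    """Decide at translation time how a store can be handled.

    address_operand is ("imm", addr) for an immediate address, or
    ("reg", name) when the address is only known at run time.
    """
    kind, value = address_operand
    if kind == "imm":
        # Target is known now, so the right handler can be picked up front.
        return f"call handler for fixed address ${value:04X}"
    # Target is unknown until the code runs, so a run-time dispatch is needed.
    return f"dispatch at run time on register {value}"


def host_video_call(guest_regs):
    """Look up the host equivalent; the guest register keeps its own value."""
    guest_mode = guest_regs["AX"] & 0xFF  # low byte holds the mode number
    return VIDEO_MODE_MAP[guest_mode]


regs = {"AX": 0x0003}
print(translate_store(("imm", 0x2103)))
print(translate_store(("reg", "X")))
print(host_video_call(regs), hex(regs["AX"]))
```

The register-addressed case is the one that forces the fallback to a general run-time dispatcher, which is the problem raised above.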
-
- Seen it all
- Posts: 2302
- Joined: Mon Jan 03, 2005 5:04 pm
- Location: Germany
- Contact:
I'm not sure I understand you here...
tcaudilllg2 wrote:The question is one of how much can be determined. To make the determinations, rules are necessary:
- the registers must be considered as variables in the produced source (that has already been established). If the code says, &H03 goes in AX, then the register most often used for purposes of interrupt addressing gets the value 0x03.
- the functions of the targeted machine must only use register variables to the extent that the source machine does.
The tricky part is that 0x03 may not be the text mode interrupt on the host machine, in which case you need a correspondence table to know that 0x03. When the interrupt is actually called, you leave 0x03 as it is... but you call instead with the value which corresponds to 0x03 on the target device.
Note that the target machine has typically completely different hardware than the emulated machine, so direct translations of "function arguments" won't suffice. Instead you'll have to simulate the hardware too.
tcaudilllg2 wrote:When we write to say, $2103, we are doing with memory what would on the PC be done with output operations. This brings with it a problem, it that we must treat RAM both as memory and hardware output. The solution is to equate the hardware RAM regions with the hardware itself.
You should treat the read/write accesses as a CPU output facility ("in"/"out" with x86 ASM), and RAM/ROM only as components that are mapped into the address space.
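One way to picture that suggestion is a small bus object that every CPU read and write goes through, with RAM and the PPU registers as components mapped at address ranges. This is a heavily simplified sketch: the `Bus` class, the log that stands in for a PPU model, and the reduced address map (8 KiB of RAM at $0000-$1FFF, PPU registers at $2100-$213F) are assumptions for illustration only.

```python
class Bus:
    """Address-space dispatcher: RAM and hardware registers are mapped components."""

    def __init__(self):
        self.ram = bytearray(0x2000)  # assumed 8 KiB of work RAM at $0000-$1FFF
        self.ppu_log = []             # stands in for a real PPU model

    def write(self, addr, value):
        if addr < 0x2000:
            self.ram[addr] = value
        elif 0x2100 <= addr <= 0x213F:
            self.ppu_write(addr, value)
        # Other regions are ignored in this sketch.

    def read(self, addr):
        if addr < 0x2000:
            return self.ram[addr]
        return 0  # unmapped and open-bus reads are not modelled here

    def ppu_write(self, addr, value):
        """A register write is an output operation, not a memory store."""
        self.ppu_log.append((addr, value))


bus = Bus()
bus.write(0x0010, 0x42)  # plain RAM store
bus.write(0x2103, 0x01)  # lands in the PPU handler instead of RAM
print(bus.read(0x0010), bus.ppu_log)
```

With this layout the CPU core never needs to know whether an address is memory or hardware; the bus decides.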