bsnes v0.038 released

h4tred · Post by **h4tred** » Sat Jan 03, 2009 11:32 pm

Apparently in his lexicon, finding some shortcuts, not even order-of-magnitude shortcuts, equals "busted".

No, I am talking fully reverse engineered solutions to these hashes.

Anyway, PMed byuu with the relevant info.

FitzRoy · Post by **FitzRoy** » Sat Jan 03, 2009 11:55 pm

byuu wrote:I guess what I'm saying is ... thank you for crushing my hopes

Np

You make a good point about functioning in emulators... the longer the hash, the more that will need to be modified, hindering its ability to go unnoticed.

Still ... take every possible combination of 65-byte files, and I guarantee you there's a collision with one of them when using SHA-512.

We want to get the hash function right the first time, so we need to pick the best method that will stand up to time as possible. What if we went with a custom 4k-byte hashing function?

Besides the functionality test, remember that people are only going to be targeting SNES files with an SNES database. The relevant pool of data is not trillions of combinations, it's whatever users are distributing and downloading under the SNES umbrella. CRC32 and filesize alone would probably be sufficient, your worries are unfounded.

Tell you what, take a common bad checksum, spoof it to match the good version's CRC32, and if it works in your emulator for 60 seconds, we'll go to CRC+MD5. Then you'll have to do the same there and we'll use your proprietary one.

ecst · Post by **ecst** » Sun Jan 04, 2009 1:39 am

FitzRoy wrote:CRC32 and filesize alone would probably be sufficient, your worries are unfounded.

CRC is a transmission error detecting code (and a good and simple one for that), but was never intended to protect against malicious alteration of data.

FitzRoy wrote:Tell you what, take a common bad checksum, spoof it to match the good version's CRC32

In fact, give me any arbitrary file, choose any contiguous block of 32 bits for me to alter, and I will produce you a file having any CRC32 checksum you want.
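
The claim above follows from CRC32 being affine over GF(2): flipping a given input bit always XORs the same 32-bit pattern into the checksum, so 32 contiguous free bits give a solvable linear system. A minimal sketch, assuming the common zip/PNG CRC-32 variant (reflected polynomial 0xEDB88320); `crc32_bitwise` and `forge_crc32` are hypothetical names for illustration, not anything from bsnes or from ecst's note:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bitwise CRC-32 (reflected polynomial 0xEDB88320, the zip/PNG variant).
uint32_t crc32_bitwise(const std::vector<uint8_t>& data) {
  uint32_t crc = 0xFFFFFFFFu;
  for (uint8_t byte : data) {
    crc ^= byte;
    for (int i = 0; i < 8; i++) crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
  }
  return ~crc;
}

// Hypothetical helper: overwrites the four bytes at data[offset..offset+3]
// so that crc32_bitwise(data) == target. All other bytes are left untouched.
bool forge_crc32(std::vector<uint8_t>& data, std::size_t offset, uint32_t target) {
  assert(offset + 4 <= data.size());
  for (int i = 0; i < 4; i++) data[offset + i] = 0;
  const uint32_t base = crc32_bitwise(data);

  // Measure the effect of each patch bit on the CRC and fold it into an
  // XOR basis, remembering which patch bits produce each basis vector.
  std::array<uint32_t, 32> basis{}, combo{};
  for (int b = 0; b < 32; b++) {
    data[offset + b / 8] = uint8_t(1u << (b % 8));
    uint32_t v = crc32_bitwise(data) ^ base;
    uint32_t c = 1u << b;
    data[offset + b / 8] = 0;
    for (int k = 31; k >= 0 && v; k--) {
      if (!((v >> k) & 1u)) continue;
      if (!basis[k]) { basis[k] = v; combo[k] = c; break; }
      v ^= basis[k];
      c ^= combo[k];
    }
  }

  // Express (target ^ base) as an XOR of basis vectors.
  uint32_t want = target ^ base, patch = 0;
  for (int k = 31; k >= 0; k--) {
    if (!((want >> k) & 1u)) continue;
    if (!basis[k]) return false;  // not expected for 32 contiguous bits
    want ^= basis[k];
    patch ^= combo[k];
  }
  for (int i = 0; i < 4; i++) data[offset + i] = uint8_t(patch >> (8 * i));
  return true;
}
```

This version recomputes the CRC 33 times for clarity; a table-driven reverse CRC would do the same job in a single pass.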

FitzRoy wrote:and if it works in your emulator for 60 seconds

Should be clear by now that this is trivial to achieve.

FitzRoy wrote:we'll go to CRC+MD5.

... which will probably be not any more secure than MD5 alone.

FitzRoy wrote:Then you'll have to do the same there and we'll use your proprietary one.

Seriously, I suggest working with standard signed certificates. It is the chain of trust model that should be discussed.

FitzRoy · Post by **FitzRoy** » Sun Jan 04, 2009 3:55 am

ecst wrote:... which will probably be not any more secure than MD5 alone.

Simultaneously spoofing 2 different algorithms (plus the internal one) without affecting program functionality is no cakewalk. Show me the money.

Gil_Hamilton · Post by **Gil_Hamilton** » Sun Jan 04, 2009 10:46 am

FitzRoy wrote:Show me the money.

h4tred · Post by **h4tred** » Sun Jan 04, 2009 10:49 am

Simultaneously spoofing 2 different algorithms (plus the internal one) without affecting program functionality is no cakewalk.

How would you know?

Got any evidence?

... which will probably be not any more secure than MD5 alone.

That I have to agree with. CRCs are easily compromised, and so are hash algorithms like SHA1 and MD5. Which means CRC + MD5 is useless.

funkyass · Post by **funkyass** » Sun Jan 04, 2009 11:37 am

byuu wrote:More importantly, being able to index into each element in O(1) time would also be important to quickly find hashes without reading the entire database into memory and pre-parsing it.

How big can this db get?

We could also split the DB into say, 36 files, basically applying bin sorting at the file system level to speed things up a bit.
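
A rough sketch of that file-level binning, assuming the lookup key is an alphanumeric string and that the 36 bins are the digits plus the letters; `bucket_index`, `bucket_filename`, and the `db_NN.txt` naming are made up for illustration. Note that a hexadecimal hash would only ever land in the first 16 bins:

```cpp
#include <string>

// Maps '0'-'9' to 0-9 and letters (either case) to 10-35; -1 for anything else.
int bucket_index(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'z') return c - 'a' + 10;
  if (c >= 'A' && c <= 'Z') return c - 'A' + 10;
  return -1;
}

// Hypothetical naming scheme: "db_00.txt" through "db_35.txt".
// Returns an empty string when the key has no usable first character.
std::string bucket_filename(const std::string& key) {
  const int index = key.empty() ? -1 : bucket_index(key[0]);
  if (index < 0) return "";
  return "db_" + std::string(index < 10 ? "0" : "") + std::to_string(index) + ".txt";
}
```

Only the one file named by the key's first character would then need to be opened and searched.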

How bad could a forged hash be? It's not like we are making an OS for a nuclear power plant...

byuu · Post by **byuu** » Sun Jan 04, 2009 3:41 pm

New WIP, this one's fairly big as nightlies go.

First, moved the priority queue to a generic implementation so I can re-use it elsewhere in the future. Took a ~1% speed hit or so by using functors for the callback and using the signed math trick to avoid the need for a normalize() function. Sadly it gets up to 3% slower if the priorityqueue class code isn't placed right next to the CPU core.

Second, while I failed miserably at using the queues for IRQ / NMI testing, I did come up with a neat compromise. NMI is only tested once per scanline, IRQs only have PPU dot precision (every 4 clocks), the hold time for both is four clock cycles, and scanlines for both NTSC and PAL, even on the short colorburst scanline, are always evenly divisible by four.
... so testing every 2 clock cycles was kind of pointless, as it'd always be false. Since the delays between the PPU counter and CPU trigger for NMI is 2, and IRQ is 10, they even align again with an offset of 2.
... hence, I can call poll_interrupts() half as often by using if(ppu.hcounter() & 2). I reverse that for the Super Scope / Justifier dot testing and cut their overhead in half as well.

That gives us a nice ~10-15% speedup. Nowhere near the idealistic ~30-40% for range tested IRQs, because that only actually tests once per scanline (~1364 cycles). This just cuts ~682 tests down to ~341 tests. Still, it's pretty close to half as good while still being super clean and easy. It greatly diminishes the value of a range-based IRQ tester, as that will only offer a ~15-20% speedup now at best. Getting PGO working again is the new lowest-hanging fruit.
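
The 682 and 341 figures can be reproduced with a toy model, assuming the core advances two clocks per step and a 1364-clock scanline. `polls_per_scanline` is a hypothetical helper for counting only, not the actual bsnes scheduler:

```cpp
#include <cassert>

// Counts how often poll_interrupts() would run during one scanline in a
// simplified model where the clock advances two cycles per step.
// gate_on_bit1 mirrors the if(ppu.hcounter() & 2) test described above.
unsigned polls_per_scanline(unsigned scanline_clocks, bool gate_on_bit1) {
  assert(scanline_clocks % 4 == 0);  // scanlines are stated to be divisible by four
  unsigned polls = 0;
  for (unsigned hcounter = 0; hcounter < scanline_clocks; hcounter += 2) {
    if (!gate_on_bit1 || (hcounter & 2)) polls++;
  }
  return polls;
}
```

With the gate enabled, only the steps where hcounter is 2, 6, 10, ... reach the poll, which is exactly every other step.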

I also eked out a tiny bit more speed by adding some previously missed "else" statements in the irq_valid testing part.

With the newfound speed, I gave a tiny bit up (1-2%) to simplify and improve some old edge cases. It's known that IRQs won't trigger on the very last dot of each field. It's due to the way the V and H counters are misaligned, that we can't easily emulate.

So before I had a bunch of cruft to support that, update_interrupts() was called at the start of each scanline, which would call irq_valid() to run a bunch of tests to make sure the latch positions would actually work on hardware. Writes to $4207-420a would also call the update_interrupts() proc.

I killed all that, and now compute the HTIME position inline in poll_interrupts(), and perform the last dot check there. Since testing is ten clocks behind anyway, then we need only check to see if VTIME > 0 and ppu.vcounter(-6 clocks) == 0 to know that it was set for the last dot on any given field.

This gives us two nice perks for free: one, no more need to hard-code scanlines/frame inside the CPU core; and two, the old version was missing an edge case in interlace mode where odd fields would allow an IRQ on the last dot, which was simply because my old irq_valid() test didn't have a third condition for that.

All that said, I'm getting ~157.5fps instead of ~137.5fps now in Zelda 3.

Third, I removed grayscale/sepia/invert from the video settings panel, and stuck them in advanced. Used the new space to add checkboxes for NTSC merge fields and the start in fullscreen thing.

Reference:

Code:

//called once every four clock cycles;
//as NMI steps by scanlines (divisible by 4) and IRQ by PPU 4-cycle dots.
//
//ppu.(vh)counter(n) returns the value of said counters n-clocks before current time;
//it is used to emulate hardware communication delay between opcode and interrupt units.
alwaysinline void sCPU::poll_interrupts() {
  //NMI hold
  if(status.nmi_hold) {
    status.nmi_hold = false;
    if(status.nmi_enabled) status.nmi_transition = true;
  }

  //NMI test
  bool nmi_valid = (ppu.vcounter(2) >= (!ppu.overscan() ? 225 : 240));
  if(!status.nmi_valid && nmi_valid) {
    //0->1 edge sensitive transition
    status.nmi_line = true;
    status.nmi_hold = true;  //hold /NMI for four cycles
  } else if(status.nmi_valid && !nmi_valid) {
    //1->0 edge sensitive transition
    status.nmi_line = false;
  }
  status.nmi_valid = nmi_valid;

  //IRQ hold
  status.irq_hold = false;
  if(status.irq_line) {
    if(status.virq_enabled || status.hirq_enabled) status.irq_transition = true;
  }

  //IRQ test (unrolling the duplicate Nirq_enabled tests causes speed hit)
  bool irq_valid = (status.virq_enabled || status.hirq_enabled);
  if(irq_valid) {
    if((status.virq_enabled && ppu.vcounter(10) != (status.virq_pos))
    || (status.hirq_enabled && ppu.hcounter(10) != (status.hirq_pos + 1) * 4)
    || (status.virq_pos && ppu.vcounter(6) == 0)  //IRQs cannot trigger on last dot of field
    ) irq_valid = false;
  }
  if(!status.irq_valid && irq_valid) {
    //0->1 edge sensitive transition
    status.irq_line = true;
    status.irq_hold = true;  //hold /IRQ for four cycles
  }
  status.irq_valid = irq_valid;
}

FitzRoy · Post by **FitzRoy** » Sun Jan 04, 2009 10:01 pm

h4tred wrote:
Simultaneously spoofing 2 different algorithms (plus the internal one) without affecting program functionality is no cakewalk.
How would you know? Got any evidence?

It's not for me to prove, I'm not the one saying I can do it.

h4tred wrote:
... which will probably be not any more secure than MD5 alone.
That I have to agree with. CRCs are easily compromised, and so are hash algorithms like SHA1 and MD5. Which means CRC + MD5 is useless.

First of all, I at least know that the whole is far greater than the sum of its parts. You're making conclusions on separate assessments of each. To spoof one method, you have to change/rearrange data. To spoof the second method, you have to change the data again, but that will screw up the other spoof. Spoofing both simultaneously is much more difficult than spoofing each individually.

But it doesn't stop there as you and ecst erroneously keep suggesting. You'll then need to be able to do this while maintaining a good internal checksum and functionality of the game program. We're not trying to pick a lock here, we're trying to redecorate every home on the block without anyone noticing.

If one more person tries to get out of producing with talk, I'm calling the bout in my favor and we can finally move on with the format.

DataPath · Post by **DataPath** » Sun Jan 04, 2009 11:32 pm

FitzRoy wrote:To spoof one method, you have to change/rearrange data. To spoof the second method, you have to change the data again, but that will screw up the other spoof. Spoofing both simultaneously is much more difficult than spoofing each individually.

QFT.

Think about it - you can generate an md5 collision in about a minute. Ok, now all you need is for the CRC to check out. So you generate a matching CRC and... the MD5 is now wrong.

What you'd have to do is generate MD5 collisions over, and over, and over, until you happen to get one where the CRC checks out. On a laptop it would take, at most, brute force, 8,000 years.
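
The 8,000-year figure can be checked with quick arithmetic, assuming one MD5 collision per minute and roughly 2^32 attempts before a 32-bit CRC happens to match. `brute_force_years` is a hypothetical helper, and this only models the naive retry loop described here:

```cpp
#include <cassert>
#include <cmath>

// Years needed for 2^bits attempts at a fixed cost per attempt.
double brute_force_years(int bits, double minutes_per_attempt) {
  assert(bits >= 0);
  const double minutes_per_year = 60.0 * 24.0 * 365.25;
  // std::ldexp(x, n) computes x * 2^n.
  return std::ldexp(minutes_per_attempt, bits) / minutes_per_year;
}
```

`brute_force_years(32, 1.0)` comes out a little under 8,200 years, in line with the rounded number in the post.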

Ok, so let's say you're Sony, and you REALLY REALLY hate those rippers out there. Your mission is to pollute the pool with bad images that won't be detected until they're tried. You make a cluster of those PS3s you're having trouble selling, let's say about 100 of them.

Now you're talking. It's down to a maximum of 10 years to get an MD5-CRC collision. Well, you'd like to pollute the pool while the games are still current, so you need SOMETHING else.

Well, usually when you combine hashing methods like this, shortcuts tend to open up. Well, you hire a cryptographer who spends a couple of months and $100,000 of your money (good cryptographers don't work for free, ya know), and he comes up with a 5-order-of-magnitude improvement in finding MD5-CRC collisions. Hallelujah! You've got it down to 45 minutes to produce an MD5-CRC collision, and it only took you $100,000 worth of cryptographer time and a cluster of 100 PS3s.

You're right. It's not worth the effort - it's too easily broken.

h4tred · Post by **h4tred** » Mon Jan 05, 2009 12:17 am

Do I have to risk getting banned just to prove you ignorant idiots wrong?

http://jardinezchezjb.free.fr/

Read the source code in those keygens, then say what you just said.

If one more person tries to get out of producing with talk, I'm calling the bout in my favor and we can finally move on with the format.

It's not your call, is it? I thought byuu wanted control.

FitzRoy · Post by **FitzRoy** » Mon Jan 05, 2009 1:43 am

Are you even on the same subject as us? This isn't a debate about general security of hashes for all applications. There is no door and key system here, all we have is an open party with a guestbook. It's pre-filled to admit the most desirable guests the fastest, and we've identified them by as many characteristics as necessary. Guests who aren't in the book aren't locked out, they simply have to write themselves in after the party starts or bring a certain card.

Now if you want to sit here and say that uninvited guests in a free walk-in party are going to try and masquerade as someone on the initial list, you're already out of your mind. Then to say that they can surgically change themselves to do so without falling dead on the dance floor is even more insane.

That I have to resort to analogies with seasoned programmers is ridiculous. I have absolutely no business winning this kind of argument.

byuu · Post by **byuu** » Mon Jan 05, 2009 12:33 pm

Another WIP, but nothing visible to end users. Still get it if you don't have 07 for the nice speedup.

Mostly source-cleaning stuff.
- removed 'uint' type, replaced all instances with the proper unsigned int.
- removed as many headers as I could from the global interface.hpp file, including only in the cores that need each of them. Should help compile time. Though I still have a lot of global header includes due to needing ultra-hot sections of code inlined.
- added include protection bumpers to the CPU+SMP opcode core generated files
- added const-correctness to a few more classes.
- updated S-RTC and SPC7110 time to handle time_t overflow: it's now Y2K38 proof even on 32-bit signed time_t systems, and the file format remains unchanged. But it adds one limitation that you'll lose your time if you wait ~34 years before loading your last save game. I think that's reasonable for now. Once 64-bit time_t systems are ubiquitous, we should be able to trivially expand that without breaking old saves.

Relevant code (I tested with int16_t, uint16_t, int32_t, uint32_t, int64_t and uint64_t):

Code:

  time_t diff
  = (current_time >= rtc_time)
  ? (current_time - rtc_time)
  : (std::numeric_limits<time_t>::max() - rtc_time + current_time + 1);  //compensate for overflow
  if(diff > std::numeric_limits<time_t>::max() / 2) diff = 0;            //compensate for underflow

Avoided the obvious (y-x)&<time_t>::max() just in case there's some crazy platform where the value != (some power of 2)-1. Modulus (max()+1) won't work there either, as it'll overflow if sizeof(unsigned) == sizeof(time_t). The +1 might throw it off by a second on one's complement system, but I don't really care :P
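
For anyone wanting to try the overflow handling against different integer widths, the snippet can be wrapped in a template. This is a sketch only: the `elapsed_seconds` name and the template wrapper are assumptions, and it assumes both timestamps are non-negative:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Same logic as the snippet above, parameterised on the timestamp type T.
template<typename T>
T elapsed_seconds(T rtc_time, T current_time) {
  // The wrap formula assumes non-negative timestamps.
  assert(rtc_time >= T(0) && current_time >= T(0));
  T diff = (current_time >= rtc_time)
    ? T(current_time - rtc_time)
    : T(std::numeric_limits<T>::max() - rtc_time + current_time + 1);  // compensate for overflow
  if (diff > std::numeric_limits<T>::max() / 2) diff = 0;              // compensate for underflow
  return diff;
}
```

With `uint16_t`, a saved time of 65530 and a current time of 4 gives 10 seconds elapsed across the wrap, while a saved time of 100 and a current time of 90 (clock set backwards) is clamped to 0.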

Anyone with GCC 4.3 want to try something for me? Try modifying src/lib/nall/platform.hpp and change #define alwaysinline __attribute__((always_inline)) to:

Code:

#define alwaysinline __attribute__((always_inline)) __attribute__((hot))

... and let me know the FPS difference you get in some arbitrary game, please :D

It's supposed to be like manual-PGO.

ecst · Post by **ecst** » Mon Jan 05, 2009 4:04 pm

FitzRoy wrote:All you really need is two existing checksum methods. Fraud and collision is next to impossible with CRC+MD5 documentation. If by some miracle someone had the motivation and skill to brute force both, the file would be so butchered that it would never propagate in distribution channels. Because it wouldn't function in emulators.

FitzRoy wrote:Simultaneously spoofing 2 different algorithms (plus the internal one) without affecting program functionality is no cakewalk. Show me the money.

Somehow, this statement made me feel challenged. I wrote a small note explaining my thoughts on the subject. I would appreciate someone checking it and reporting typos, mistakes, or obscurities.

I kept the note general, not just down to SNES ROMs. The bottom line is, we can produce MD5 plus CRC32 collisions using only 4224 bytes. The rest of the ROM files could be chosen arbitrarily (even different from each other, under some heavily increased computational complexity, i.e. using one additional MD5 chosen-prefix attack).

To date, no general MD5 preimage/spoofing attack is known. But if one were, it would immediately extend to a preimage attack for MD5 plus CRC32, similarly requiring only a few kilobytes to be set adequately, with the bulk of the ROM chosen to the attacker's liking. Put more directly, the attacker could assemble a ROM completely as he wishes, as similar or dissimilar to the original as he wants; he only needs to reserve a few kilobytes (really not a problem) for the preimage attack.

Thus, MD5 plus CRC32 is not any more secure than MD5 alone.

Keep in mind that I am by no means a professional cryptographer. These people (or some motivated hackers) might be able to find efficient attacks for combinations of real hash functions (not like CRC32). The thing is, you can never be sure that the security design you propose is secure, unless you really prove it (now that would be something innovative) or have some knowledgeable people check it (and even then you cannot guarantee that there can be no loophole). Claiming security until being refuted is really the wrong way.

DataPath wrote: Well, usually when you combine hashing methods like this, shortcuts tend to open up.

Exactly.

DataPath wrote: Well, you hire a cryptographer who spends a couple of months and $100,000 of your money (good cryptographers don't work for free, ya know), and he comes up with a 5-order-of-magnitude improvement in finding MD5-CRC collisions. Hallelujah! You've got it down to 45 minutes to produce an MD5-CRC collision, and it only took you $100,000 worth of cryptographer time and a cluster of 100 PS3s.

Down to less than 1 minute actually. Shall I PM you my bank account details? Would not mind the PS3s, too.

FitzRoy wrote:First of all, I at least know that the whole is far greater than the sum of its parts.

In this case, the only thing you can really be sure of is that the sum is at least as great as the greater of both parts.

FitzRoy wrote:But it doesn't stop there as you and ecst erroneously keep suggesting. You'll then need to be able to do this while maintaining a good internal checksum and functionality of the game program.

Have not noticed me suggesting that. Anyway, functionality is not a problem, see above. And I have no doubt that MD5 plus CRC32 plus internal checksum is just as secure (or insecure) as MD5. In fact, not thinking any further about it, applying standard chaining methods, there is a reduction needing only 128k of ROM space to be chosen as computed. Let me know if you wish to know the technical details.

byuu · Post by **byuu** » Mon Jan 05, 2009 6:16 pm

I'd like to update the cheat code section at some point.

Nach was mentioning a while back how he wanted to do fancy things, and one of them I wanted to do as well. Sometimes you'll have two or more cheat codes to cause one effect. I'd like to be able to group these codes so that you only have to toggle one of them for the effect.

My idea is to show normal codes as now, but multi-codes in the listbox with "CODE-1ONE+..." or "<group>". Replace the code+description boxes with a popup box when you hit add code that has something like 3-8 lines for a list of codes. Only activate the "Ok" box when all the code boxes are either valid or blank. Then change toggle status to edit code. Maybe stick an enabled checkbox on that window for those who don't realize you can double click listbox entries to toggle status.
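
The "Ok" rule for the popup could look roughly like this. It is a sketch under assumptions: `is_valid_code` is a simplified stand-in that accepts 8 hex digits or the XXXX-XXXX hex form, and is not bsnes's real cheat decoder; all names here are hypothetical:

```cpp
#include <cctype>
#include <string>
#include <vector>

// True if every character of s is a hexadecimal digit.
static bool all_hex(const std::string& s) {
  for (unsigned char c : s) if (!std::isxdigit(c)) return false;
  return true;
}

// Simplified validity check (assumption): "AAAAAADD" or "XXXX-XXXX", hex only.
bool is_valid_code(const std::string& s) {
  if (s.size() == 8) return all_hex(s);
  if (s.size() == 9 && s[4] == '-') return all_hex(s.substr(0, 4)) && all_hex(s.substr(5));
  return false;
}

// The "Ok" button is enabled only when every line is either blank or valid.
bool ok_enabled(const std::vector<std::string>& lines) {
  for (const auto& line : lines) {
    if (!line.empty() && !is_valid_code(line)) return false;
  }
  return true;
}
```

As written, a popup with every line blank would also enable "Ok"; whether that should be allowed is a separate design question.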

Quick VBA mock-up:

Any comments on this before I try it?

King Of Chaos · Post by **King Of Chaos** » Mon Jan 05, 2009 6:22 pm

byuu wrote:Sometimes you'll have two or more cheat codes to cause one effect. I'd like to be able to group these codes so that you only have to toggle one of them for the effect.

YES. This is one of the things I hate most, multi-codes that take up multiple lines. This would really come in handy. Byuu, I like your idea. Go for it and see what happens, I'll be happy to give feedback on it.

Also, the ability to edit cheats within the little editor would really be nice too (so I don't have to delete cheats to edit them only just to re-add them).

tetsuo55 · Post by **tetsuo55** » Mon Jan 05, 2009 6:43 pm

Finally got around to testing bsnes WIP8.

The good news is, unthrottled I get about 80fps average with the Chrono Trigger opening video (low 72, high 95).
When sync to audio is enabled I get an average of 57, almost playable; enabling 1 frameskip is enough to keep it constantly at 60.
If I also enable sync to video, the framerate drops to 43ish.

Very close to full playability on an AthlonXP 2600 @ 2GHz

King Of Chaos · Post by **King Of Chaos** » Mon Jan 05, 2009 6:54 pm

byuu wrote:Quick VBA mock-up:

Awesome. Exactly how it should be.

Another thing to eventually consider down the line is codes that have multiple values for a desired effect (like having a specific weapon, level, etc).

Example: This Gun Modifier code for Lethal Enforcers...

7E1FBC??

Where the ?? can be any of the following values...

02 - Have Grenade Launcher
04 - Have Magnum
06 - Have Shotgun Shell
08 - Have Automatic
0A - Have Glock .45
0C - Have Uzi
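
Filling in such a template is simple enough to sketch. `fill_cheat_value` is a hypothetical helper, and it assumes the placeholder is always a literal "??" standing for one byte:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Replaces the first "??" in a code template with a two-digit hex value.
// Returns an empty string if the template has no placeholder.
std::string fill_cheat_value(const std::string& pattern, uint8_t value) {
  const std::size_t pos = pattern.find("??");
  if (pos == std::string::npos) return "";
  static const char digits[] = "0123456789ABCDEF";
  std::string code = pattern;
  code[pos] = digits[value >> 4];        // high nibble
  code[pos + 1] = digits[value & 0x0F];  // low nibble
  return code;
}
```

Picking "Have Uzi" from the list would turn 7E1FBC?? into 7E1FBC0C, so the user only ever stores the one template.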

One of the only emulators I've seen that addresses that 'issue' (thus avoiding putting in multiple codes for values) is Project64.

tetsuo55 · Post by **tetsuo55** » Mon Jan 05, 2009 7:07 pm

That project64 screenshot looks almost perfect as far as a cheat screen goes.

King Of Chaos · Post by **King Of Chaos** » Mon Jan 05, 2009 7:10 pm

tetsuo55 wrote:That project64 screenshot looks almost perfect as far as a cheat screen goes.

I made a mockup of it that's simple and clean looking (will edit the text for effect in a bit).

EDIT: Mockup updated to reflect SNES code.

byuu · Post by **byuu** » Mon Jan 05, 2009 7:21 pm

One of the only emulators I've seen that addresses that 'issue' (thus avoiding putting in multiple codes for values) is Project64.

... yeah. The fact that it took me like two weeks to figure out listboxes in GTK+ ... no way in hell I'm trying tree-view node listboxes with checkboxes on each entry. No way.

That also looks ridiculously complicated. Cool, but way too much stuff going on. Maybe there's a way to do something similar, but less complicated?

I can do the multi-line description thing no problem.

King Of Chaos · Post by **King Of Chaos** » Mon Jan 05, 2009 7:28 pm

Well, screw the tree-view listboxes, it's the actual cheat editor on the right side of it that's appealing (I guess I should have edited that out). There's an updated edited proof-of-concept image above.

tetsuo55 · Post by **tetsuo55** » Mon Jan 05, 2009 7:33 pm

Is bsnes aware of the fact that the original SNES outputs an image with a luminance of 16..235 and a PC monitor uses 0..255?
If so, how does it handle this?

byuu · Post by **byuu** » Mon Jan 05, 2009 8:18 pm

Where do you come up with this stuff? :P

blargg · Post by **blargg** » Mon Jan 05, 2009 8:44 pm

Sounds like he's been reading up on digital video encoding of luminance, which has some extra range for overshoot and reserved values, and made the (incorrect) leap that the SNES has a digital video interface. The answer is no, it doesn't. The hardware converts ideal RGB values to video signal(s), but most emulators stop at the ideal RGB values, so the video encoding is irrelevant.