ZSNES 1.51 on Win32 patch to make Vista/7 friendly

Post by **odditude** » Wed Oct 21, 2009 9:41 pm

possible solution: on startup, check for presence of .\bsnes.cfg. if not present, check for %appdata%\bsnes\bsnes.cfg. if not present, pop a dialog asking if bsnes should use shared settings for all users or independent settings for each user; generate bsnes.cfg in the proper location based on the user selection. this way, the user doesn't need to hunt either way.
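A minimal sketch of that startup decision, assuming the two file-existence checks are done elsewhere and passed in as flags. The enum and function names are hypothetical and are not taken from bsnes or ZSNES.

```c
/* Where the config file should live (hypothetical names). */
typedef enum {
    CFG_LOCAL,     /* .\bsnes.cfg next to the executable, shared by all users */
    CFG_PER_USER,  /* %appdata%\bsnes\bsnes.cfg, one copy per user */
    CFG_ASK_USER   /* neither file exists yet: show the dialog */
} cfg_location;

/* Follow the order described in the post: the local file wins,
   then the per-user file, otherwise the user has to be asked. */
cfg_location choose_cfg_location(int local_exists, int per_user_exists)
{
    if (local_exists)
        return CFG_LOCAL;
    if (per_user_exists)
        return CFG_PER_USER;
    return CFG_ASK_USER;
}
```

After the dialog, the caller would create bsnes.cfg in whichever location the user picked, so the question is only asked once.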

Post by **kode54** » Wed Oct 21, 2009 10:58 pm

Also, preferences dialog option to open the folder containing the configuration file, possibly highlighting it in the process. Foobar2000 does this already.

Although, good luck finding a cross-platform way of opening a folder with the default file manager.

Post by **gblues** » Fri Oct 23, 2009 5:51 pm

Well, since the paths are stored in the configuration files, I've found I will need to update the config file to use UNICODE format, and the file i/o routines accordingly. So I'm going to be delving into parsegen.cpp.

Can anyone shed light on what exactly char_array_pack() and char_array_unpack() are doing? It looks like char_array_pack() is run-length encoding any leading and terminal NUL characters, and I'm just curious why ZSNES does this rather than just shifting the contents to the start of the array and starting from 0?

Here's the relevant code, from psrtemp_cfg.c (generated from parsegen from cfg.psr):

static char *char_array_pack(const char *str, size_t len)
{
  char packed[LINE_LENGTH];
  char *p = packed;
  while (len)
  {
    if (*str)
    {
      size_t length = strlen(str);
      strcpy(p, encode_string(str));
      str += length;
      len -= length;
      p += strlen(p);
    }
    else
    {
      size_t i = 0;
      while (!*str && len)
      {
        i++;
        str++;
        len--;
      }

      sprintf(p, "0%s", encode_string(base94_encode(i)));
      p += strlen(p);
    }
    *p++ = '\\';
  }
  p[-1] = 0;
  strcpy(line, packed);
  return(line);
}

Post by **grinvader** » Fri Oct 23, 2009 6:09 pm

We encode bytestrings in base 94. If you can't see what it's good for, don't use the function.

Post by **byuu** » Fri Oct 23, 2009 8:12 pm

base94?! Wow, that's totally awesome!
I know first-hand how insane non-power-of-two base encoding / decoding can be. Having to add/subtract and keep track of carries and all. Yeah, easy to Nach, I'm sure :P

Post by **grinvader** » Fri Oct 23, 2009 9:46 pm

Easy to anyone who remembers what bases are, really.
Converting between power-of-2 bases allows shortcuts, but that doesn't make the general algo any harder.
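The general algorithm is repeated division by the base, collecting remainders as digits. Here is a minimal sketch for base 94, assuming the digits are the printable ASCII characters '!' (33) through '~' (126), most significant digit first. Both the digit alphabet and the digit order are assumptions, and base94_encode_sketch is a made-up name, not the ZSNES base94_encode().

```c
#include <stddef.h>

/* Encode n in base 94 as a NUL-terminated string.
   Returns the number of digits written, or 0 if out is too small.
   Assumption: digit value d is stored as the character '!' + d. */
size_t base94_encode_sketch(unsigned long n, char *out, size_t out_size)
{
    char tmp[32];   /* plenty: a 64-bit value needs at most 10 base-94 digits */
    size_t len = 0;

    do {
        tmp[len++] = (char)('!' + n % 94);  /* least significant digit first */
        n /= 94;
    } while (n != 0);

    if (len + 1 > out_size)
        return 0;

    for (size_t i = 0; i < len; i++)        /* reverse into the output */
        out[i] = tmp[len - 1 - i];
    out[len] = '\0';
    return len;
}
```

With this alphabet, 0 encodes as "!", 93 as "~", and 94 as the two digits '"' and '!'. Nothing in the loop depends on the base being a power of two.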

Post by **gblues** » Fri Oct 23, 2009 11:38 pm

I understand what it's doing--base 94 is very efficient at encoding binary numbers into ASCII characters (94 printable characters)--but I wasn't understanding why the config loader was using it.

After analyzing my config file and the code, I've come to the conclusion that the code is simply doing more work than it needs to and it almost works by accident.

The config writing routine RLE-encodes the leading nulls in the string, adds a slash, then encodes the actual string, and then RLE encodes any remaining nulls in the allocated memory.

The config reading routine skips to the slash written by the writer, starts loading the data from the file into the start of the string, and then proceeds to load any remaining data from the file.

In other words, the leading nulls that the config writer saves to the file get completely discarded by the reader!

Edit: Ugh, switching the paths to wchar_t is a royal pain in the butt. I'm not coherent enough to describe everything, but it's a pain.

At least I have a test case. I created a user on my vista PC using japanese characters and confirmed that my custom zsnes croaks on it; so if I can make it not croak on the test case and still work on my normal profile, I'll be good to go.

Post by **grinvader** » Sat Oct 24, 2009 5:59 pm

gblues wrote:and it almost works by accident.

Tsk tsk, what a wild accusation.

gblues wrote:I've come to the conclusion that the code is simply doing more work than it needs to

The downside of maintaining an old rusty dragon with fresh routines instead of wiping everything and restarting from scratch with clean code.
Feel free to pull the latter. ^^
The nulls are needed for empty slots in your quickload menu. You can't skip storing them. We could have spaces instead, but what would we use to know when a file is actually called " " ? No issue when you store nulls.
The staticity of the array was inherited from the older routine(s).
Changing the design is much, much more painful than having a few additional characters in the config. Plus Nach went all the way to add RLE to minimise the clutter further.
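To make the storage scheme concrete, here is a simplified sketch of packing a fixed-size array in which empty slots are NUL bytes. It is not the ZSNES routine. Run lengths are written in decimal instead of base 94, and text is not passed through encode_string(), so a name that starts with '0' or contains a backslash would be ambiguous here. pack_nul_runs is a hypothetical name.

```c
#include <stdio.h>
#include <string.h>

/* Pack buf[0..len) into out as segments separated by '\\'.
   A run of non-NUL bytes is copied as-is; a run of k NUL bytes
   becomes "0k" (k in decimal). Returns 0 on success, -1 if out
   is too small. Assumption: no escaping of the text segments. */
int pack_nul_runs(const char *buf, size_t len, char *out, size_t out_size)
{
    size_t i = 0, o = 0;

    if (out_size == 0)
        return -1;
    out[0] = '\0';

    while (i < len) {
        size_t start = i;
        int n;

        if (buf[i] != '\0') {
            while (i < len && buf[i] != '\0')   /* literal text */
                i++;
            n = snprintf(out + o, out_size - o, "%.*s",
                         (int)(i - start), buf + start);
        } else {
            while (i < len && buf[i] == '\0')   /* run of NULs */
                i++;
            n = snprintf(out + o, out_size - o, "0%lu",
                         (unsigned long)(i - start));
        }
        if (n < 0 || (size_t)n >= out_size - o)
            return -1;                          /* output truncated */
        o += (size_t)n;

        if (i < len) {                          /* more segments follow */
            if (o + 1 >= out_size)
                return -1;
            out[o++] = '\\';
            out[o] = '\0';
        }
    }
    return 0;
}
```

For a 10-byte buffer holding "abc", three NULs, "def" and one NUL, this produces abc\03\def\01. Leading NULs come out as a leading count, so a reader has to honor that count to put the text back at the right offset.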

Good luck with your wchar stuffs.

Post by **Squall_Leonhart** » Sun Oct 25, 2009 4:33 am

gblues wrote:
Squall_Leonhart wrote:heres an idea... don't use program files for emulators.

omg! the logic is astounding.
Let's pretend you've got a windows system with a user account for you and your brother. You both want to use ZSNES. Where do you suggest putting it so that both you and your brother can play it?

Yeah. Program Files exists for a reason.

Further, what if you like to use the hq2x filter, but your brother prefers normal2x with scanlines? If you throw the 1.51 release into C:\Program Files (and rig it so C:\Program Files\ZSNES is writeable, which is a BAD idea), you and your brother are going to be clobbering each other's settings each time you load ZSNES.

My patch allows you to each have your own configuration so you can have your hq2x and your brother can have his scanlines and never the twain shall meet.

Since it also stores the SRM/ZST files in the user profile, you don't have to worry about your brother "accidentally" saving over your games/save states.

If you want to go live on your own little planet where users run applications from their home directory and your PC is littered with 10 copies of ZSNES, well the official release is perfect for you. If you're like me and like to put applications where they belong and like them to work like they are supposed to, well the link to my custom build is a couple posts above.

Simple, %appdata%\zsnes\
will automatically write to the current users roaming profile.

Post by **byuu** » Sun Oct 25, 2009 7:03 pm

grinvader wrote:The nulls are needed for empty slots in your quickload menu. You can't skip storing them. We could have spaces instead, but what would we use to know when a file is actually called " " ? No issue when you store nulls.

Oh man, nulls would wreck my lousy C-style string parser, heh.
Could always store characters you won't see in filenames like \b, \t or /
Of course, you already have a parser and it works just fine, so ...

Post by **gblues** » Mon Oct 26, 2009 6:55 am

Squall_Leonhart wrote:Simple, %appdata%\zsnes\
will automatically write to the current users roaming profile.

Good job reading the thread.

The problems with using %appdata% have already been covered.

Post by **gblues** » Mon Oct 26, 2009 7:11 am

Just a brief status update.

I've completed my first pass into wchar_t territory. This basically boils down to changing the zsnes global path variables (ZCfgPath, ZStartPath, etc) to wchar_t, watching what breaks, and making changes/writing new routines as necessary; wash, rinse, repeat until the compiler errors/warnings go away.

On the plus side, I've succeeded in getting ZSNES to store its configuration files in the user profile directory, even with a profile containing non-ANSI characters.

I ended up writing my own wchar_t version of certain functions, which are undoubtedly buggy, evidenced by the fact that zsnes crashes as soon as you try to access the Open dialog.

I am in the process of debugging this, although this isn't always so straightforward. I love finding calls to malloc() hidden in obscure macros!

It's time for bed now; I'll update more once I've completed debugging.

Post by **kode54** » Mon Oct 26, 2009 6:59 pm

You could have done some wchar_t <-> UTF-8 translation to keep the strings as 8-bit, but then it probably would have been harder to find the places where those strings are used.
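A rough sketch of the wide-to-UTF-8 direction, assuming the input is UTF-16 code units as Windows wchar_t strings are. It uses uint16_t instead of wchar_t so it behaves the same on platforms where wchar_t is 32 bits. utf16_to_utf8 is a hypothetical helper, not a Windows or ZSNES function.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert in[0..in_len) from UTF-16 to a NUL-terminated UTF-8 string.
   Returns the number of bytes written (excluding the NUL), or
   (size_t)-1 on an unpaired surrogate or if out is too small. */
size_t utf16_to_utf8(const uint16_t *in, size_t in_len, char *out, size_t out_size)
{
    static const unsigned char lead[] = { 0, 0, 0xC0, 0xE0, 0xF0 };
    size_t o = 0;

    if (out_size == 0)
        return (size_t)-1;

    for (size_t i = 0; i < in_len; i++) {
        uint32_t cp = in[i];

        if (cp >= 0xD800 && cp <= 0xDBFF) {         /* high surrogate */
            if (i + 1 >= in_len || in[i + 1] < 0xDC00 || in[i + 1] > 0xDFFF)
                return (size_t)-1;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (uint32_t)(in[i + 1] - 0xDC00);
            i++;
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {  /* stray low surrogate */
            return (size_t)-1;
        }

        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (o + need >= out_size)                   /* keep room for the NUL */
            return (size_t)-1;

        if (need == 1) {
            out[o++] = (char)cp;
        } else {
            /* lead byte, then (need - 1) continuation bytes of 6 bits each */
            out[o++] = (char)(lead[need] | (cp >> (6 * (need - 1))));
            for (size_t k = need - 1; k-- > 0; )
                out[o++] = (char)(0x80 | ((cp >> (6 * k)) & 0x3F));
        }
    }
    out[o] = '\0';
    return o;
}
```

The reverse conversion would be applied just before calling wide functions such as _wfopen, so the rest of the program keeps handling plain char strings.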

Post by **funkyass** » Mon Oct 26, 2009 7:14 pm

how well does the GUI handle non-ascii characters anyway?

Post by **gblues** » Mon Oct 26, 2009 7:22 pm

kode54 wrote:You could have done some wchar_t <-> UTF-8 translation to keep the strings as 8-bit, but then it probably would have been harder to find the places where those strings are used.

I chose to make everything wchar_t for a couple different reasons:

1) SHGetKnownFolderPath() resolves to a wchar_t, so using wchar_t is convenient
2) the code is littered with calls to functions like strrchr() that apparently can act unexpectedly with UTF-8 encoding (at least according to some resources I looked at). Switching strrchr() for wcsrchr() would keep the expected behavior intact.
3) I'm writing this for my convenience, so I'm not really concerned about breaking the DOS or Linux builds.

At this point I believe I have completed my work. The only thing that you can't do is customize the paths to non-ANSI paths or browse non-ANSI paths in the Load dialog. I might be able to do something about the latter, too.

If anyone would like to check it out, I've refreshed the zipfile from my earlier post. Here's the link again:

http://www.strong-consultants.com/zsnes2008.zip

Post by **grinvader** » Mon Oct 26, 2009 8:14 pm

funkyass wrote:how well does the GUI handle non-ascii characters anyway?

It doesn't.
There's a translation lookup table that matches bytes to 'letters' (actually any symbol that was needed, like the minimize/maximise buttons in the top right). Unmapped bytes are empty, so the render farts out and resumes the execloop (which jumps right back into the gui, since guiexit isn't set and some other stuff is).

As I already said, we're not using any standard ASCII.

Post by **kode54** » Tue Oct 27, 2009 3:44 am

gblues wrote:1) SHGetKnownFolderPath() resolves to a wchar_t, so using wchar_t is convenient

Yes, but the point was that you can translate the wchar_t received from functions like that or _wfindfirst to UTF-8, then translate from UTF-8 back to wchar_t for functions like _wfopen.

gblues wrote:2) the code is littered with calls to functions like strrchr() that apparently can act unexpectedly with UTF-8 encoding (at least according to some resources I looked at). Switching strrchr() for wcsrchr() would keep the expected behavior intact.

strrchr will behave as long as you're searching for lower ASCII characters, since every byte of a multi-byte UTF-8 sequence has the high bit set and so can never be mistaken for lower ASCII. If you actually need to search for non-ASCII UTF-8 characters, then you will need to use strstr or similar.
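A small sketch of that point: finding the last path separator in a UTF-8 path with plain strrchr. utf8_basename is a hypothetical helper. The guarantee is specific to UTF-8. In a legacy multi-byte code page such as Shift-JIS, a trail byte can equal 0x5C (backslash), and that is the case where byte-wise searches go wrong.

```c
#include <string.h>

/* Return a pointer to the file-name part of a UTF-8 path, i.e. the
   text after the last '\\' or '/'. Safe for UTF-8 because bytes of
   multi-byte sequences are all >= 0x80 and never match a separator. */
const char *utf8_basename(const char *path)
{
    const char *sep = strrchr(path, '\\');
    const char *fwd = strrchr(path, '/');

    if (fwd != NULL && (sep == NULL || fwd > sep))
        sep = fwd;                      /* keep whichever comes last */
    return sep != NULL ? sep + 1 : path;
}
```

Given a path like C:\Users\日本\zsnes.cfg encoded as UTF-8, this returns a pointer to zsnes.cfg without ever landing inside one of the three-byte characters.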

gblues wrote:3) I'm writing this for my convenience, so I'm not really concerned about breaking the DOS or Linux builds.

Ah, yes. Okay, do carry on, then. As long as the original code paths remain intact for the other platforms, it's fine. It's probably better to use wchar_t for Windows anyway, since NT uses that for Unicode everywhere.

gblues wrote:At this point I believe I have completed my work. The only thing that you can't do is customize the paths to non-ANSI paths or browse non-ANSI paths in the Load dialog. I might be able to do something about the latter, too.

And you can do something about the former by at least using UTF-8 strings for the custom paths in the configuration file. It would also help to denote a UTF-8 text file by inserting a UTF-8 BOM at the start.
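For the BOM suggestion, a reader only needs to check the first three bytes of the file. A minimal sketch, with utf8_bom_length as a made-up name; the byte values EF BB BF are the UTF-8 encoding of U+FEFF.

```c
#include <stddef.h>
#include <string.h>

/* Return 3 if buf starts with the UTF-8 byte order mark (EF BB BF),
   otherwise 0. The caller skips that many bytes before parsing. */
size_t utf8_bom_length(const unsigned char *buf, size_t len)
{
    static const unsigned char bom[3] = { 0xEF, 0xBB, 0xBF };

    return (len >= 3 && memcmp(buf, bom, 3) == 0) ? 3 : 0;
}
```

A config loader could treat a file with the mark as UTF-8 and fall back to the old byte-for-byte behavior for files without it, which keeps existing configs readable.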

Post by **gblues** » Tue Oct 27, 2009 2:09 pm

Well, if there's interest in a 1.52 or 1.6 release with my code, I can get started on un-breaking the other builds. I haven't yet done anything regarding the configuration file location discussed earlier in the thread, but that's a simple enough modification now that all the changes are in place. I would also probably split zpath.c into OS-specific files so it doesn't become a mess of #ifdefs.

If there's no interest in merging the changes with mainline code, I'm OK with being relegated to a customized build for people who want those features.

Post by **Squall_Leonhart** » Tue Oct 27, 2009 3:28 pm

gblues wrote:
Squall_Leonhart wrote:Simple, %appdata%\zsnes\
will automatically write to the current users roaming profile.
Good job reading the thread. The problems with using %appdata% have already been covered.

%appdata% is consistent regardless of language.

Post by **gblues** » Tue Oct 27, 2009 3:45 pm

Squall_Leonhart wrote:
gblues wrote:
Squall_Leonhart wrote:Simple, %appdata%\zsnes\
will automatically write to the current users roaming profile.
Good job reading the thread. The problems with using %appdata% have already been covered.
%appdata% is consistent regardless of language.

UTF-16 characters don't get translated into the %appdata% variable. This is why the first version of my modification crashed when launched by a user with UTF-16 characters in the username. I actually tested this--if resolving the environment variable had been sufficient, I wouldn't have bothered spending the last week hacking wchar_t datatypes into the path resolution routines.

%appdata% is fine to use from the command prompt or for shortcuts, anything relying on Windows to do their path resolution for them; but for an application that is doing its own path resolution such as ZSNES, it is not appropriate.

Post by **grinvader** » Wed Oct 28, 2009 8:35 pm

You skipped the part where UTF-16 sucks tremendous amounts of various humiliatingly filthy animal genitalia.

Post by **byuu** » Wed Oct 28, 2009 8:40 pm

And guess what Qt uses! :D
Get to suffix every last QString with .toUtf8().constData();

UTF-32 I can at least understand.

UTF-16 is the biggest joke ever. It fails to offer O(1) indexing or O(1) size == number of characters, thanks to surrogate pairs (they're not all garbage either, many useful Chinese characters are surrogate pairs), and it fails to offer backward compatibility with the existing char* strings that make up billions of lines of legacy codebases. It's quite literally the worst possible choice. UTF-64 would make more sense.
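The surrogate-pair complaint can be seen in a few lines. This sketch counts code points in a UTF-16 buffer. It has to walk the whole buffer because a character may take one or two 16-bit units. utf16_code_points is a hypothetical name, and unpaired surrogates are simply counted as one character each.

```c
#include <stddef.h>
#include <stdint.h>

/* Count code points in s[0..units). A high surrogate (D800..DBFF)
   followed by a low surrogate (DC00..DFFF) counts as one character. */
size_t utf16_code_points(const uint16_t *s, size_t units)
{
    size_t count = 0;

    for (size_t i = 0; i < units; i++) {
        count++;
        if (s[i] >= 0xD800 && s[i] <= 0xDBFF &&
            i + 1 < units &&
            s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF)
            i++;                        /* skip the low half of the pair */
    }
    return count;
}
```

For the four units 0041 D842 DFB7 0042 ("A", U+20BB7, "B") the count is 3, so the unit count and the character count disagree as soon as one character outside the 16-bit range appears.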

All because Microsoft can't do a damn thing the same way anyone else does it.

Post by **Thristian** » Thu Oct 29, 2009 9:26 am

byuu wrote:All because Microsoft can't do a damn thing the same way anyone else does it.

Well, that's not entirely true - back in the early days of Unicode, nobody would have agreed to anything as space-wasteful as UTF-32, and everybody thought that for sure if they just squooshed together various redundant Han glyphs, there'd be more than enough room for everybody in a 16-bit charset. On the strength of that, they signed up all kinds of Important, Enterprise-Ready, Legacy-Free systems like Microsoft's in-development Win32 API, Sun Microsystems' Java and NeXT's NeXTStep OS.

It wasn't until Unicode 3.0 or maybe 3.1 that the Unicode Consortium had to shuffle its feet apologetically and admit that 16 bits wouldn't actually be enough after all, and all these shiny, clean, legacy-free systems would actually be eternally tainted by the horrible hack that is UTF-16.

It's kind of hilarious that the Unix ecosystem was prevented from jumping on the UCS-2 bandwagon by decades of legacy code, and hence was the only one to switch to the much-nicer UTF-8 when it was invented - everyone else was now stuck with UTF-16.

Post by **byuu** » Thu Oct 29, 2009 11:34 am

Yeah, I am actually familiar with the history. I just like to bash Microsoft, sorry.

Still, they should make their *A API variants take a UTF-8 codepage and backport that into libc, etc.

Regardless of what happened, their system is now the only major, modern desktop OS to not use UTF-8.

Post by **funkyass** » Thu Oct 29, 2009 1:17 pm

backwards compatibility is the ass-raping monkey on MS's back.