Fast bit mixing algorithm for $2100 brightness emulation

byuu · Post by **byuu** » Thu Aug 10, 2006 10:28 pm

Ok, $2100 INIDISP SNES register. Low four bits control screen brightness. 0 = black, 15 = normal, 1 - 14 is a percentage brightness of the screen.
You would run each frame from 15 to 0 to perform a fadeout, for example.

Right now, I emulate this using a 1MB lookup table. I'm thinking that's probably bad for cache, and a waste of memory, so I want to convert this into fast math runtime versions instead. Below is what I have so far.

Note that the below code only adjusts one pixel. In the actual emulator, I plan to only cast ltable / use the switch statement once, and implement the actual for loop through all of the pixels inside of each switch.
This is just for an example :)

Code: Select all

  r = bgr555_color;
#ifdef LUT
extern uint16 light_table[16][32768]; // 16 levels * 32768 colors * 2 bytes = 1MB
uint16 *ltable = light_table[b];
  r = ltable[r];
#else
  #define p1 ((r & 0x4210) >> 4)
  #define p2 ((r & 0x6318) >> 3)
  #define p4 ((r & 0x739c) >> 2)
  #define p8 ((r & 0x7bde) >> 1)

  //0 10000 10000 10000 = 4210
  //0 11000 11000 11000 = 6318
  //0 11100 11100 11100 = 739c
  //0 11110 11110 11110 = 7bde

  switch(b) {
  case  0: r = 0;            break;
  case  1: r = p1;           break;
  case  2: r = p2;           break;
  case  3: r = p2 + p1;      break;
  case  4: r = p4;           break;
  case  5: r = p4 + p1;      break;
  case  6: r = p4 + p2;      break;
  case  7: r = p4 + p2 + p1; break;
  case  8: r = p8;           break;
  case  9: r = p8 + p1;      break;
  case 10: r = p8 + p2;      break;
  case 11: r = p8 + p2 + p1; break;
  case 12: r = p8 + p4;      break;
  case 13: r = p8 + p4 + p1; break;
  case 14: r = p8 + p4 + p2; break;
  default: break;
  }

  #undef p1
  #undef p2
  #undef p4
  #undef p8
#endif

Now then, 15 is the most used, and 0 the next most used. 1-14 are really only used during fading, or maybe for pause screens. I believe 0 and 15 are ideally fast enough (and way faster than the lookup table), with only a single assignment operation.
I also believe the rest are fairly quick. The only ones I am really concerned about are 7, 11, 13 and 14. Those all require 3 ands, 3 shifts, and two adds each, plus the end assignment operator.
I'm specifically looking for improved algorithms for these. But perhaps 3, 5, 6, 9, 10 and 12 (the two-term cases) could be sped up as well somehow?
Unfortunately, I'm only aware of how to do simplistic "divide by powers of two" tricks, so I'm pretty limited on what I can improve here :/
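
As a rough sketch of the arithmetic in the switch above: the four bits of b act as weights of 1/2, 1/4, 1/8 and 1/16, so each case is a sum of masked shifts. The name `shift_add_pixel` and the `uint16` alias are assumptions made for this example, not bsnes code. Because every term is rounded down on its own, pure white at b = 14 comes out as 25 per channel (15 + 7 + 3) rather than 31.

```cpp
#include <cstdint>

using uint16 = std::uint16_t;  // assumed to match the thread's uint16

// Hypothetical helper: scale one BGR555 pixel by roughly b/16 with masked shifts.
// b == 15 leaves the pixel untouched, like the default case in the post.
constexpr uint16 shift_add_pixel(uint16 c, unsigned b) {
  return b >= 15 ? c : static_cast<uint16>(
      ((b & 8) ? (c & 0x7bde) >> 1 : 0) +  // 1/2 of each channel
      ((b & 4) ? (c & 0x739c) >> 2 : 0) +  // 1/4
      ((b & 2) ? (c & 0x6318) >> 3 : 0) +  // 1/8
      ((b & 1) ? (c & 0x4210) >> 4 : 0));  // 1/16
}
```

In a real renderer the test on b would sit outside the pixel loop, as described in the post; the per-pixel form here is only meant to make the rounding easy to check.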

Also, one problem is that with only 0-15, we're technically missing one step. eg
0 = 0% brightness
8 = 50% brightness
12 = 75% brightness
15 = 100% brightness

With my lookup table, that's much easier to account for. I can just use 1 / 15 for steps instead of 1 / 16. That won't be quite as easy for this.
So... perhaps the lookup table really is the best solution for this puzzle? Or maybe split the table in two, increase the math requirements but decrease mem / cache requirements?
ex: p = light_table[p >> 10] | light_table_gr[p & 0x3ff];
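
A minimal sketch of the split-table idea, assuming one pair of tables per brightness level and 1/15 steps rounded down. `SplitTable`, `scale5`, `build_split_table` and `apply_split` are hypothetical names, and the rounding choice is an assumption. One pair holds 32 + 1024 entries, so all 16 levels would take about 33 KB instead of 1 MB, at the cost of two lookups and an OR per pixel.

```cpp
#include <cstdint>

using uint16 = std::uint16_t;  // assumed to match the thread's uint16

// Hypothetical layout: one pair of tables for a single brightness level.
struct SplitTable {
  uint16 blue[32];   // indexed by bits 10-14, result already shifted into place
  uint16 gr[1024];   // indexed by bits 0-9 (green and red together)
};

// Scale one 5-bit channel by b/15, rounding down (assumed rounding).
constexpr unsigned scale5(unsigned c, unsigned b) { return c * b / 15; }

constexpr SplitTable build_split_table(unsigned b) {
  SplitTable t{};
  for (unsigned i = 0; i < 32; ++i)
    t.blue[i] = static_cast<uint16>(scale5(i, b) << 10);
  for (unsigned i = 0; i < 1024; ++i)
    t.gr[i] = static_cast<uint16>((scale5(i >> 5, b) << 5) | scale5(i & 31, b));
  return t;
}

// Two small lookups replace the single 32768-entry lookup.
constexpr uint16 apply_split(const SplitTable& t, uint16 p) {
  return static_cast<uint16>(t.blue[(p >> 10) & 31] | t.gr[p & 0x3ff]);
}
```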

blargg · Post by **blargg** » Fri Aug 11, 2006 2:35 am

First off, is the code you show doing the correct thing for values 1-14? I ask because 14/15 = 93.3%, while your code calculates roughly 87.5%. Each fractional component you calculate (1/2, 1/4, 1/8, 1/16) is rounded down, so adding all these together increases the error component. How costly is a simple multiply of the components (with guard bits inserted), with separate code to handle brightness = 15 and possibly 0?

Code: Select all

const int factor = (brightness << 5) / 15;

int t = (rgb << 16 & 0x03E00000) | (rgb & ~0x03E0);
t = t * factor;
rgb = (t >> (16 + 5) & 0x03E0) | (t >> 5 & ~0x03E0);
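
For reference, a self-contained sketch of the same guard-bit multiply. The function name `scale_guarded` and the type aliases are assumptions for the example. It differs from the snippet above in one detail: the final mask is 0x7C1F rather than ~0x03E0, so the result is clean even when it is not stored back into a 16-bit variable.

```cpp
#include <cstdint>

using uint16 = std::uint16_t;  // assumed to match the thread's uint16
using uint32 = std::uint32_t;

// Scale a BGR555 pixel by roughly brightness/15 with a single multiply.
// Green is moved up to bits 21-25 so that each channel has spare bits
// above it to absorb the product with a factor of at most 32.
constexpr uint16 scale_guarded(uint16 rgb, unsigned brightness) {
  const uint32 factor = (brightness << 5) / 15;  // 0..32, in 1/32 units
  const uint32 spread = ((uint32(rgb) << 16) & 0x03E00000u) | (rgb & 0x7C1Fu);
  const uint32 t = spread * factor;
  // Shift each product down by 5 and drop the fractional bits.
  return static_cast<uint16>(((t >> 21) & 0x03E0u) | ((t >> 5) & 0x7C1Fu));
}
```

Note that the factor itself is rounded down (14 gives 29/32), so the result can be slightly darker than an exact brightness/15 scale.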

I haven't looked at the bsnes source much, but are you aware of the insanely useful aspect of constant integer parameters to a function template? For example:

Code: Select all

template<bool general>
void process_pixels( ... )
{
	... lots of loop handling
	if ( general )
	{
		t = (rgb << 16 ... etc
	}
	else
	{
		rgb = rgb >> 1 & 0x3DEF;
	}
	... more loop handling
}

if ( brightness == 7 )
	process_pixels<false>( ... )
else
	process_pixels<true>( ... )

This kind of thing makes special cases easy to add without duplicating a lot of code or using macros.
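
A compilable sketch of that pattern, under some assumptions: `shade`, `apply_brightness` and the parameter lists are invented for the example, and the choice of level 7 as the "halve everything" case simply follows the snippet above (whether 7 is an exact halving on hardware is an open question here). Since `general` is a compile-time constant, each instantiation should keep only one of the two branches.

```cpp
#include <cstddef>
#include <cstdint>

using uint16 = std::uint16_t;  // assumed to match the thread's uint16

// Per-pixel work; factor is in 1/32 units (0..32) and unused when !general.
template<bool general>
constexpr uint16 shade(uint16 c, unsigned factor) {
  return general
    ? static_cast<uint16>((((c & 0x7C1Fu) * factor >> 5) & 0x7C1Fu) |
                          (((c & 0x03E0u) * factor >> 5) & 0x03E0u))
    : static_cast<uint16>((c >> 1) & 0x3DEF);  // halve every channel
}

// The loop handling is written once and shared by both variants.
template<bool general>
void process_pixels(uint16* px, std::size_t n, unsigned factor) {
  for (std::size_t i = 0; i < n; ++i)
    px[i] = shade<general>(px[i], factor);
}

// The runtime decision is made once per buffer, not once per pixel.
inline void apply_brightness(uint16* px, std::size_t n, unsigned brightness) {
  if (brightness == 7)
    process_pixels<false>(px, n, 0);
  else
    process_pixels<true>(px, n, (brightness << 5) / 15);
}
```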

creaothceann · Post by **creaothceann** » Fri Aug 11, 2006 7:14 am

Are these brightness levels confirmed on the hardware? It'd be easy to set backdrop & palette to white, or maybe to some color stripes, and change Brightness via HDMA.

Charles MacDonald mentioned in snestech.txt several levels of "black". Maybe the SNES just does "RGB * ((Brightness + 1) / 16)" or something similar.

grinvader · Post by **grinvader** » Fri Aug 11, 2006 2:49 pm

From IRC:

Code: Select all

<byuu>  fascinating
<byuu>  $2100 = 0 still shows a picture
<byuu>  but its soooooooooooooooooooooooooooooooooooo dark
<byuu>  i have to max out brightness + contrast to see a shadow of it

So I'm gonna assume it's indeed (brightness+1)/16, in which case:

Code: Select all

  r = bgr555_color;
#ifdef LUT
extern uint16 light_table[16][32768]; // 16 levels * 32768 colors * 2 bytes = 1MB
uint16 *ltable = light_table[b];
  r = ltable[r];
#else
  if (b!=15) // speedhack lol
  r = adjust_bright(b+1); // 0 = 6.25%, 15 = 100%
#endif

#ifndef LUT
inline uint16 adjust_bright(uint8 b) {
  return (((((bgr555_color&0x7c1f)*b)>>4)&0x7c1f)+((((bgr555_color&0x3e0)*b)>>4)&0x3e0));
}
#endif

More precise than the mask+additions, and also doesn't take the room of a LUT.
You want to increment 'b' out of the loop for multiple pixels for maximum speed, kinda like:

Code: Select all

if (b!=15) {
  b++;
  for (<loop stuff>)pixel.r = adjust_bright(b);
}

Or something.
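
A minimal runnable sketch of the same idea, assuming the (brightness + 1) / 16 scale guessed at above. Unlike the snippet in the post, it takes the pixel as a parameter instead of reading bgr555_color, and `fade_buffer` is a hypothetical name for the outer loop.

```cpp
#include <cstddef>
#include <cstdint>

using uint16 = std::uint16_t;  // assumed to match the thread's uint16

// scale is brightness + 1, so 1..16; each channel becomes (channel * scale) / 16.
// Red and blue are multiplied together, green separately, so no product overlaps.
constexpr uint16 adjust_bright(uint16 c, unsigned scale) {
  return static_cast<uint16>(((((c & 0x7c1fu) * scale) >> 4) & 0x7c1fu) +
                             ((((c & 0x03e0u) * scale) >> 4) & 0x03e0u));
}

// The brightness check and the increment happen once, outside the loop.
inline void fade_buffer(uint16* px, std::size_t n, unsigned brightness) {
  if (brightness == 15) return;  // full brightness: nothing to do
  const unsigned scale = brightness + 1;
  for (std::size_t i = 0; i < n; ++i)
    px[i] = adjust_bright(px[i], scale);
}
```

With this scale, brightness 0 turns pure white into 1 per channel, and any channel value below 16 into 0.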

byuu · Post by **byuu** » Fri Aug 11, 2006 3:33 pm

The problem is that $2100 = 0 shows a really dark picture. It is definitely not an even scale downwards.

I had to turn brightness up to its absolute limit on my TV to just barely see the outline of letters from my test program on the screen. From 0 to 1 causes a huge jump in brightness, and then it's pretty much a smooth gradual increase. Honestly, with only 15-bits of precision, we're still lacking some of the smoothness of the gradient fade as compared to the TV. For example, my test had a greenish background and every other mode has kind of a different hue to the image. Almost seems like the SNES is using YUV/YIQ adjustments at the very end of the renderer, instead of just RGB adjustments.

WolfWings · Post by **WolfWings** » Fri Aug 11, 2006 3:55 pm

byuu wrote:The problem is that $2100 = 0 shows a really dark picture. It is definitely not an even scale downwards.

I had to turn brightness up to its absolute limit on my TV to just barely see the outline of letters from my test program on the screen. From 0 to 1 causes a huge jump in brightness, and then it's pretty much a smooth gradual increase. Honestly, with only 15-bits of precision, we're still lacking some of the smoothness of the gradient fade as compared to the TV. For example, my test had a greenish background and every other mode has kind of a different hue to the image. Almost seems like the SNES is using YUV/YIQ adjustments at the very end of the renderer, instead of just RGB adjustments.

I have to ask... did you test with a direct RGB/VGA hookup from the test-SNES (and if it's a new-style SNES, go to http://www.gamesx.com/rgbadd/snes2rgb.htm for the instructions on how to re-enable the RGB/VGA output), or was this an S-Video/Composite signal you were looking at? At very dark values, you might be running into encoding errors on the chroma encoder chip. Remember, many TVs and video-game encoding chips used to be designed to treat anything less than an overall value of 16 as 0, and anything above 239 as 255. That means that if the math was a simple ((x+1)/16) at the lowest level you could be running into the chroma encoder's scaling+clipping.

byuu · Post by **byuu** » Fri Aug 11, 2006 4:24 pm

I do not have an RGB-modded SNES. Nor can I find anything that will display a SCART signal. Nor can I mod an S/PDIF out. These things would be immensely useful to me, but I have no soldering skills whatsoever, so it isn't going to happen, sadly.

Anyway, if the encoders were scaling / clipping at the outermost edges, then that doesn't explain why the only way I can vaguely see the pure white text at $2100=0 is with maximum screen brightness. In order for that to work, it would have to be sending the smallest possible luminance for said white pixels. Otherwise if it were clipping it downwards to zero, then the screen would still be black, despite the ultra-high brightness setting on my TV, right?

By the way, for what it's worth... obviously if I set bit 7 of $2100 (display disable), the screen truly goes blank. However, there's still a grayish hue from the insanely high brightness of the TV at the time. I can definitely tell there's a difference between B=0,D7=0 and B=0,D7=1.

WolfWings · Post by **WolfWings** » Fri Aug 11, 2006 4:39 pm

byuu wrote:I do not have an RGB-modded SNES. Nor can I find anything that will display a SCART signal. Nor can I mod an S/PDIF out. These things would be immensely useful to me, but I have no soldering skills whatsoever, so it isn't going to happen, sadly.

As long as you have an original-era SNES, the only thing you need is a custom cable. The RGB signal is present on the pins on the back. They disabled the RGB and S-Video signals on the port on the back of the SNES in the second-gen models, but the first-gen models don't need any sort of mods to enable RGB. In fact, the NTSC/US models require a set of 220uF capacitors on the R, G, and B signal lines to bring the signals they output back to proper VGA spec; the PAL models (which I'm guessing you have) can be wired directly into a VGA port.

So if you have a spare SNES video cable or equivalent lying around, you can splice it into a cheap VGA cable. No soldering needed, just some care and electrical tape.

Pinout is here if you're curious.

byuu wrote:Anyway, if the encoders were scaling / clipping at the outermost edges, then that doesn't explain why the only way I can vaguely see the pure white text at $2100=0 is with maximum screen brightness. In order for that to work, it would have to be sending the smallest possible luminance for said white pixels. Otherwise if it were clipping it downwards to zero, then the screen would still be black, despite the ultra-high brightness setting on my TV, right?

Actually if the encoders are scaling+clipping, it may be doing so in such a way that the ultra-bright 'white' text is ending up just outside that clipping range at b=0 if the math is ((b+1)/16) which would reduce white to 16, then the chroma encoder could reduce 16 to 1. I.E. (((16-15)*255)/223) would still show a difference between 16 and 0, but the difference would now be 1 and 0 in effect.
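
The arithmetic in that last sentence can be written out as a small function. This only illustrates the formula quoted in the post, with clamping at both ends added as an assumption; `encoder_out` is a made-up name and no real encoder chip is being modelled.

```cpp
// Hypothetical model of the scaling+clipping step: inputs of 15 or less
// become 0, everything above is stretched by 255/223 and capped at 255.
constexpr int encoder_out(int x) {
  return x <= 15 ? 0
       : ((x - 15) * 255 / 223 > 255 ? 255 : (x - 15) * 255 / 223);
}
```

Under this model an input of 16 comes out as 1 while 15 comes out as 0, which is the "1 and 0" difference described above.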

byuu · Post by **byuu** » Fri Aug 11, 2006 6:43 pm

I use NTSC and have two first-generation Japanese SNES units, the oldest model, and a slightly-newer-but-not-quite-SNES2-yet model.

The only cable for sale is a Multi-out SCART adapter. If it were just a matter of attaching three 220uF capacitors and some fancy rewiring, I'm certain there would be multi-out to VGA adapters on the market, eg at Lik-Sang.

I'm pretty sure you'd need frequency signal encoding hardware and all kinds of other fun stuff to get that to work.

I guess I could have fun blowing stuff up and chop apart a multi-out and VGA extender cable and tape stuff together to see what I get, but I really doubt it'd do any good.

As for the brightness, I don't know... it'd be really nice if we could definitively determine whether the adjustment applies to the RGB color or the YUV/YIQ color. It's most likely RGB, but it was such a huge jump from B=1 to B=0, that I don't see how that same effect could be accomplished with RGB adjustments. Not to mention the gradient fade is way the hell smoother than 15-bit RGB adjustments, even taking floating-point rounding calculations into account. I think I would need RGB888 to smoothly blend as well as the real hardware.

WolfWings · Post by **WolfWings** » Sat Aug 12, 2006 5:19 am

Um... there are SNES->SCART RGB cables available from Lik-Sang. For $3.

http://www.lik-sang.com/info.php?catego ... cts_id=220

WolfWings · Post by **WolfWings** » Sat Aug 12, 2006 5:27 am

Oh, and a final reason you have to turn the brightness up all the way to see 'bright white' at b=0 state: The gamma of the TV. TVs have VERY bad gamma curves, so very dark colours (like 1 versus 0) can require cranking the brightness up enough to get out of the 'black void' state. In fact lots of 'moody' games I turn the brightness all the way up, then adjust the contrast to fit the game, then lower the brightness until I'm comfortable.

byuu · Post by **byuu** » Sat Aug 12, 2006 6:11 am

Um... there are SNES->SCART RGB cables available from Lik-Sang.

...that's what I said :/

"The only cable for sale is a Multi-out SCART adapter."

Again, I live in the US, so I can't get a SCART monitor to use this cable on. I couldn't even find a SCART->VGA adapter for less than several hundred dollars, or I'd even be willing to use one of those.

Anyway, yeah. The gamma curve could be the problem with the TV. Now the question is, should I risk getting lynched and allow b=0 to show an image, or just cater to the norm and show b=0 as solid black?
Since the lowest I can set b=0 is 1 (which in RGB888 becomes #080808 unless I accept a massive speed hit for RGB888 mixing), and along with the absence of a TV-style gamma curve in most emulators, it will be much brighter than a true TV in some cases...
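
To make the #080808 figure concrete, here is a minimal sketch of a BGR555 to RGB888 conversion that widens each channel with a plain left shift. The function names and the 0xRRGGBB output layout are assumptions for the example; red is taken to be the low five bits, as the bgr555 name implies.

```cpp
#include <cstdint>

using uint16 = std::uint16_t;  // assumed to match the thread's uint16
using uint32 = std::uint32_t;

// Widen a 5-bit channel to 8 bits by shifting; the low three bits stay zero.
constexpr uint32 expand5to8(uint32 c5) { return c5 << 3; }

// BGR555 (red in bits 0-4, green 5-9, blue 10-14) to 0xRRGGBB.
constexpr uint32 bgr555_to_rgb888(uint16 c) {
  return (expand5to8(c & 31u) << 16) |
         (expand5to8((c >> 5) & 31u) << 8) |
          expand5to8((c >> 10) & 31u);
}
```

The smallest non-zero channel value, 1, becomes 8, so the darkest non-black gray is #080808. With a plain shift, full white comes out as #F8F8F8 rather than #FFFFFF.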

FitzRoy · Post by **FitzRoy** » Sat Aug 12, 2006 10:02 am

Remembered an old thread which may or may not be useful right now:

http://board.zsnes.com/phpBB2/viewtopic.php?t=5405

Here's a scart to vga adapter for $50. Or do you mean 15 pin vga?

http://www.gefen.com/kvm/product.jsp?prod_id=3257

Also, PM me your address byuu and I'll send you the digisnes when I get the chance. Seriously, it's just sitting here. I don't even have any games anymore. Just tell me, is optical ok or do you need coax?

creaothceann · Post by **creaothceann** » Sat Aug 12, 2006 10:06 am

Maybe someone with a capture card and a copier could run a modified version of the "32K color demo" and post the results here?

tetsuo55 · Post by **tetsuo55** » Sat Aug 12, 2006 7:38 pm

i would but my capture card sucks bigtime, + it doesn't accept rgb scart, which i believe none of the existing capture cards do

d4s · Post by **d4s** » Wed Aug 16, 2006 12:55 pm

creaothceann wrote:Are these brightness levels confirmed on the hardware? It'd be easy to set backdrop & palette to white, or maybe to some color stripes, and change Brightness via HDMA.

quick sidenote:
there seems to be a hardware problem with the sprites when writing to $2100 via hdma. atm, this is just a wild guess and i'll have to do some more testing as to why and when this is caused.

i stumbled upon this when i was hacking a pseudo-3d linescrolling effect into the overworld of breath of fire 2 last week.
because the cgadsub stuff was already in use most of the time, i had to resort to do a gradient via hdma and register $2100.

when i was testing it on the real hardware, i noticed that every time a sprite was overlapping with a scanline the hdma write to $2100 occurs on, one line of the sprite was shown again on the scanline below, but with the wrong horizontal position.

it also happens in the very first level of the kirbys dreamland remake of kirby super deluxe/kirby superstar/kirbys funpak.

i only tested it on 2 pal units so far, will probably investigate it in the next weeks if i have time and do a comprehensive test on all ppu/cpu revisions later.

byuu · Post by **byuu** » Wed Aug 16, 2006 3:28 pm

Perhaps that's the trick to Uniracers. It sets screen brightness to zero right before writing to OAM.

It's still possible that's caused by standard mid-frame OAM address invalidation, but it'd be an interesting test to block the $2100 write and see what happens on real hardware.

creaothceann · Post by **creaothceann** » Wed Aug 16, 2006 4:23 pm

Cool.

In OAM, the first byte of each sprite's data controls the horz. position... I'd guess that this value gets overwritten by the value sent to $2100.

Or maybe Nintendo thought that programmers would change $2100 only in VBlank, and setting $2100.7 resets some internal data before the sprites of the next line are loaded.

Lord Nightmare · Post by **Lord Nightmare** » Tue Oct 10, 2006 11:52 pm

Sorry for necroposting:
d4s wrote:
>when i was testing it on the real hardware, i noticed that every time a sprite was overlapping
>with a scanline the hdma write to $2100 occurs on, one line of the sprite was shown again on
>the scanline below, but with the wrong horizontal position.

This occurs in kirby superstar too, in certain parts of the game (notably in the basement, down the elevators of the huge castle in "the great cave crusade") when you have a partner (particularly evident with the partner "bonkers" because his sprite is so huge)

a big 'blot' of garbage sprite data appears on the lines matching/below the sprite but at a horizontal position near the left of the screen. This is on an NTSC 1/3/2 SNES, running via s-video.

my *guess* would be that it's a mask bug in whichever ppu handles sprites, in that writes to $2100 are trashing the 'cached' value (in the sprite generator) of the sprite horizontal position, possibly only the high bit.

It may actually be fixed in the SNES2/Newstyle SNES units, which have newer ppu revisions.

Lord Nightmare