Latch timing

byuu

Post by byuu »

I was reading over the NTSC specifications, looking for clues on the dots that are longer than others, and the lines with the extra dot, etc.
The more recent NTSC-II standard states that the refresh rate is actually 59.94hz. I think this will make a difference compared to using 60hz as a base for comparison.
So if there are 525-lines drawn 29.97 times a second, we see that 15734.25 scanlines are drawn each second.
An NTSC image has 487 dots of horizontal resolution. There's something called the Kell factor that defines how fuzzy the dots are to the human eye, so the actual dot width is multiplied by 0.7: 487 * 0.7 = 340.9 visible dots/scanline. However, since a TV has an aspect ratio of 4:3, we need to reduce the horizontal resolution by 25%: 340 * .75 = 255 visible pixels/scanline. The rest make up the horizontal blank period.
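The arithmetic in this paragraph can be checked with a few lines of Python. This is only a sketch of the post's own back-of-the-envelope figures; the constant names are illustrative and the 0.7 Kell factor is the approximate value quoted above, not an exact constant.

```python
# Back-of-the-envelope NTSC figures, using the rounded values from the post.
LINES_PER_FRAME = 525
FRAMES_PER_SECOND = 29.97   # rounded frame rate used in the post
NTSC_DOTS = 487             # horizontal resolution quoted in the post
KELL_FACTOR = 0.7           # approximate value, assumed exact here
ASPECT_SCALE = 0.75         # the post's 25% reduction for 4:3

lines_per_second = LINES_PER_FRAME * FRAMES_PER_SECOND
dots_per_line = NTSC_DOTS * KELL_FACTOR
visible_pixels = dots_per_line * ASPECT_SCALE

print(f"{lines_per_second:.2f} scanlines/second")
print(f"{dots_per_line:.1f} dots/scanline")
print(f"{visible_pixels:.3f} visible pixels/scanline")
```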
Now we know that the smallest unit of measurement for video positioning is the latches, which fall somewhere between 340 and 341 dots/scanline.
My take on this is that the SNES takes the Kell factor into account before returning the dot position to the latches. I also believe that the SNES 'fakes' these positions. It's obvious that a true NTSC image does not begin at dot 0, 0. There is blanking on both the top and bottom, and left and right of the screen. The actual image displayed is centered in this field. I think the SNES compensates for this for ease of programming.
Now that we know the smallest unit of measurement for one horizontal line in dots is 340/341 dots, and for master cycles is 1360/1364, we can see that this is far lower than the horizontal scanline hz frequency.
As seen here:
15734.25hz / 340dots = 46.28hz/dot
15734.25hz / 341dots = 46.14hz/dot

46.28hz/dot * 340dots/scanline = 15735.20hz/scanline
46.14hz/dot * 341dots/scanline = 15733.74hz/scanline

The hz<->dot frequency is not accurate enough to perfectly compensate for the NTSC timing. Therefore, I believe that the extra dots we experience sometimes are a result of the SNES 'catching-up' for the missed hz from other lines.

I have not, however, been successful in coming up with a formula to determine which scanlines obtain the extra dots. I'm having a bit of trouble with the fact that I don't know exactly how many master cycles there really are per scanline, and I don't know how many hz = one master cycle.

I wanted to post this to see if anyone caught anything from the above, or could give me the hz/master cycle ratio (with as many decimal places as possible) so that I can throw some ideas around.

Edit: Also, given that the horizontal resolution is ~255 pixels, I believe the lost pixel from the accepted 256 pixels is pixel #0. It's been mentioned before that hblank is active here. This could be why. Keeping in mind that the kell factor is not -exactly- 0.7..., and that I rounded down on the aspect ratio (340.9 * .75 = 255.675)

Edit 2: Got my cycle timings, thanks ie.
master clock cycle = 21,477,270.00hz
dot clock cycle = 5,369,317.50hz
Will post again later, then...
byuu

Post by byuu »

Ok, taking into account the length of a dot (1.0 / 5369317.5), and the length of a scanline (1.0 / 15750.0), I wrote a loop that would output the number of dots for all 525 lines.
The test returns:
scanline 0 [ 0]: dot count = 341
scanline 1 [ 2]: dot count = 341
scanline 2 [ 4]: dot count = 341
scanline 3 [ 6]: dot count = 341
scanline 4 [ 8]: dot count = 341
scanline 5 [ 10]: dot count = 341
scanline 6 [ 12]: dot count = 341
scanline 7 [ 14]: dot count = 341
scanline 8 [ 16]: dot count = 341
scanline 9 [ 18]: dot count = 341
scanline 10 [ 20]: dot count = 340

...and repeats. The 341 dots matches what we know, as anomie says $143/$147 are 1.5dots long (how did you determine this, btw, anomie? just curious). But if this is adhering to NTSC timing, then every 11th scanline should be missing a dot, not just scanline 240.
The timing itself seems to be exactly 60hz, and not 59.94hz. Using 59.94hz will bump the dots per line to 342/343, so that's out. I have no idea how this works since the NTSC-color specification -everywhere- says it's 59.94hz.
Code is here : http://setsuna.the2d.com/files/ntsc_notes.zip
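The linked archive is not reproduced here. As a rough illustration only, the following Python sketch shows one way such a loop could be written: it counts how many dot boundaries fall inside each scanline, using the 5,369,317.5hz dot clock and the 15,750hz line rate quoted above. The accounting rule (a dot belongs to the line in which it starts) and the function name are assumptions, not the original code.

```python
from fractions import Fraction
from math import ceil

DOT_HZ = Fraction(10738635, 2)   # 5,369,317.5 Hz dot clock from the earlier post
LINE_HZ = Fraction(15750)        # 525 lines * 30 frames/s, as used in the post

def dot_counts(lines):
    """Return the number of dots that start inside each scanline.

    Assumption: dot k starts at time k / DOT_HZ and is credited to the
    scanline that contains its start.
    """
    dots_per_line = DOT_HZ / LINE_HZ
    counts = []
    for n in range(lines):
        # Dots started before the end of line n, minus dots started before line n.
        counts.append(ceil((n + 1) * dots_per_line) - ceil(n * dots_per_line))
    return counts

counts = dot_counts(525)
for line, count in enumerate(counts[:11]):
    print(f"scanline {line}: dot count = {count}")
```

Under these assumptions the first eleven lines come out as ten lines of 341 dots followed by one of 340, matching the pattern listed above.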

This also applies only to interlaced mode. Google isn't much help, but I hear that NTSC televisions actually have their own non-interlaced mode built in, and that this is how the SNES can get away with only drawing 262 scanlines per frame, and thus missing the extra scanline.
If we accept that, then each non-interlaced scanline would be 15,720hz long to attain 60hz.
anomie
Lurker
Posts: 151
Joined: Tue Dec 07, 2004 1:40 am

Post by anomie »

Statistical analysis. The idea is, if I latch semi-randomly several thousand times (I believe the total ended up over 14 million), then all dots of the same length should get about the same number of latches. If a dot is 1.5 times as long as the rest, it should tend to get 1.5 times the number of latches. And that's exactly what I found: dots $143 and $147 got about 1.5 times as many latches as the majority of the rest of the dots (the WRAM Refresh dots getting fewer or 0, of course).

As for all the timing... Remember too that the NTSC color subcarrier is about 3.58MHz, even though the SNES tries to output dots at 5.37MHz. I've never really been able to reconcile many of these exact NTSC timing numbers. For example, 1364 master cycles per scanline, 262 scanlines per frame, and 59.94 frames per second would give a master clock speed of 21420637 Hz. 21477000 Hz would require almost 60.1 FPS.
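The mismatch described here is easy to reproduce. A minimal sketch, using only the figures from this post (the variable names are illustrative):

```python
CYCLES_PER_LINE = 1364
LINES_PER_FRAME = 262
cycles_per_frame = CYCLES_PER_LINE * LINES_PER_FRAME   # 357368

# Master clock implied by 59.94 frames per second.
implied_clock_hz = cycles_per_frame * 59.94

# Frame rate implied by a 21,477,000 Hz master clock.
implied_fps = 21_477_000 / cycles_per_frame

print(f"implied clock: {implied_clock_hz:.2f} Hz")
print(f"implied frame rate: {implied_fps:.4f} FPS")
```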
byuu

Post by byuu »

anomie wrote: Remember too that the NTSC color subcarrier is about 3.58MHz, even though the SNES tries to output dots at 5.37MHz.
I don't believe the color subcarrier is related to the length of dots. Dots are just 487pixels * 0.7kell. e.g. pixels that are visibly different to the human eye.
The color subcarrier only occurs once every scanline. It's also called a color burst. Formula:
hfreq = 4.5 * 10^6 / 286 = 15,734.2657342657hz
color_sc = (13 * 7 * 5) * (hfreq / 2) = 3579545.45454545hz
I don't know where the 13*7*5(455) comes from.
4.5 * 10^6 / 286 is the audio subcarrier. One of the requisites of NTSC was to sync with audio.

The subcarrier is the cause of the vertical refresh dropping from 60hz to 59.94hz, formula:
vfreq = hfreq / (525 / 2) = 59.94005994005994hz
It caused interference with the audio signal, so they lowered the framerate to make the dot position non-stationary, so any interference would not always occur at the same location, and thus, wouldn't be as noticeable to the human eye.
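The three formulas above can be evaluated exactly with rational arithmetic, which avoids the rounding that crept into the earlier numbers. A small Python sketch; the variable names follow the post, the 4.5 MHz figure is taken from it as given:

```python
from fractions import Fraction

AUDIO_OFFSET_HZ = Fraction(4_500_000)    # the 4.5 * 10^6 term from the post

hfreq = AUDIO_OFFSET_HZ / 286            # horizontal line rate
color_sc = (13 * 7 * 5) * (hfreq / 2)    # color subcarrier
vfreq = hfreq / Fraction(525, 2)         # vertical (field) rate

print(float(hfreq))      # about 15734.2657
print(float(color_sc))   # about 3579545.4545
print(float(vfreq))      # about 59.94006
```

Worked this way, the subcarrier comes out as exactly 315/88 MHz and the field rate as exactly 60/1.001 Hz.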

Also I quote this: "In order to decode the color signal, a pure 3.58 Mhz subcarrier signal in the proper phase is needed. It is generated locally by the color decoder. At the end of every scan line, in a part of the waveform called the back porch, there are about eight cycles of the subcarrier used for synchronizing. So long as the local oscillator does not gain or lose more than a quarter of a cycle in 63.5 microseconds (one scan line time) these "color bursts" will keep the local oscillator in sync. and in phase."

Since cycles aren't given a timing, I assume he means 3.58mhz * ~8 every scanline is used for the color subcarrier. The back porch is just jargon for what we call the hblank.

With all this in mind, we at least have the true formulas for getting higher precision accuracy for the length of scanlines, vertical hold, and the color subcarrier. But I'm still unable to make a correlation between these numbers and the SNES clock speed of 21477270hz that will satisfy there being 340-341 dots per scanline, with 525 scanlines being drawn 29.97 times every second.
byuu

Post by byuu »

I see. Our master cycle timing is wrong, it was rounded down.
The SNES clock rate is a multiple of the color burst rate.
hfreq = 4.5 * 10^6 / 286 = 15,734.2657342657hz
col_sub = (13 * 7 * 5) * (hfreq / 2) = 3579545.45454545hz
clock_hz = col_sub * 6;
So our SNES clock speed is actually 21477272.727273hz, not 21477270hz.
That's of course for master cycles. A cycle is the exact same length as a color burst, probably just a coincidence of the timing crystal used in the SNES.

I use the following algorithms:

Code: Select all

  clock_hz    = ((315.0 / 88.0) * 1000000.0) * 6.0;
  dot_hz      = clock_hz / 4.0;
  line_hz     = 4.5 * 1000000.0 / 286.0;
  vhold       = line_hz / 525.0;
  color_burst = 315.0 / 88.0;
  dot_count   = (487.0 * 0.7) * 525.0 * vhold;
To arrive at:
NTSC dots/line: 340.900000 (dot_count / line_hz)
SNES dots/line: 341.250000 (dot_hz / line_hz)

Yes, I realize my variables aren't very well named, thank you.
A difference exists of 0.35 dots/line. This is still -infinitely- easier to work with than the non-divisible (against the NTSC clock) cycle timings I was using before.
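For reference, the two dots/line figures and the 0.35 difference can be reproduced exactly. This is a sketch using the same constants as the code above, with rational arithmetic substituted for floating point; treating the Kell factor as exactly 7/10 is an assumption.

```python
from fractions import Fraction

color_burst_hz = Fraction(315_000_000, 88)
clock_hz = color_burst_hz * 6            # SNES master clock
dot_hz = clock_hz / 4                    # assumes 4 master cycles per dot
line_hz = Fraction(4_500_000, 286)

snes_dots_per_line = dot_hz / line_hz                  # 341.25
ntsc_dots_per_line = Fraction(487) * Fraction(7, 10)   # 340.9
difference = snes_dots_per_line - ntsc_dots_per_line   # 0.35

print(float(clock_hz), float(snes_dots_per_line), float(difference))
```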

Here's my theory: we are -wrong- in our assumption that dot_hz = clock_hz / 4.
If we try the below:

Code: Select all

  color_burst = 315.0 / 88.0;
  clock_hz    = ((color_burst) * 1000000.0) * 6.0;
  line_hz     = 4.5 * 1000000.0 / 286.0;
  vhold       = line_hz / 525.0;
  dot_count   = (487.0 * 0.7) * 525.0 * vhold;
  dot_hz_div  = clock_hz / dot_count;
  dot_hz      = clock_hz / dot_hz_div;
We get the following information:
SNES clock_hz: 21477272.727273
NTSC clock_hz: 3.579545
NTSC dot_hz: 5363811.188811
SNES dot_hz: 5363811.188811
SNES dot_hz_div: 4.004107
NTSC dots/line: 340.900000
SNES dots/line: 340.900000

If we go by the deviation from 4, we see that 1/0.004107 = 243.48673 (rounded). This could end up being the reason why some dots end up being triggered more than others. Interlacing, and syncing information stored in each horizontal line of the display, could be reasons why these two dots change positions. There could be other anomalies in the dot counter, as well. We can easily compensate for this by making a 341-word lookup table that reflects the SNES' assumed dot position. The counters may just be dividing by 4, even though they're -really- running at clock / 4.004107.

As far as I'm concerned, using the timing method I've described above is far more accurate to NTSC timing, and about the best we're going to get unless someone can figure out what the SNES is doing at a hardware level between each cycle. Statistical analysis and educated guesses can tell us that some dots appear longer than others, but we can't explain why.
I say we implement both anyway. The timing should be -far- more than enough to be as close to the SNES as we can ever hope to achieve.
byuu

Post by byuu »

Also worth noting, I found the document where TRAC (most likely?) got the info on the missing dot on scanline 240:
21477272.72 / (261*1364 + (1360+1364)/2) ~= 60.0988

262 scanlines per frame, 1364 cycles per scanline (one scanline
goes between 1360 and 1364 each frame)
I take issue with this. This cannot possibly be correct. The SNES cannot run at 60.0988hz, while your TV runs at 59.94hz (rounded)!
The image would desync and go all to hell unless the SNES is locked somewhere between each and every scanline to wait until the difference is equalized.
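To put a number on that desync, the sketch below recomputes the quoted frame rate and then estimates how long it would take a free-running SNES at that rate to get one whole frame ahead of a 60/1.001hz display. The drift estimate is an illustration added here, not a measurement, and the names are made up for the example.

```python
MASTER_HZ = 21477272.72

# 261 full scanlines plus one that alternates between 1360 and 1364 cycles.
avg_cycles_per_frame = 261 * 1364 + (1360 + 1364) / 2
snes_fps = MASTER_HZ / avg_cycles_per_frame

ntsc_fps = 60 / 1.001
seconds_per_frame_of_drift = 1 / (snes_fps - ntsc_fps)

print(f"SNES frame rate: {snes_fps:.4f}")
print(f"one frame of drift every {seconds_per_frame_of_drift:.2f} s")
```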

It is, however, possible that the SNES renders pixels to the PPU at this speed (buffering them), and then keeps them there, where there is a separate timer that sends the pixels from the video out of the SNES in sync with NTSC.
Again, I don't think tests are really going to give us definitive proof on this, but I think we could safely go by TRAC's and anomie's notes on the longer dots and the missing dot on scanline 240, and simultaneously adhere to only actually drawing 59.94(rounded) frames per second.
Whether we go about it by faking the dot positions returned, or manually holding the SNES locked for the difference of 60.0988hz - 59.94hz (rounded), is up to the emu author. I'll probably go for the former, myself.

5PM: Also, I was thinking... if you want to make the dots = clock / 4, then the kell factor can be changed via: (487.0 * n) = 341.25, 341.25 / 487.0 = ~0.70071869. I don't think that's right though as that would result in 341/342 dots/line instead of 340/341, which is what we're seeing. The kell is pretty standard and accepted as 0.7, but is definitely not exact.
anomie

Post by anomie »

Let's throw a few more numbers into the mix. I've run a test to measure the number of master cycles executed between NMIs, and a test to measure basically the length of a dot.

The frame length test (time between NMIs) records 0x70dd or 0x70de iterations of the fastrom 'INY' instruction (12 master cycles) per frame, and the NMI routine itself has 164 master cycles overhead. I've varied some of the parameters (the length of the NMI routine, slowrom instead of fastrom, etc) and the results multiply out the same. Counts of 357368 (1364*262) and 357364 (1364*262-4) master cycles per frame fit just right (and the alternating lengths fit another oddity I noticed in testing, BTW). If i turn on interlace, the extra-scanline frames get 0x714C, which fits with 358732 (1364*263) master cycles.

The dot length test has a sampling loop of

Code: Select all

lda $37
lda $3f
lda $3c
sta $80
Running with D=$2100 in fastrom, 72 master cycles. Except for WRAM refresh and samples including $143 and/or $147, every sample was 18 dots. WRAM refresh cycles were always 28 dots. $143/$147 samples were sometimes 11 dots (normally once per h-blank). Total dots averaged 340 dots per scanline.

Problems:
1. We still only have 524 scanlines per frame in non-interlace mode instead of 525. Where does that extra scanline go?
2. Back to the 357368/357364 cycles per frame... If we have 60/1.001 FPS, that's only ~21420539.46 Hz. We would need 358312.5 cycles per frame to make it work.

Now, an interesting observation. If we somehow add 1 master cycle per scanline (where might it be hiding though?), then interlace mode works out perfectly! 1365 cycles per scanline * 262.5 average scanlines per frame * (60/1.001) frames per second = 1.89e9/88 Hz!
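Both problem 2 and the closing observation can be verified exactly. A minimal Python sketch with rational arithmetic; the variable names are illustrative, the numbers are the ones in this post:

```python
from fractions import Fraction

ntsc_fps = Fraction(60000, 1001)              # 60/1.001
master_hz = Fraction(1_890_000_000, 88)       # 1.89e9/88

# Problem 2: average of the two measured frame lengths.
observed_avg = Fraction(357368 + 357364, 2)
implied_hz = observed_avg * ntsc_fps          # about 21420539.46
needed_cycles = master_hz / ntsc_fps          # 358312.5

# Observation: 1365 cycles/line over 262.5 lines/frame hits the clock exactly.
interlace_hz = 1365 * Fraction(525, 2) * ntsc_fps

print(float(implied_hz), float(needed_cycles), interlace_hz == master_hz)
```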
byuu

Post by byuu »

Is it possible the SNES just locks the processor during the extra scanline? I mean, we don't "see" the WRAM refresh period in our tests at all. We only notice it because it's right in the middle. What if there is synchronization between scanlines, or between frames, or both?
I really can't believe that the SNES outputs frames faster than ((4.5 * 10^6) / 286) / 525 full-frames per second (29.97....)... and since the SNES is just running slightly above NTSC, this shouldn't be too hard to do from a hardware standpoint.

The thing that messes up my non-interlace mode calculations is the total lack of information on NTSC non-interlace mode. Maybe the timing is different for this mode (it certainly should be if it ignores one scanline each frame).

1365 * 262.5 * (60 / 1.001) works out great! Didn't realize the vhold was so easy to calculate. These NTSC numbers have a lot of different ways to solve for them...

Using this:

Code: Select all

  clock_hz    = ((315.0 / 88.0) * 1000000.0) * 6.0;
  dot_hz      = clock_hz / 4.0;
  line_hz     = 4.5 * 1000000.0 / 286.0;
(dot_hz / line_hz) = 341.25
341.25 * 4 = 1365
So then if the SNES did run every scanline at 1365 cycles/scanline, the dot division really would be 4 (which matches our tests). Interesting...
What if the SNES really does use 1365cycles/scanline, and the reason we can only read 0-339 is because two dots are 1.5x as long (0-340), and the extra .25dots is either done while the SNES is being held like WRAM refresh, or $213c divides by 4 and ignores the decimal places?
Line 240 could just be doing something right there, hence why we can never latch the last pixel. Grah >___<

Code: Select all

1365 * 525          = 716625
(1364 * 524) + 1360 = 716096
                    ~ 529 cycle difference
Hmm.... the only way to satisfy this equation:
(n * 524) + 1360 = 716625
(n * 524) = 715265
n = (715265 / 524) = ~1365.009542
So then:
((715265 / 524) * 524) + 1360 = 1365 * 525 = 716625
Perhaps every scanline really is 1365.01 cycles. So every hundredth scanline, it gets ahead by one cycle. After a little over 500 scanlines (524), it is now 5 cycles ahead, which it could equal out by making one scanline 1360.

There's also the following:
((1364 * t) * 524) + (1360 * t) = 716625

714736t + 1360t = 716625
716096t = 716625
t = 716625 / 716096
t = ~1.00073872...
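Both solutions above can be checked quickly. A small sketch; `n` and `t` are the same unknowns as in the equations, everything else is illustrative naming:

```python
from fractions import Fraction

TARGET = 1365 * 525                  # 716625 cycles per two fields
MEASURED = 1364 * 524 + 1360         # 716096 cycles

# (n * 524) + 1360 = TARGET
n = Fraction(TARGET - 1360, 524)

# ((1364 * t) * 524) + (1360 * t) = TARGET
t = Fraction(TARGET, MEASURED)

print(TARGET - MEASURED)   # cycle difference
print(float(n))            # about 1365.009542
print(float(t))            # about 1.000738
```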

---

Alright, this is how I'm going to implement this:
cycles/scanline * lines/frame * (fps / 1.001) = ((ntsc_color_burst) * 6) = snes_clock_rate
or
1365 * 525 * (30 / 1.001) = ((315 / 88) * 1000000) * 6 = 1.89e9/88hz;
All scanlines are going to have 1,365 cycles.
I will write a small routine to generate a 2 * 1365 word table (test for y=240, the rest would use the same table), that can index the current cycle to an x dot position. I'll let the table deal with the $143/$147/scanline 240 anomalies.
It won't be perfect to the real SNES, but it will be perfect in appearance to programs: no program could ever try and test for the extra dot on scanline 240 and deadlock if it exists (because it won't show up after the table conversion), and the actual FPS will be perfect to a real SNES/NTSC still. Any real cons to this method? A single master cycle isn't enough to transfer even a single byte via DMA/HDMA/opcode. And I would really -hope- that no game needed that kind of precision for something like HDMA transfers to $214x to keep the sound processor in sync.
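A table generator along these lines might look like the sketch below. It is only one possible reading of the plan: dots $143 and $147 are given 6 master cycles, every other dot 4, and the one remaining cycle needed to reach 1365 is parked on a hypothetical `pad_dot` (the thread has not established where that cycle really goes). The separate scanline 240 table is not modeled.

```python
LONG_DOTS = {0x143, 0x147}   # dots reported as 1.5x long (6 master cycles)

def build_dot_table(pad_dot=0x153):
    """Map each master cycle of a 1365-cycle scanline to a dot number 0..339.

    pad_dot is hypothetical: it is the dot that receives the one extra
    master cycle needed to stretch 1364 cycles to 1365.
    """
    table = []
    for dot in range(340):
        length = 6 if dot in LONG_DOTS else 4
        if dot == pad_dot:
            length += 1
        table.extend([dot] * length)
    return table

table = build_dot_table()
print(len(table), table[0], table[-1])
```

Usage would then be a plain lookup, `table[cycle]`, for any master cycle position 0..1364 within the scanline.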

I'm not sure what to do about non-interlace mode yet.
anomie

Post by anomie »

byuusan wrote:Is it possible the SNES just locks the processor during the extra scanline? I mean, we don't "see" the WRAM refresh period in our tests at all. We only notice it because it's right in the middle. What if there is synchronization between scanlines, or between frames, or both?
There's not a whole scanline missing anywhere, unless the SPC700 gets stopped too (which would be difficult since it has its own oscillator and doesn't connect with the refresh line or anything).
byuusan wrote: The thing that messes up my non-interlace mode calculations is the total lack of information on NTSC non-interlace mode. Maybe the timing is different for this mode (it certainly should be if it ignores one scanline each frame).
Is there a NTSC non-interlace mode?
byuusan wrote: What if the SNES really does use 1365cycles/scanline, and the reason we can only read 0-339 is because two dots are 1.5x as long (0-340), and the extra .25dots is either done while the SNES is being held like WRAM refresh, or $213c divides by 4 and ignores the decimal places?
One dot must be 1 master cycle longer for 1365 to work with 340 dots. My guess is that one of the WRAM refresh dots is the long one, and WRAM refresh is then 41 master cycles long.

Line 240 would probably have 1361 instead of 1365 master cycles...
byuusan wrote: ...no program could ever try and test for the extra dot on scanline 240 and deadlock if it exists (because it won't show up after the table conversion)
But if the extra 4 cycles are present, it might be possible to desync far enough over the course of several frames and then latch an unexpected value... Not likely though.
byuusan wrote: A single master cycle isn't enough to transfer even a single byte via DMA/HDMA/opcode.
4 master cycles, two of which are enough for an extra byte.
byuu

Post by byuu »

I've read on two pages that non-interlaced NTSC is not part of the NTSC standard, but that it does in fact exist in nearly all modern televisions.
The SNES just widens the beam cannon to draw 'two' scanlines at once, but it doesn't cover both fully, only about 70%, which is why the screen almost looks like it has scanlines on a TV, but interlaced modes 5/6 do not.
* This is all assuming I understand what was being said correctly in the documents I read.
anomie wrote: One dot must be 1 master cycle longer for 1365 to work with 340 dots. My guess is that one of the WRAM refresh dots is the long one, and WRAM refresh is then 41 master cycles long.
I like that theory. It would be difficult, but we could probably time an exact cycle count from the WRAM refresh if we ran enough tests similar to your lda $37/$3f/$3c/sta $80 loop and averaged the results.
I really wish we knew what WRAM refresh was... maybe it's really NTSC synchronization. That seems far more believable than WRAM that can't keep its state on its own without requiring refreshing that needs the CPU locked. I realize the /WRAM pin goes high during this period, but still...

Line 240 wouldn't be 1361, though. That would throw off the equation: (1365 * 525 * 30 / 1.001) = (315 / 88) * 1000000 * 6 = 1.89e9/88hz
There's something else odd about it. It's very suspicious that line 240 is the first vblank line (regardless of overscan). I think the 4 cycle loss there is probably related to this.

I was thinking that we could just do something like your test of executing iny repeatedly for an entire frame to get the exact # of cycles executed per frame with interlace on and off, and just spread out the difference from that and what it would be if all scanlines were 1365 over the 525 visible lines (or 524 lines for non-interlace) via locking the CPU during the extra time. That will satisfy NTSC timing, and SNES cycles per frame timing at the same time. Adding in a lookup buffer to the cycle/scanline will satisfy the dot anomalies as well as line 240. Wouldn't this be sufficient for 'perfect' emulation?
anomie

Post by anomie »

byuusan wrote: I like that theory. It would be difficult, but we could probably time an exact cycle count from the WRAM refresh if we ran enough tests similar to your lda $37/$3f/$3c/sta $80 loop and averaged the results.
But what would we time against?
byuusan wrote: I really wish we knew what WRAM refresh was... maybe it's really NTSC synchronization. That seems far more believable than WRAM that can't keep its state on its own without requiring refreshing that needs the CPU locked. I realize the /WRAM pin goes high during this period, but still...
The PPU doesn't stop during this time, only the CPU (including DMA, etc).
byuusan wrote: Line 240 wouldn't be 1361, though. That would throw off the equation: (1365 * 525 * 30 / 1.001) = (315 / 88) * 1000000 * 6 = 1.89e9/88hz
The line 240 thing is only in SNES non-interlace mode, remember.
byuusan wrote: There's something else odd about it. It's very suspicious that line 240 is the first vblank line (regardless of overscan). I think the 4 cycle loss there is probably related to this.
Nothing terribly suspicious. It's the best choice for trying to alter the color synchronization, you have the whole V-Blank to adjust to the new phase.
byuusan wrote: ...and just spread out the difference from that and what it would be if all scanlines were 1365 over the 525 visible lines (or 524 lines for non-interlace) via locking the CPU during the extra time.
What about the APU? I haven't been able to find this missing scanline...
whicker
Trooper
Posts: 479
Joined: Sat Nov 27, 2004 4:33 am

Post by whicker »

byuusan wrote:I've read on two pages that non-interlaced NTSC is not part of the NTSC standard, but that it does in fact exist in nearly all modern televisions.
The SNES just widens the beam cannon to draw 'two' scanlines at once, but it doesn't cover both fully, only about 70%, which is why the screen almost looks like it has scanlines on a TV, but interlaced modes 5/6 do not.
* This is all assuming I understand what was being said correctly in the documents I read.
Granted, I'll admit to knowing next to nothing about the PPU, but maybe I can clarify something.

I'll start with a question: How does a run-of-the-mill interlaced television present in millions of homes know if it's supposed to be drawing an even or an odd frame?

If the even and odd lines switched, wouldn't you agree that it would be definitely noticeable to anyone watching television? If when you flipped to a different channel, you had a 50% chance of the picture looking right?

So, given that, is there something in the signal that tells the TV that it should be drawing an even or odd frame...?

Well, there is.

But now wait a second... What if we told the TV to draw an odd frame, and then told the TV to draw an odd frame....

And then told the TV to draw an odd frame...


Aha, scanlines. Aha, a 60Hz screen update rate.


(I'm only explaining things this way so everybody can understand what I'm saying. I've left the technical details out).
anomie

Post by anomie »

SPC700 timing experiments: All involve loading this program into the SPC700, then looking at the output of $2140:

Code: Select all

        .DB $e8, $00       ; 0200: MOV A,#$00
        .DB $8d, $00       ; 0202: MOV Y,#$00
        .DB $8f, $00, $01  ; 0204: MOV $01,#$00
        .DB $8f, $01, $00  ; 0207: MOV $00,#$01
        .DB $7a, $00       ; 020a: ADDW YA,$00
        .DB $da, $f4       ; 020c: MOVW $f4,YA
        .DB $5f, $0a, $02  ; 020e: JMP $020a
If I DMA 0x4000 words from $2140, I see a range of about 0x3e1. There are never any gaps in the numbering. Typically, it'll read 17 copies of one value before moving on to the next. 14/15 over WRAM Refresh, and occasionally a WRAM refresh will steal a copy from an adjacent 17. The pattern is mostly "15 17 17 17 17 14 17 17 17 17". I think the interruptions in the pattern can account for the 'missing' 4 cycles here.

If I read $2140 every NMI, the separation is 0x522 or 0x523 in non-interlace mode, and 0x527 or 0x528 for the long frames in interlace mode.
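One way to read these per-NMI separations is as counter increments per scanline. The sketch below assumes 262 scanlines for a normal frame and 263 for a long interlace frame, and that the SPC700 loop adds exactly 1 to the counter on each pass; the names are illustrative.

```python
short_frame = (0x522, 0x523)   # separations seen in non-interlace mode
long_frame = (0x527, 0x528)    # separations seen on long interlace frames

# Assumed frame lengths in scanlines.
per_line_short = [count / 262 for count in short_frame]
per_line_long = [count / 263 for count in long_frame]

# Extra counter increments picked up by the one additional scanline.
extra_per_long_frame = long_frame[0] - short_frame[0]

print(per_line_short, per_line_long, extra_per_long_frame)
```

Under those assumptions both modes give roughly five increments per scanline, and the long frame gains about one scanline's worth, which is at least consistent with the extra scanline being a full-length one.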
Overload
Hazed
Posts: 70
Joined: Sat Sep 18, 2004 12:47 am
Location: Australia

Post by Overload »

I did a test a month or two ago to determine which horizontal counter values could not be latched and the results showed that only 8 pixels could not be latched (OPHCT: 135-142, $87 - $8E). I tested on both my PAL and NTSC systems.

Using DMA is a good way to check for irregularities, just DMA from $4212 to RAM.
anomie

Post by anomie »

Overload wrote:I did a test a month or two ago to determine which horizontal counter values could not be latched and the results showed that only 8 pixels could not be latched (OPHCT: 135-142, $87 - $8E). I tested on both my PAL and NTSC systems.
Current theory is that the refresh waits until completion of the current instruction before pausing the CPU. And if you have a regular latch-and-poll loop you'll see that the cycle including the refresh period is 10 dots longer.

I've noticed that $85 and $90 are only latched 3/4 as often as most other dots, and $86 and $8f are latched only 1/4 as often (this was the same test showing $143 and $147 latched 6/4 as often as other dots).
byuu

Post by byuu »

anomie wrote: But what would we time against?
Just the $213c latch. We'd need to get an emulator to begin NMI at the exact right cycle (we know it's about 4/5.5dots in, but we'd need a perfect cycle position instead) and have perfect master cycle->opcode counts.
Then we could use a combination of opcodes that would be off-by-one cycle from a whole dot right after the refresh period. We could then change the refresh period in the emulator from 40 to 41 to see if it matches the results on an NTSC SNES, if done right, changing that one cycle would result in a different dot position in $213c. We could use one of the 'add one cycle if...' conditions to get the cycle position on an odd number.
whicker wrote: But now wait a second... What if we told the TV to draw an odd frame, and then told the TV to draw an odd frame....

And then told the TV to draw an odd frame...

Aha, scanlines. Aha, a 60Hz screen update rate.
Can you go into more technical details if you have them, please?
Are you saying that non-interlace mode draws only odd frames? (that would explain why we have 262 scanlines/screen instead of 263 if it were drawing even ones)
And are you saying the refresh rate is 60hz, instead of 60/1.001hz (59.94....hz)?
I would think that the television could not adjust its built in hz rating, regardless of the picture being interlaced or not. So the problem we have is where that 0.06...hz goes each second.
byuu

Post by byuu »

On a side note: I stand corrected about WRAM refresh. The ~40 master cycle delay probably is just that.
Apparently, the D (dynamic) in DRAM means that it needs to be refreshed periodically.
http://en.wikipedia.org/wiki/DRAM
Just reading one byte in a row will refresh the entire row, and since memory accesses are 8 cycles, it is my theory that the refresh updates 5 rows per scanline, and the delay is 40 cycles, and not 41. This would ruin our 341.25 theory, though :(
Regardless of SNES terminology, I think WRAM should be referred to as DRAM, since that's what it is.
whicker

Post by whicker »

byuusan wrote: Can you go into more technical details if you have them, please?
Are you saying that non-interlace mode draws only odd frames? (that would explain why we have 262 scanlines/screen instead of 263 if it were drawing even ones)
And are you saying the refresh rate is 60hz, instead of 60/1.001hz (59.94....hz)?
I would think that the television could not adjust its built in hz rating, regardless of the picture being interlaced or not. So the problem we have is where that 0.06...hz goes each second.
I'm positive that a "non-interlace" mode with the dark scanlines only does only one of the two fields. More correctly, the same field is scanned twice without inverting the color subcarrier. This gives a rock-solid, non-flickering stable picture that can change 59.94 times a second (or thereabout).

I've actually tried this with a VIC-20, where it was really easy to turn on and off interlace mode with a POKE in BASIC. Normally you've got black scanlines, but when you turn on interlace they disappear. Of course, then you've got the whole interlace ugliness when things move onscreen, so it's normally off.

"odd-field odd-field frames" do mess up the timing numbers, however. If you take a look at http://www.ntsc-tv.com/ntsc-main-02.htm the second field starts with the beam in the middle, and I have no idea how this gets compensated for in timing if the video chip is again trying to send an odd field. Maybe there's some way to get the beam to instead travel back to the odd-line starting position, or maybe one can just ignore that line? Edit: ugh, that has to be, at the end of the last line the beam is told to go back up to the odd line position instead of an even line?

I fricken hate how console video is such a "black art" where nobody writes anything coherent about how it really works; and how to, for example, make your own video output circuit with an FPGA and some analog components that you can connect to your television. (Yes, there are black and white examples. I'm talking about NTSC color)

I didn't really mean to say that the refresh rate was 60Hz. Yeah, it's 59.94. I'm pretty sure there's some leeway up or down in the frame rate that a TV will be able to lock to, but I have no clue how much.
anomie

Post by anomie »

byuusan wrote:
But what would we time against?
Just the $213c latch.
If the refresh is 41 cycles, with 9 dots of 4 cycles and one of 5 cycles, that adds to exactly 10 dots. If the refresh is 40 cycles with all 10 dots 4 cycles, that adds to exactly 10 dots.
byuusan wrote: Then we could use a combination of opcodes that would be off-by-one cycle from a whole dot right after the refresh period.
Main CPU cycles are 6, 8, or 12 master cycles. All known dots are 4 or 6 master cycles. Any combination will always add to an even number of dots. If we have a refresh of 41 cycles with a scanline length of 1365, that's 1324 effective cycles. If it's 40 cycles with a scanline length of 1364, that's 1324 effective cycles. About all we could do is detect the case where it's 40 and 1365, and watch for the drift because of 1325 effective cycles over several scanlines. Best bet, IMO: set up an IRQ far away from NMI, fastrom, SEI and WAI for it, and LDA $37. Hope you latch a stable value (i.e. always latch line 100 dot 42). Then SEI and WAI again, but this time insert 1767 NOPs before the LDA $37. If I'm correct, 1324 would latch H+5 while 1325 would latch only H+1 (both 16 scanlines later, BTW).
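The H+5 versus H+1 prediction follows from simple modular arithmetic. A minimal sketch, assuming a fastrom NOP costs 12 master cycles (the same as the INY figure given earlier) and that the leftover cycles fall on ordinary 4-cycle dots:

```python
NOP_MASTER_CYCLES = 12                 # assumed: 2 CPU cycles of 6 master cycles each
delay = 1767 * NOP_MASTER_CYCLES       # 21204 master cycles of inserted NOPs

results = {}
for effective_cycles in (1324, 1325):
    # Whole scanlines skipped, and master cycles left over on the final line.
    lines, remainder = divmod(delay, effective_cycles)
    results[effective_cycles] = (lines, remainder // 4)
    print(f"{effective_cycles}: {lines} scanlines later, H+{remainder // 4}")
```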
We could use one of the 'add one cycle if...' conditions to get the cycle position on an odd number.
"add one [CPU] cycle if" adds 6 master cycles, or 1.5 normal dots.
Aha, a 60Hz screen update rate.
60+53880/983477 (about 60.054785), assuming the SNES runs at 1.89e9/88 Hz with 1365 cycles per scanline and those odd frames are all 262 scanlines long, and every other frame 4 cycles short. We have fairly conclusive results for 1365*262 and 1365*262-4, so it's either the master clock or the frame output rate that has to bend...
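For anyone who wants to redo that fraction, here is a small sketch using exact rational arithmetic. It takes the assumptions stated above at face value (master clock of 1.89e9/88 Hz, 1365 cycles per scanline, 262 scanlines per frame, every other frame 4 cycles short); the variable names are just for illustration.

```python
from fractions import Fraction

MASTER_CLOCK_HZ = Fraction(1_890_000_000, 88)  # assumed master clock
CYCLES_PER_SCANLINE = 1365
SCANLINES_PER_FRAME = 262

long_frame = CYCLES_PER_SCANLINE * SCANLINES_PER_FRAME  # 1365*262
short_frame = long_frame - 4                            # every other frame

# Average frame length over a long/short pair, kept exact.
average_frame = Fraction(long_frame + short_frame, 2)

frame_rate = MASTER_CLOCK_HZ / average_frame
whole, remainder = divmod(frame_rate, 1)

print(f"{whole} + {remainder} Hz")
print(float(frame_rate))
```

The exact result is 60 + 53880/983477 Hz, which is noticeably above both 60 Hz and 60/1.001 Hz. That is the mismatch the post is pointing at.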

It's too bad we have no one who can hook up some device to just directly measure the framerate for us and settle this whole thing.
Reznor007
Lurker
Posts: 118
Joined: Fri Jul 30, 2004 8:11 am

Post by Reznor007 »

http://unemulated.emuunlim.com

Email Guru and ask if he can measure a SNES. He has tons of equipment for hardware analysis and ROM dumping.

On a related note, I know that TVs are very lenient as to the input. I have a RGB->NTSC SVideo converter for using arcade game PCBs on a normal TV, and many of the games I have use very odd refresh rates, yet they work fine on TVs. Some example games are Mortal Kombat (uses a 53.204948Hz refresh), NFL Blitz (57Hz refresh), and Galaga (60.606060Hz refresh).
anomie
Lurker
Posts: 151
Joined: Tue Dec 07, 2004 1:40 am

Post by anomie »

anomie wrote:If I read $2140 every NMI, the separation is 0x522 or 0x523 in non-interlace mode, and 0x527 or 0x528 for the long frames in interlace mode.
More info: it's all right about 50-50, with maybe a slight bias one way or the other. However, when the SNES is 'cold' it's less stable and only gets to 50-50 once it has warmed up a bit.

And occasionally we get an outlier value; I've seen as high as 0x627 (well, once I saw 0xfe22, who knows where that came from), but normally it's just 1 or 2 off. But the outliers are neither regular nor frequent enough to be correcting for the frame rate.
byuu

Post by byuu »

I've been trying to get some timing results from my SNES today, but am having no luck.
I set FastROM mode, jump to $80xxxx, and set D to $2100. Disable NMI and IRQ, and then just - lda $4212 : bpl -, - lda $4212 : bmi -, lda $37.

I end up reading 000a on my SNES, and 0014 through emulation. Ignoring the difference, I try running 128 scanlines worth of NOPs:

128 * 1325 = 169600cycles

169600 / 12 = 14133.3

The results are on an SNES: 0031x, 0080y
Emulation: 000fx, 0080y
Difference from start: 0027x, 0080y
($27-$f)*4 = 96 cycle difference

So by executing 128 scanlines worth of code, I end up with 96 extra cycles, suggesting each scanline is .75 cycles too long, for an average of 1364.25cycles/scanline.
It's really hard to get exact numbers, since being off by even one instruction results in a deviance of at least 3 whole dots for the fastest instruction possible...
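The numbers in this post can be replayed as a sketch. It assumes a FastROM NOP is 12 master cycles and a dot is 4 master cycles, and it mirrors the post's own expression ($27-$f)*4 rather than claiming that is the right comparison to make.

```python
MASTER_PER_NOP = 12  # assumption: FastROM NOP = 2 CPU cycles x 6 master cycles
MASTER_PER_DOT = 4   # assumption: normal dot width

scanlines = 128
assumed_cycles_per_line = 1325

total_cycles = scanlines * assumed_cycles_per_line  # cycles of NOPs to burn
nop_count = total_cycles / MASTER_PER_NOP           # not a whole number

# Latched H values from the post (hex).
hw_start_x, hw_end_x = 0x0A, 0x31
emu_end_x = 0x0F

hw_delta_x = hw_end_x - hw_start_x                  # "difference from start"
extra_cycles = (hw_delta_x - emu_end_x) * MASTER_PER_DOT
extra_per_scanline = extra_cycles / scanlines

print(total_cycles, round(nop_count, 1))
print(hex(hw_delta_x), extra_cycles, extra_per_scanline)
```

Because 169600 is not a multiple of 12, the NOP count has to be rounded, which is one place where a few dots of error can creep in.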

I've in fact spent an hour or two trying to just get the NMI latch to match the SNES, and it's -always- off.

Also worth noting is that somehow, the vblank flag is cleared faster than in emulation, where it's cleared at dot 0, 0. So it would almost have to be cleared on the last scanline of the screen...

An interesting thing to toy around with is $2133 bit 1. It will draw OAM sprites interlaced, even in modes 0-4/7. Or at least, they look identical onscreen to what BG tiles do on interlaced BG5/6. So if it's doing this, then it can't just be drawing odd scanlines only...
My TV is horrible and cuts off 10 pixels on the top and bottom, and 20 pixels on the left, resulting in 20 pixels of visible hblank on the right, but it may be interesting to try putting a 64x64 sprite at y: 250, and toggling $2133 bits 0 and 1, and seeing what happens at the top of the screen.
anomie
Lurker
Posts: 151
Joined: Tue Dec 07, 2004 1:40 am

Post by anomie »

anomie wrote:Best bet, IMO: set up an IRQ far away from NMI, fastrom, SEI and WAI for it, and LDA $37. Hope you latch a stable value (i.e. always latch line 100 dot 42). Then SEI and WAI again, but this time insert 1767 NOPs before the LDA $37. If I'm correct, 1324 would latch H+5 while 1325 would latch only H+1 (both 16 scanlines later, BTW).
IRQ set for $64,$64. No nops latches $6f,$64 or $70,$64. With nops latches $74,$74 or $75,$74. Unless I made a mistake somewhere, 1324 effective cycles per scanline is pretty well proven.
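Working backwards from those latches gives the same answer. The sketch below assumes 1767 FastROM NOPs at 12 master cycles each and 4 master cycles per dot, and ignores any long dots in between; `effective_from_latches` is a hypothetical helper, not anything from a real tool.

```python
def effective_from_latches(before, after, nop_count,
                           nop_cycles=12, dot_cycles=4):
    """Estimate effective master cycles per scanline from two (H, V) latches.

    Assumes the only code between the two latches is nop_count NOPs and
    that every dot crossed is dot_cycles master cycles wide.
    """
    (h0, v0), (h1, v1) = before, after
    total = nop_count * nop_cycles
    leftover = (h1 - h0) * dot_cycles
    return (total - leftover) / (v1 - v0)


# Both observed pairs: without NOPs -> with NOPs.
for before, after in [((0x6F, 0x64), (0x74, 0x74)),
                      ((0x70, 0x64), (0x75, 0x74))]:
    print(effective_from_latches(before, after, 1767))
```

Both pairs move by 5 dots and 16 scanlines, which corresponds to 1324 effective cycles per scanline. A 1325-cycle scanline would have shown up as a 1-dot shift instead.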
byuusan wrote:Disable NMI and IRQ, and then just - lda $4212 : bpl -, - lda $4212 : bmi -, lda $37.

I end up reading 000a on my SNES, and 0014 through emulation.
Are you really getting a constant 000a with that code?
An interesting thing to toy around with is $2133 bit 1. It will draw OAM sprites interlaced, even in modes 0-4/7. Or at least, they look identical onscreen to what BG tiles do on interlaced BG5/6. So if it's doing this, then it can't just be drawing odd scanlines only...
Hrm... I just ran a test, and it doesn't appear that $2133 bit 1 will activate the 'interlace mode' 263-scanline deal. And there is definitely a difference between $2133=2 and $2133=3: for an OBJ with a "1/2"-pixel-high red line overlapping a BG with a 1-pixel-high white line, $2133=3 clearly shows 2 lines while $2133=2 looks like 1 line changing color each frame. So it very well could be constantly outputting 'odd' fields when $2133 bit 0 is clear.
On a related note, I know that TVs are very lenient as to the input. I have a RGB->NTSC SVideo converter for using arcade game PCBs on a normal TV, and many of the games I have use very odd refresh rates, yet they work fine on TVs.
Or is the converter just skipping or stuttering frames to bring it very close to 60/1.001 Hz?


On a side note of my own... On my crappy setup, I can actually SEE the refresh, and with the IRQ test I could SEE just where the WAI ended, and where the NOPs left off and it started polling $4212 for v-blank to update the numbers. Apparently something in there is picking up interference from the S-CPU and the different execution patterns cause different interference patterns on my screen...

Oh, and the outlier values I mentioned in the APU test: they're very probably the mis-read samples I get occasionally when I read $2140 in the middle of a write to $f4. I was tired last night...
byuu

Post by byuu »

Are you really getting a constant 000a with that code?
I'll recreate the test ROM and post it here tomorrow, but I'm pretty sure that I do. I have to reset once of course since the copier doesn't jump in at exactly the right point.
Hrm... I just ran a test, and it doesn't appear that $2133 bit 1 will activate the 'interlace mode' 263-scanline deal. And there is definitely a difference between $2133=2 and $2133=3: for an OBJ with a "1/2"-pixel-high red line overlapping a BG with a 1-pixel-high white line, $2133=3 clearly shows 2 lines while $2133=2 looks like 1 line changing color each frame. So it very well could be constantly outputting 'odd' fields when $2133 bit 0 is clear.
Hmm... well, my TV -does- suck, but I tried it with a 16x16 sprite in mode1 and mode5 with interlace enabled, and it looked the same. The background was blue, and the sprite was red (A, B, C, and D were drawn on each 8x8 tile).
In modes 5/6, I get noticeable degradation of quality. The red sprite against the blue background almost turns into a brownish color. Hard to explain, but it's extremely hard to read the letters. Whereas I don't get that color degradation with mode 1, I was pretty sure I could make out all pixels on the actual 16x16 tile, even though it was crushed in half. If it swapped lines every frame, then I would think I could see it flickering like mad. It was a very solid image, though. It may have been cutting out every other line on the sprite, and I just didn't notice, but I don't think so... I'll throw some more tests at it.
anomie
Lurker
Posts: 151
Joined: Tue Dec 07, 2004 1:40 am

Post by anomie »

byuusan wrote:I'll recreate the test ROM and post it here tomorrow, but I'm pretty sure that I do. I have to reset once of course since the copier doesn't jump in at exactly the right point.
Run the test for multiple frames too, if you haven't been... I typically sample and write the value to the screen every frame so if the value tends to change I can see it well.
Hmm... well, my TV -does- suck, but I tried it with a 16x16 sprite in mode1 and mode5 with interlace enabled
Try it with only OBJ interlace enabled? $2133 = 2, not 3.
In modes 5/6, I get noticeable degradation of quality. The red sprite against the blue background almost turns into a brownish color. Hard to explain, but it's extremely hard to read the letters.
The problem is that the NTSC color carrier has a granularity of about 6 master cycles (at least when used for digital output, analog has better fading), but the SNES is trying to change the color every 2 master cycles in hires mode. So each pixel gets about 1/3 of the color it should...
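The "about 6 master cycles" and "about 1/3" figures can be sanity-checked with a sketch. It assumes the 1.89e9/88 Hz master clock used earlier in the thread and the standard NTSC color subcarrier of 315/88 MHz; whether a real encoder behaves this cleanly is a separate question.

```python
from fractions import Fraction

MASTER_CLOCK_HZ = Fraction(1_890_000_000, 88)   # assumed SNES master clock
COLOR_CARRIER_HZ = Fraction(315_000_000, 88)    # NTSC subcarrier, ~3.579545 MHz

HIRES_PIXEL_CYCLES = 2  # hires mode: new pixel every 2 master cycles
LORES_PIXEL_CYCLES = 4  # normal modes: new pixel every 4 master cycles

# Master cycles that fit in one full cycle of the color carrier.
cycles_per_carrier = MASTER_CLOCK_HZ / COLOR_CARRIER_HZ

hires_share = Fraction(HIRES_PIXEL_CYCLES) / cycles_per_carrier
lores_share = Fraction(LORES_PIXEL_CYCLES) / cycles_per_carrier

print(cycles_per_carrier, hires_share, lores_share)
```

Under these assumptions a hires pixel spans a third of a carrier cycle and a normal pixel spans two thirds, which fits the observation that the color smearing is much worse in modes 5/6 than in mode 1.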
Whereas I don't get that color degradation with mode 1, I was pretty sure I could make out all pixels on the actual 16x16 tile, even though it was crushed in half. If it swapped lines every frame, then I would think I could see it flickering like mad.
Well... When the SNES does this:

Code:

FIELD 1e    FIELD 1o    FIELD 2e    FIELD 2o
AAAAAAAA                AAAAAAAA
            BBBBBBBB                BBBBBBBB
CCCCCCCC                CCCCCCCC
            DDDDDDDD                DDDDDDDD
the human eye tends to combine it into a full image (and phosphor persistence probably helps too). Interlace ($2133 bit 0 set) does this. But if you draw the As and Bs on the same line instead, the flickering may become noticeable.