It's more of an educational interest than a requirement, though it really would be a requirement if the pipeline were longer.
Basically, the S-CPU has two stages: a fetch and decode stage, and an execution stage.
Analogy: say you want to unload a truck and put items onto a shelf. Working alone, you spend 1 minute getting an item from the truck and another minute putting that item on the appropriate shelf, so 2 minutes per item. With a pipeline, imagine you have two people: one takes an item off the truck and hands it to the other, who puts it on the shelf while the first goes back for the next item. Once the first item has been handed over, this process finishes one item per minute instead of one every 2 minutes.
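To put numbers on the analogy, here is a small C++ sketch. It assumes every stage takes exactly one minute and that neither worker ever has to wait on the other; the function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Minutes needed when one worker performs every stage for each item in turn.
std::uint64_t serial_minutes(std::uint64_t items, std::uint64_t stages = 2) {
    return items * stages;
}

// Minutes needed with one worker per stage, each stage taking one minute.
// The first item needs `stages` minutes to reach the shelf; after that,
// one more item is finished every minute.
std::uint64_t pipelined_minutes(std::uint64_t items, std::uint64_t stages = 2) {
    assert(stages >= 1);
    if (items == 0) return 0;
    return stages + (items - 1);
}
```

For 10 items, serial_minutes(10) gives 20 and pipelined_minutes(10) gives 11: the pipeline pays one extra minute to fill, then finishes one item per minute.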
Well, that's how the S-CPU works. Things like I/O cycles, the IRQ test happening one cycle before the end of the opcode, cli : rti tricks, and so on suddenly make a lot more sense once you understand how this pipeline works.
An example of a linear process:
inc $12
- read 0xe6
- read 0x12
- read [0x12]
- i/o
- increment read value
- IRQ test
- write [0x12]
cli
- read 0x58
- IRQ test
- i/o
- p.i = 0
The same two instructions as a pipelined process:
Legend:
/ work cycle
\ bus cycle
(both happen at the exact same time)
/ <empty>
\ read 0xe6
/ idle
\ read 0x12
/ idle
\ read [0x12]
/ increment read value
\ i/o
/ idle
\ write [0x12]
- IRQ test
/ idle
\ read 0x58
/ idle
\ i/o
- IRQ test
/ p.i = 0
\ read next opcode
Now, the only problem is that I have no idea how to implement such a multi-tasking process efficiently in a single-threaded environment in C++ :/
Again, the simulation currently in use, built around last_cycle(), works just fine and covers every known edge case. This is just an academic interest.
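For what it's worth, one possible single-threaded approach is to let each bus cycle queue the work that should overlap the following bus cycle, then run that queued work at the start of the next cycle. What follows is only a rough sketch of that idea, not the emulator's actual code: ToyCpu, cycle(), inc_dp() and the trace strings are all hypothetical names, IRQ testing is left out entirely, and std::function is used for clarity rather than speed.

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy model of a two-stage pipeline; not real emulator classes.
struct ToyCpu {
    std::uint8_t ram[256] = {};
    std::uint8_t data = 0;            // internal data latch
    bool irq_disable = true;          // p.i
    std::vector<std::string> trace;   // "/ work" and "\ bus" lines, in order

    std::string work_name = "<empty>";
    std::function<void()> work;       // work queued by the previous cycle

    // One clock cycle. The work queued by the previous cycle overlaps this
    // cycle's bus access; running the two back to back is equivalent as long
    // as they do not touch the same state.
    void cycle(std::string bus_name, std::function<void()> bus,
               std::string next_name = "idle",
               std::function<void()> next_work = nullptr) {
        trace.push_back("/ " + work_name);
        if (work) work();
        trace.push_back("\\ " + bus_name);
        if (bus) bus();
        work_name = std::move(next_name);
        work = std::move(next_work);
    }

    // inc dp: the increment is queued by the read and runs during the i/o cycle.
    void inc_dp(std::uint8_t addr) {
        cycle("read opcode", nullptr);
        cycle("read operand", nullptr);
        cycle("read [dp]", [this, addr] { data = ram[addr]; },
              "increment read value", [this] { ++data; });
        cycle("i/o", nullptr);
        cycle("write [dp]", [this, addr] { ram[addr] = data; });
    }

    // cli: p.i is only cleared while the next opcode is being fetched.
    void cli() {
        cycle("read opcode", nullptr);
        cycle("i/o", nullptr, "p.i = 0", [this] { irq_disable = false; });
    }
};
```

Calling inc_dp(0x12), then cli(), then one more cycle("read next opcode", nullptr) produces a trace with the same shape as the pipelined listing above, and p.i stays set until that final fetch. Whether this is efficient enough for a real emulator is a separate question; a hand-written state machine would avoid the std::function overhead.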