  f-cpu/c/scheduler/readme.txt
  info / personal notes about the f-cpu scheduler.
  Jaap Stolk (JWS) jwstolk@yahoo.com
  version:
Sun Jul 21 21:31:39 CEST 2002 JWS: created.


a normal instruction looks like this:

 - fetch
 - decode      +   register-read
 - x-bar-read  +     scheduler
 - EU
 - x-bar-write
 - register-write

this all works fine until there is a pipeline stall:

 - fetch
 - decode      +   register-read
 - x-bar-read  +     scheduler-> stall detected !
 - EU
 - x-bar-write
 - register-write

in this case the following happens:
- the fetch  is stalled
- the decode is stalled
- the register read is stalled

- the register-read can be stalled: if it ends up holding an old
  register value, that value will be bypassed anyway.

- x-bar-read  must be re-run every cycle, because if
  a bypass is needed, the xbar needs to get its data
  from somewhere else.

- the scheduler must be re-run every cycle, to work out
  if the pipeline is still stalled.

- the check whether a bypass is needed must be done BEFORE
  the xbar stage (in the decode stage), but it can't
  be done in the decode unit, as that unit is stalled !!
  (that's the mistake i made)
  --> the bypass check must be done by the scheduler,
  it must look one cycle in advance, i.e. 
               

- the register-write is controlled by the scheduler, and
  may never stall !!, so the scheduler can never be stalled.

- the execution units cannot be stalled, but i implemented
  an x-bar-stage for every EU, and that stage must be stalled!
  (it basically serves as a 1 cycle buffer to fill the time
  between the decoder and the EU stage; in the hardware it
  is also used for fanout of the control signals, and to move
  the control signals across the chip)
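
the per-EU x-bar stage described above behaves like a 1 cycle holding
register. a minimal sketch in C, assuming a made-up control word
(xbar_ctrl, xbar_stage and xbar_stage_clock are hypothetical names, not
taken from the real sources). what the EU itself receives while the
stage is held (a bubble ?) is not worked out in these notes and is left
out here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical control word passed from the decoder towards one EU. */
typedef struct {
    uint32_t opcode;  /* illustrative field, not the real f-cpu encoding */
    bool     valid;   /* false = nothing to execute */
} xbar_ctrl;

/* One-cycle buffer sitting in front of an EU. */
typedef struct {
    xbar_ctrl held;
} xbar_stage;

/* Advance the stage by one clock.
 * stalled     : keep the word already held, ignore the decoder output.
 * not stalled : latch the decoder output.
 * Returns the word held in the stage after this clock. */
static xbar_ctrl xbar_stage_clock(xbar_stage *s, xbar_ctrl from_decoder,
                                  bool stalled)
{
    if (!stalled) {
        s->held = from_decoder;
    }
    return s->held;
}
```

calling xbar_stage_clock() with stalled set keeps presenting the same
control word every cycle until the stall is released.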


after fetch we can already determine if the 3 read registers are free or need to be bypassed.
we can only detect a write bus stall after the decoder.

concl:

check bypass in scheduler unit !!!
--> however this will result in very confusing "register nr's" and "next register nr's" !!
better to make a separate unit :
 RBD register bypass detection (or something similar.)

i saved it in the scheduler directory for now.
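
a rough sketch of what such an RBD check could look like in C. this is
not the saved code: rbd_bypass_mask, RBD_NUM_SRC and the pending_writes
bitmask are assumptions made for illustration, and the sketch assumes
at most 64 registers so that one 64 bit word can mark every register
that still has a write in flight.

```c
#include <stdint.h>

#define RBD_NUM_SRC 3  /* the notes mention 3 read registers */

/* Returns a 3-bit mask: bit i is set when source register src[i]
 * still has a write in flight and therefore needs a bypass.
 * pending_writes: bit r set = register r has a result that has not
 * been written back yet (assumption: no more than 64 registers). */
static unsigned rbd_bypass_mask(const uint8_t src[RBD_NUM_SRC],
                                uint64_t pending_writes)
{
    unsigned mask = 0;
    for (int i = 0; i < RBD_NUM_SRC; i++) {
        if ((pending_writes >> (src[i] & 63u)) & 1u) {
            mask |= 1u << i;
        }
    }
    return mask;
}
```

the scheduler (or the separate unit) would feed this the register nr's
of the NEXT instruction, which is the "look one cycle in advance" part.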

how can we maximize speed for the simulator ? (don't run if stalled !)
how will this be done in hardware ?, we can't separate the register read and write units !!
concl: check for a stall inside part of the unit ??
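
for the simulator side, the "don't run if stalled" idea could be
sketched as below, assuming the stall rules listed earlier in these
notes (fetch, decode and register-read are held; everything else is
evaluated every cycle). the enum and both function names are
hypothetical, and this does not answer the hardware question.

```c
#include <stdbool.h>

/* Pipeline units as named in these notes; the enum is hypothetical. */
enum unit {
    U_FETCH, U_DECODE, U_REG_READ, U_XBAR_READ,
    U_SCHEDULER, U_EU, U_XBAR_WRITE, U_REG_WRITE, U_COUNT
};

/* true if the simulator should evaluate this unit in the current cycle */
static bool unit_runs(enum unit u, bool stalled)
{
    if (!stalled) {
        return true;
    }
    switch (u) {
    case U_FETCH:
    case U_DECODE:
    case U_REG_READ:
        return false;  /* held while the pipeline is stalled */
    default:
        return true;   /* must be re-run every cycle */
    }
}

/* number of units the simulator evaluates in one cycle */
static int units_evaluated(bool stalled)
{
    int n = 0;
    for (int u = 0; u < U_COUNT; u++) {
        if (unit_runs((enum unit)u, stalled)) {
            n++;
        }
    }
    return n;
}
```

with these rules a stalled cycle evaluates 5 of the 8 units instead of
all 8.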

if we can correctly stall the units, we could easily remove the copying stage !?



bypass if:

the register is needed ?
if it is flagged in the fetcher and we do a bypass, we could destroy things like imm !!!
















