Bit-Slice Design: Controllers and ALUs

Simple Controller continued

Last Edit April 3, 1997; May 1, 1999; July 9, 2001

Improved Architecture

The CCU is improved by placing the next-address MUX so that it feeds the PROM directly, which avoids the counter set-up time. The counter then becomes one of three inputs to the next-address MUX. The condition select MUX must be replaced by equivalent logic that generates the two select signals for the new next-address MUX.

The counter has moved to a position where it cannot receive a proper input. It must be replaced with a register, called the microprogram counter (µPC). The incrementer takes its input from the PROM address input and sends its output to the µPC, so it always contains the address being fetched plus 1. The outputs of the incrementer are gated into the µPC on the rising edge of the clock. The resulting configuration is shown in Figure 2-22.

Figure 2-22 Completed Elementary System

No Branch

The timing diagram for Figure 2-22 for no-branch execution is shown in Figure 2-23. What exists now is a three-level pipeline. During the first microcycle, the memory is fetching microinstruction i. The address of microinstruction i is in the µPC. The incrementer is one instruction ahead, with the address of microinstruction i + 1. The pipeline register contains microinstruction i - 1, which is in execution. The ACC contains the results of microinstruction i - 2. The execution proceeds as in earlier diagrams.

Figure 2-23 Microcycle Timing for the System in Figure 2-22, no branch

(each column = one µcycle; i denotes a microinstruction number)

CLOCK               cycle 1         cycle 2         cycle 3         cycle 4         cycle 5
Incrementer         ADR i+1         ADR i+2         ADR i+3         ADR i+4         ADR i+5
µPC Register        ADR i           ADR i+1         ADR i+2         ADR i+3         ADR i+4
Memory              FETCH i         FETCH i+1       FETCH i+2       FETCH i+3       FETCH i+4
Pipeline Register   µ-inst i-1      µ-inst i        µ-inst i+1      µ-inst i+2      µ-inst i+3
ALU                 EXECUTE i-1     EXECUTE i       EXECUTE i+1     EXECUTE i+2     EXECUTE i+3
Accumulator         Result of i-2   Result of i-1   Result of i     Result of i+1   Result of i+2
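The stage offsets in the no-branch timing of Figure 2-23 can be restated as a small function. This is only an illustrative sketch: the function name `pipeline_snapshot`, the dictionary keys, and the example value i = 10 are assumptions made here, not part of the original design.

```python
def pipeline_snapshot(cycle, i=0):
    """Return the microinstruction number held by each stage during
    microcycle `cycle` (1-based), assuming no branch is taken."""
    k = cycle - 1  # number of clock edges since the first microcycle
    return {
        "incrementer": i + k + 1,        # address being fetched plus 1
        "upc": i + k,                    # address of the instruction being fetched
        "memory_fetch": i + k,           # instruction being read from the PROM
        "pipeline_register": i + k - 1,  # instruction in execution
        "alu_execute": i + k - 1,        # same instruction, executing in the ALU
        "accumulator": i + k - 2,        # result of the previous instruction
    }


if __name__ == "__main__":
    for cycle in range(1, 6):
        print(cycle, pipeline_snapshot(cycle, i=10))
```

Each stage lags the one before it by a fixed offset, which is what makes this a three-level pipeline: fetch, execute, and result are all in flight for different microinstructions during the same microcycle.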

Improved Branching

The difference between the designs appears in the activity that occurs when a branch is executed, as shown in Figure 2-24.

Figure 2-24 Microcycle Timing for the System in Figure 2-22, Branch on Result of Microinstruction I (µ-INST. i)

(each column = one µcycle; i and b denote microinstruction numbers, b being the branch target)

CLOCK               cycle 1         cycle 2         cycle 3         cycle 4         cycle 5
Incrementer         ADR i+1         ADR i+2         ADR b+1         ADR b+2         ADR b+3
µPC Register        ADR i           ADR i+1         ADR i+2         ADR b+1         ADR b+2
Memory              FETCH i         FETCH i+1       FETCH b         FETCH b+1       FETCH b+2
Pipeline Register   µ-inst i-1      µ-inst i        µ-inst i+1      µ-inst b        µ-inst b+1
ALU                 EXECUTE i-1     EXECUTE i       EXECUTE i+1 *   EXECUTE b       EXECUTE b+1
Accumulator         Result of i-2   Result of i-1   Result of i     Result of i+1   Result of b

* µ-inst i+1 is the conditional branch (Cond. BRANCH).

On the second clock, the memory is fetching microinstruction i + 1, and the address of microinstruction i + 1 is in the µPC. The address of microinstruction i + 2 is in the incrementer. Microinstruction i is in the pipeline register and is being executed by the ALU. The result of microinstruction i - 1 is in the ACC.

On the next clock, microinstruction i + 1 is loaded into the pipeline register. This is the conditional branch. The pipeline register outputs the controls to the condition select logic, which switches the MUX to pass the branch address. At the instant that the clock edge comes up, the µPC is loaded with the address of microinstruction i + 2. As soon as the µPC outputs are available, address i + 2 will be sent to memory if the MUX has not yet switched.

The read access time of the PROM is greater than the propagation delay of the path through the pipeline register, condition select logic, and next-address MUX. It is also greater than the µPC register setup time plus the propagation delay of its output through the next-address MUX. Any fluttering of the address inputs at the start of the branch-address fetch is therefore irrelevant, since the memory output is not sensed until the next clock. As a result, the branch address is fetched during the third cycle.

The incrementer now contains the address following the branch address. On the next clock, execution proceeds with no flushing of the pipe and with no extraordinary idle times. This is the desired CCU design.
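The branch behavior described above can be checked with a minimal register-transfer sketch. Everything named here is hypothetical: the function `simulate`, the `branch_to` field, the `condition` flag standing in for the condition select logic, and the example addresses (i = 10, b = 40). The model ignores gate delays and assumes the next-address MUX has settled before the PROM output is sensed.

```python
def simulate(prom, start, cycles, condition=True):
    """Trace the uPC, fetch address, incrementer, and executing instruction.

    prom maps an address to a microinstruction dict; a conditional branch
    carries a 'branch_to' field (an assumed encoding, for illustration only).
    """
    upc = start        # uPC register: sequential address candidate
    pipeline = None    # address of the microinstruction in execution
    trace = []
    for _ in range(cycles):
        executing = prom.get(pipeline, {})
        target = executing.get("branch_to")
        # Next-address MUX: pass the branch address if the instruction in
        # the pipeline register is a branch and the condition holds;
        # otherwise pass the uPC.
        fetch_addr = target if (target is not None and condition) else upc
        incrementer = fetch_addr + 1  # always the fetched address plus 1
        trace.append({"upc": upc, "fetch": fetch_addr,
                      "incrementer": incrementer, "execute": pipeline})
        # Rising clock edge: fetched instruction enters the pipeline
        # register and the incrementer output is gated into the uPC.
        pipeline = fetch_addr
        upc = incrementer
    return trace


if __name__ == "__main__":
    prom = {11: {"branch_to": 40}}  # instruction i+1 branches to b = 40
    for row in simulate(prom, start=10, cycles=5):
        print(row)
```

With the branch taken, the fetch sequence runs i, i+1, b, b+1, b+2 with no idle cycle; the µPC briefly holds i+2 in the third cycle, but that address is never used. Passing `condition=False` yields the plain sequential fetch order.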

The cycle time is now: [all times are maximum worst-case]

C_P = t_{pipeline clock to output} + t_{propagate cond. logic} + t_{propagate next-address MUX} + t_{PROM read access} + t_{register setup (pipeline)}

Or,

C_P = t_{setup µPC} + t_{propagate next-address MUX} + t_{PROM read access} + t_{register setup (pipeline)}

Or,

C_P = t_{pipeline clock to output} + t_{ALU execution} + t_{register setup (ACC, Status)}

whichever is longer (whichever is the critical path).
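As a worked illustration of choosing the critical path, the three sums can be evaluated and compared. The delay values below are invented placeholders in nanoseconds, not data-sheet figures, and the names `cycle_time` and the dictionary keys are assumptions introduced for this sketch.

```python
def cycle_time(t):
    """Return (name, delay) of the longest of the three candidate paths."""
    paths = {
        # pipeline register -> condition logic -> next-address MUX -> PROM
        "branch": (t["pipeline_clk_to_out"] + t["cond_logic"]
                   + t["next_addr_mux"] + t["prom_access"]
                   + t["pipeline_setup"]),
        # uPC -> next-address MUX -> PROM
        "sequential": (t["upc_setup"] + t["next_addr_mux"]
                       + t["prom_access"] + t["pipeline_setup"]),
        # pipeline register -> ALU -> ACC and status registers
        "alu": (t["pipeline_clk_to_out"] + t["alu_execute"]
                + t["acc_setup"]),
    }
    critical = max(paths, key=paths.get)
    return critical, paths[critical]


# Hypothetical worst-case delays in ns, chosen only to exercise the formulas.
delays = {
    "pipeline_clk_to_out": 10,
    "cond_logic": 15,
    "next_addr_mux": 12,
    "prom_access": 45,
    "pipeline_setup": 5,
    "upc_setup": 5,
    "alu_execute": 60,
    "acc_setup": 8,
}

if __name__ == "__main__":
    print(cycle_time(delays))
```

With these placeholder numbers the branch path sums to 87 ns, the sequential path to 67 ns, and the ALU path to 78 ns, so the branch path would set the minimum clock period. With a slower ALU or a faster PROM the ALU path could dominate instead.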

by Donnamaie E. White

Copyright © September 1996, 1999, 2001, 2002 Donnamaie E. White White Enterprises