Why does pipelining increase latency?

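Multi-cycle datapaths break each instruction into separate steps, and each step takes a single clock cycle. A functional unit can be used more than once by the same instruction, as long as the uses fall in different clock cycles. This reduces the amount of hardware needed; for example, a single memory could serve both instructions and data. It also reduces the average instruction time, because the clock cycle can be much shorter. Ideally, all steps should be exactly the same length.

Pipeline speedup: the ideal speedup from a pipeline is equal to the number of stages in the pipeline. In practice the stages are not perfectly balanced and the registers between stages add overhead, so the clock is set by the slowest stage. The real speedup is therefore smaller than the ideal, and the latency of an individual instruction is slightly longer than on the unpipelined machine.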
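The relationship between stage lengths, speedup, and per-instruction latency can be checked with a small calculation. This is only a sketch under stated assumptions: the stage delays and the latch overhead are made-up numbers, the unpipelined machine is assumed to finish an instruction in the sum of the stage delays, and the pipeline is assumed to run in steady state with no stalls. The function name `pipeline_timing` is hypothetical.

```python
def pipeline_timing(stage_delays_ns, latch_overhead_ns=0.0):
    """Return (unpipelined latency, cycle time, pipelined latency, speedup).

    Assumes steady state with no stalls: one instruction completes per cycle.
    """
    n_stages = len(stage_delays_ns)
    # Without pipelining, an instruction takes the sum of all stage delays.
    unpipelined_latency = sum(stage_delays_ns)
    # The pipelined clock must fit the slowest stage plus register overhead.
    cycle_time = max(stage_delays_ns) + latch_overhead_ns
    # Every instruction still passes through every stage, one cycle each.
    pipelined_latency = n_stages * cycle_time
    # Throughput gain: time per instruction before vs. after pipelining.
    speedup = unpipelined_latency / cycle_time
    return unpipelined_latency, cycle_time, pipelined_latency, speedup


# Illustrative numbers only (assumptions, not measurements).
balanced = pipeline_timing([10, 10, 10, 10, 10])
unbalanced = pipeline_timing([10, 8, 10, 10, 7], latch_overhead_ns=1)

print("balanced:  ", balanced)
print("unbalanced:", unbalanced)
```

With perfectly balanced stages and no overhead, the speedup equals the number of stages and the latency is unchanged. With unbalanced stages and latch overhead, the speedup drops below the stage count and the pipelined latency exceeds the unpipelined one.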

For branches, compare and calculate the branch destination. For taken branches, update the program counter.

Writeback (WB): Write the result to the register file. For stores and branches, do nothing.

Simple DLX operation without pipelining

[Figure: datapath for the unpipelined version]

The temporary storage locations were added to the datapath of the unpipelined machine to make it easy to pipeline.

Note that branch and store instructions take 4 clock cycles. This implementation is not optimal. Other improvements to CPI are possible but are likely to increase the clock cycle time.
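The effect of the shorter branch and store instructions on CPI can be sketched as a weighted average. The text only states that branches and stores take 4 cycles; the 5-cycle count for the remaining instruction classes and the instruction mix below are assumptions chosen for illustration, and `average_cpi` is a hypothetical helper name.

```python
def average_cpi(mix, cycles):
    """Weighted average of cycles per instruction over an instruction mix."""
    total = sum(mix.values())
    return sum(mix[kind] * cycles[kind] for kind in mix) / total


# Cycles per instruction class. Branch and store = 4 comes from the text;
# 5 cycles for the other classes is an assumption.
cycles = {"alu": 5, "load": 5, "store": 4, "branch": 4}

# Hypothetical instruction mix (fractions of executed instructions).
mix = {"alu": 0.40, "load": 0.25, "store": 0.15, "branch": 0.20}

cpi = average_cpi(mix, cycles)
print(f"average CPI = {cpi:.2f}")
```

Any further reduction in the cycle count of a class lowers this average, but as noted above it may lengthen the clock cycle, so CPI alone does not determine execution time.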

Latency is given as multiples of the cycle time. If the latency of a particular instruction is one cycle, its result is available for a subsequent RAW-dependent instruction in the next cycle. In this case, a RAW-dependent instruction can be processed without any delay.

If the latency is more than one cycle, say n cycles, an immediately following RAW-dependent instruction has to be stalled in the pipeline for n-1 cycles. There are two kinds of RAW dependency, define-use dependency and load-use dependency, and two corresponding kinds of latency, known as define-use latency and load-use latency. The define-use latency of an instruction is the time delay, after decoding and issue, until the result of an operate instruction becomes available in the pipeline for subsequent RAW-dependent instructions.

If the define-use latency is one cycle, an immediately following RAW-dependent instruction can be processed without any delay in the pipeline. The define-use delay of an instruction is the time a subsequent RAW-dependent instruction has to be stalled in the pipeline. The define-use delay is one cycle less than the define-use latency.
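The rule that the delay is one cycle less than the latency can be written as a small function. This is a minimal sketch: the name `raw_stall_cycles` is hypothetical, and the `distance` parameter is an added assumption that the pipeline issues one instruction per cycle, so each independent instruction placed between producer and consumer hides one cycle of the delay.

```python
def raw_stall_cycles(latency_cycles, distance=1):
    """Stall cycles seen by a RAW-dependent instruction.

    latency_cycles: define-use (or load-use) latency of the producer.
    distance: how many instructions after the producer the consumer is issued
              (1 means immediately following). Assumes one issue per cycle.
    """
    if latency_cycles < 1 or distance < 1:
        raise ValueError("latency and distance must be at least 1")
    # An immediately following consumer waits latency - 1 cycles;
    # each extra instruction in between removes one stall cycle.
    return max(0, latency_cycles - distance)


print(raw_stall_cycles(1))              # latency of one cycle: no delay
print(raw_stall_cycles(3))              # immediately following consumer
print(raw_stall_cycles(3, distance=3))  # two independent instructions between
```

For an immediately following consumer (`distance=1`) the result is exactly the define-use delay described above.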


