When processor designers consider a possible improvement to the processor datapath, the decision usually depends on the cost/performance trade-off. In the following three proble…

Question

asked Mar 6, 2024 50.4k views

When processor designers consider a possible improvement to the processor datapath, the decision usually depends on the cost/performance trade-off. In the following three problems, assume that we are starting with a datapath from Figure 4.2, where I-Mem, Add, Mux, ALU, Regs, D-Mem, and Control blocks have latencies of 400 ps, 100 ps, 30 ps, 120 ps, 200 ps, 350 ps, and 100 ps, respectively, and costs of 1000, 30, 10, 100, 200, 2000, and 500, respectively. Consider the addition of a multiplier to the ALU. This addition will add 300 ps to the latency of the ALU and will add a cost of 600 to the ALU. The result will be 5% fewer instructions executed since we will no longer need to emulate the MUL instruction.

a) What is the clock cycle time with and without this improvement?

b) What is the speedup achieved by adding this improvement?

c) Compare the cost/performance ratio with and without this improvement.

IJungleBoy asked

1 Answer

Ilam Engl · Answer 1 · 2024-03-12T22:34:24+0000

Final answer:

The clock cycle time is 1130 ps without the improvement and 1430 ps with it. The resulting speedup is approximately 0.83, which is a slowdown. Assuming Figure 4.2 contains two adders and three multiplexers, the cost rises from 3890 to 4490 (about 1.15 times), so the cost/performance ratio with the improvement is about 1.39 times that of the original design, which is worse.

Step-by-step explanation:

a) Figure 4.2 is a single-cycle datapath, so the clock cycle must be long enough for the slowest instruction to pass through every block on its path. The cycle time is therefore the sum of the latencies along the critical path, not the latency of the slowest single block. With the given latencies, the critical path belongs to the load instruction: I-Mem (fetch the instruction), Regs (read the base register, which takes longer than Control operating in parallel), Mux (select the ALU input), ALU (compute the address), D-Mem (read the data), and Mux (select the value to write back to Regs). Without the improvement, the cycle time is 400 + 200 + 30 + 120 + 350 + 30 = 1130 ps. The ALU lies on this path, so adding 300 ps to its latency (120 ps + 300 ps = 420 ps) gives a cycle time of 1130 + 300 = 1430 ps with the improvement.

b) Execution time is (instruction count) * (cycles per instruction) * (clock cycle time), and every instruction takes one cycle in a single-cycle design. The speedup must therefore account for both the 5% reduction in instruction count and the longer cycle: Speedup = (time without improvement) / (time with improvement) = (1 * 1130 ps) / (0.95 * 1430 ps) = 1130 / 1358.5 = 0.83 approximately. A speedup below 1 means the processor with the multiplier is slower overall: the 5% saving in instructions does not make up for a cycle that is about 27% longer.

c) The cost of the datapath is the sum of the costs of all its blocks, counting each block as many times as it appears. Assuming Figure 4.2 contains two adders and three multiplexers, the cost without the improvement is 1000 + 2 * 30 + 3 * 10 + 100 + 200 + 2000 + 500 = 3890. The multiplier adds 600 to the ALU, so the cost with the improvement is 3890 + 600 = 4490, a relative cost of 4490 / 3890 = 1.15 approximately. Performance is the inverse of execution time, so the relative performance equals the speedup from part b, 0.83. The relative cost/performance ratio is then 1.15 / 0.83 = 1.39 approximately. The improved design costs about 15% more and delivers about 17% less performance, so under these assumptions adding the multiplier is not worthwhile.
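
The arithmetic for parts a) to c) can be cross-checked with a short script. This is only a sketch and not part of the original answer. It assumes a single-cycle datapath whose cycle time is the sum of block latencies along the load instruction's path, and it assumes Figure 4.2 contains two adders and three multiplexers; the figure is not reproduced here, so both are assumptions. The names LOAD_PATH, BLOCK_COUNT, cycle_time_ps, total_cost, and speedup are made up for illustration.

```python
# Block latencies (ps) and costs, as given in the question.
LATENCY_PS = {"I-Mem": 400, "Add": 100, "Mux": 30, "ALU": 120,
              "Regs": 200, "D-Mem": 350, "Control": 100}
COST = {"I-Mem": 1000, "Add": 30, "Mux": 10, "ALU": 100,
        "Regs": 200, "D-Mem": 2000, "Control": 500}

# Assumption: the critical path is the load instruction's path.
LOAD_PATH = ["I-Mem", "Regs", "Mux", "ALU", "D-Mem", "Mux"]

# Assumption: how many copies of each block Figure 4.2 contains.
BLOCK_COUNT = {"I-Mem": 1, "Add": 2, "Mux": 3, "ALU": 1,
               "Regs": 1, "D-Mem": 1, "Control": 1}


def cycle_time_ps(extra_alu_ps=0):
    """Sum the latencies along the load path; the ALU appears once on it."""
    return sum(LATENCY_PS[block] for block in LOAD_PATH) + extra_alu_ps


def total_cost(extra_alu_cost=0):
    """Sum block costs weighted by how often each block appears."""
    return sum(COST[b] * BLOCK_COUNT[b] for b in COST) + extra_alu_cost


def speedup(old_cycle, new_cycle, instruction_ratio):
    """Old time / new time, with CPI = 1 and a scaled instruction count."""
    return old_cycle / (instruction_ratio * new_cycle)


base_cycle = cycle_time_ps()          # without the multiplier
new_cycle = cycle_time_ps(300)        # multiplier adds 300 ps to the ALU
s = speedup(base_cycle, new_cycle, 0.95)

base_cost = total_cost()
new_cost = total_cost(600)            # multiplier adds 600 to the ALU cost
relative_cost = new_cost / base_cost
relative_cost_perf = relative_cost / s

print(f"cycle time: {base_cycle} ps -> {new_cycle} ps")
print(f"speedup: {s:.2f}")
print(f"cost: {base_cost} -> {new_cost} (x{relative_cost:.2f})")
print(f"relative cost/performance: {relative_cost_perf:.2f}")
```

Changing LOAD_PATH or BLOCK_COUNT shows how sensitive the result is to the assumed path and block counts. The speedup stays below 1 as long as the ALU is on the critical path and that path is shorter than 5700 ps, because a fixed 300 ps then lengthens the cycle by more than the 5% saved in instruction count.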
