Consider two different implementations of the RISC-V instruction set architecture. P1 has a clock rate of 4.0 GHz and CPIs of 1, 5, and 4 for ALU, load/store and branch instructions. P2 has a clock rate of 3.6 GHz and CPIs of 2, 3, and 3 for the three classes of instructions. Given a program with a dynamic instruction count of 0E6 instructions divided into classes as follows: 30% ALU class, 40% load/store class, 30% branch class.

a) (5%) What is the global CPI for each implementation? Which implementation is faster?
b) (5%) For P1, suppose we improve the CPU design so that load/store instructions have a CPI of only 1. The CPIs of the other two classes stay the same, at 1 and 4 respectively. What must the new clock rate of P1 be if we want to improve the performance of the program by 100% (half of the execution time of P1 before the improvement)?

1 Answer

Final answer:

Using the provided instruction distribution and CPI for each class, P2 has a lower global CPI of 2.7 compared to P1's 3.5 and is faster (about 1.17 times). If P1's load/store CPI is improved to 1, the new global CPI becomes 1.9, and to halve the execution time the clock rate must rise from 4.0 GHz to about 4.34 GHz.

Step-by-step explanation:

To calculate the global CPI (Cycles Per Instruction) for each implementation, we can use the weight of each instruction class and multiply by its respective CPI, then sum up these values. For P1, the global CPI is:

  • (1 * 0.30) + (5 * 0.40) + (4 * 0.30) = 0.3 + 2 + 1.2 = 3.5

For P2, it is:

  • (2 * 0.30) + (3 * 0.40) + (3 * 0.30) = 0.6 + 1.2 + 0.9 = 2.7
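The weighted sums above can be checked with a few lines of Python. This is a minimal sketch; the function name `global_cpi` and the dictionary keys are illustrative choices, not part of the original problem.

```python
def global_cpi(mix, cpi):
    """Weighted-average CPI: sum over classes of (instruction fraction * class CPI)."""
    return sum(mix[name] * cpi[name] for name in mix)


# Instruction mix and per-class CPIs taken from the question.
MIX = {"alu": 0.30, "load_store": 0.40, "branch": 0.30}
P1_CPI = {"alu": 1, "load_store": 5, "branch": 4}
P2_CPI = {"alu": 2, "load_store": 3, "branch": 3}

cpi_p1 = global_cpi(MIX, P1_CPI)
cpi_p2 = global_cpi(MIX, P2_CPI)

print(f"P1 global CPI: {cpi_p1:.2f}")
print(f"P2 global CPI: {cpi_p2:.2f}")
```

The fractions in `MIX` must sum to 1 for the result to be a true average.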

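To determine which implementation is faster, we can calculate the execution time using the formula Execution Time = (Instruction count * CPI) / Clock rate. Since both processors run the same program, the instruction count is the same and only CPI / Clock rate matters. For P1 this is 3.5 / 4.0 GHz = 0.875 ns per instruction, and for P2 it is 2.7 / 3.6 GHz = 0.75 ns per instruction. P2 is therefore faster, by a factor of 0.875 / 0.75 ≈ 1.17, because its lower global CPI outweighs its slightly lower clock rate.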
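The execution-time formula can be sketched in Python as follows. The instruction count of 1.0e6 is an assumed placeholder used only to get concrete numbers; it cancels out of the comparison. The function name `execution_time` is likewise just illustrative.

```python
def execution_time(instruction_count, cpi, clock_hz):
    """Execution time in seconds = (instruction count * CPI) / clock rate."""
    return instruction_count * cpi / clock_hz


N = 1.0e6  # assumed instruction count; the P1/P2 ratio does not depend on it

t_p1 = execution_time(N, cpi=3.5, clock_hz=4.0e9)
t_p2 = execution_time(N, cpi=2.7, clock_hz=3.6e9)
speedup_p2_over_p1 = t_p1 / t_p2

print(f"P1: {t_p1 * 1e6:.1f} us")
print(f"P2: {t_p2 * 1e6:.1f} us")
print(f"P2 is {speedup_p2_over_p1:.3f}x faster than P1")
```

Changing `N` scales both times equally, so the speedup stays the same.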

For the second part, if the load/store CPI for P1 drops to 1, the new global CPI for P1 is:

  • (1 * 0.30) + (1 * 0.40) + (4 * 0.30) = 0.3 + 0.4 + 1.2 = 1.9

To achieve a 100% performance improvement, the new execution time must be half of the old execution time. Writing T for execution time, we set the new time equal to half the old time and solve for the new clock rate:

  • Old T = (Instruction count * Old CPI) / Old clock rate
  • New T = (Instruction count * New CPI) / New clock rate = Old T / 2
  • New clock rate = 2 * (New CPI / Old CPI) * Old clock rate
  • New clock rate = 2 * (1.9 / 3.5) * 4.0 GHz ≈ 4.34 GHz

The instruction count cancels, so the result does not depend on the length of the program. The CPI improvement alone gives a speedup of 3.5 / 1.9 ≈ 1.84, so only a modest clock increase (about 8.6%) is needed to reach 2x.
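Requiring the new execution time to be half the old one gives New clock rate = 2 * (New CPI / Old CPI) * Old clock rate, since the instruction count cancels. A minimal Python sketch of that calculation follows; the function name `required_clock` and its `speedup` parameter are illustrative assumptions.

```python
def required_clock(old_clock_hz, old_cpi, new_cpi, speedup):
    """Clock rate needed so that new execution time = old execution time / speedup.

    Derived from (N * new_cpi) / new_clock = (N * old_cpi) / old_clock / speedup,
    where the instruction count N cancels.
    """
    return speedup * (new_cpi / old_cpi) * old_clock_hz


# Values from the question: P1 at 4.0 GHz, global CPI 3.5 before and 1.9 after.
new_clock = required_clock(old_clock_hz=4.0e9, old_cpi=3.5, new_cpi=1.9, speedup=2.0)

print(f"Required P1 clock rate: {new_clock / 1e9:.2f} GHz")
```

With `speedup=2.0` this reproduces the "100% improvement" target from part b; other targets can be explored by changing that argument.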
by User Mellanie (7.7k points)