You are designing a PMD and optimizing it for low energy. The core, including an 8 KB L1 data cache, consumes 1 W whenever it is not in hibernation. If the core has a perfect L1 cache hit rate, it achieves an average CPI of 1 for a given task, that is, 1000 cycles to execute 1000 instructions. Each additional cycle accessing the L2 and beyond adds a stall cycle for the core. Based on the following specifications, what is the size of L2 cache that achieves the lowest energy for the PMD (core, L1, L2, memory) for that given task?

a. The core frequency is 1 GHz, and the L1 has an MPKI of 100.
b. A 256 KB L2 has a latency of 10 cycles, an MPKI of 20, a background power of 0.2 W, and each L2 access consumes 0.5 nJ.
c. A 1 MB L2 has a latency of 20 cycles, an MPKI of 10, a background power of 0.8 W, and each L2 access consumes 0.7 nJ.
d. The memory system has an average latency of 100 cycles, a background power of 0.5 W, and each memory access consumes 35 nJ.

User Carl Meyer
by
8.5k points

1 Answer

Final answer:

Under the usual reading of the problem (each L1 miss pays the L2 latency, and each L2 miss additionally pays the memory latency), the 256 KB L2 achieves the lowest energy: about 7.55 µJ per 1000 instructions, versus about 9.62 µJ for the 1 MB L2. Both options take the same 4000 cycles per 1000 instructions, so the 1 MB cache's higher background power costs more than its lower miss rate saves in memory access energy.

Step-by-step explanation:

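The goal is to find the L2 size that minimizes total energy for the personal mobile device (PMD), counting the core, L1, L2, and memory. Total energy has two parts: background power multiplied by the run time of the task, and per-access energy multiplied by the number of accesses. The core with its L1 draws 1 W for the whole run. The L2 and memory each add their own background power for the whole run, plus access energy for every L1 miss (an L2 access) and every L2 miss (a memory access). Misses also add stall cycles, which lengthen the run time and therefore increase the background energy of every component.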

Let's calculate the energy consumption with both L2 cache options provided:

  • For the 256 KB L2 cache, every L1 miss accesses the L2, so an L1 MPKI of 100 means 100 L2 accesses per 1000 instructions, each adding 10 stall cycles (1000 cycles). The L2 MPKI of 20 sends 20 of those accesses on to memory at 100 cycles each (2000 cycles), assuming the memory latency is paid on top of the L2 latency. With the 1000 base cycles, the task takes 4000 cycles, or 4 µs at 1 GHz. The L2 itself costs 0.2 W × 4 µs = 800 nJ of background energy plus 100 × 0.5 nJ = 50 nJ of access energy.
  • For the 1 MB L2 cache, the same 100 L2 accesses cost 20 cycles each (2000 cycles), and the L2 MPKI of 10 sends 10 accesses to memory (1000 cycles). The total is again 4000 cycles, or 4 µs. The L2 itself costs 0.8 W × 4 µs = 3200 nJ of background energy plus 100 × 0.7 nJ = 70 nJ of access energy.
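The cycle counts can be checked with a few lines of Python. This is only a sketch under one assumed stall model: every L1 miss pays the full L2 latency, every L2 miss pays the memory latency on top of that, and nothing overlaps with execution. The function name `cycles_per_kilo_instructions` is made up for illustration.

```python
def cycles_per_kilo_instructions(l1_mpki, l2_latency, l2_mpki, mem_latency):
    """Cycles to run 1000 instructions under the assumed additive stall model."""
    base_cycles = 1000                      # CPI of 1 with a perfect L1
    l2_stalls = l1_mpki * l2_latency        # every L1 miss accesses the L2
    mem_stalls = l2_mpki * mem_latency      # every L2 miss accesses memory
    return base_cycles + l2_stalls + mem_stalls


cycles_256k = cycles_per_kilo_instructions(100, 10, 20, 100)
cycles_1m = cycles_per_kilo_instructions(100, 20, 10, 100)
print(cycles_256k, cycles_1m)  # 4000 4000
```

At 1 GHz each cycle is 1 ns, so both configurations take 4 µs per 1000 instructions under this model.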

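For misses that reach memory, each access consumes 35 nJ, and the memory system draws 0.5 W of background power for the entire run, not only during accesses. The number of memory accesses per 1000 instructions equals the L2 MPKI: 20 with the 256 KB L2 (20 × 35 nJ = 700 nJ) and 10 with the 1 MB L2 (10 × 35 nJ = 350 nJ). Once the L2 and memory stall cycles are added to the 1000 base cycles, both configurations run for 4000 cycles (4 µs at 1 GHz), so the memory background energy is 0.5 W × 4 µs = 2000 nJ in either case.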

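Summing the contributions per 1000 instructions, with a run time of 4 µs in both cases: the core and L1 use 1 W × 4 µs = 4000 nJ. With the 256 KB L2, the total is 4000 nJ (core) + 850 nJ (L2) + 2700 nJ (memory) = 7550 nJ. With the 1 MB L2, it is 4000 nJ (core) + 3270 nJ (L2) + 2350 nJ (memory) = 9620 nJ. The 256 KB L2 therefore gives the lowest energy. The 1 MB cache halves the memory access energy, saving 350 nJ, but its background power costs an extra 2400 nJ over the same run time.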
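The comparison can be reproduced with a short Python sketch. It assumes every component draws its background power for the whole run and that stalls add up without overlap; the `L2Option` class and `energy_nj` function are names invented for this example, not part of the problem statement.

```python
from dataclasses import dataclass


@dataclass
class L2Option:
    name: str
    latency_cycles: int
    mpki: int            # L2 misses per 1000 instructions
    background_w: float
    access_nj: float


# Values from the problem statement.
CORE_W = 1.0
FREQ_HZ = 1e9
L1_MPKI = 100
MEM_LATENCY = 100
MEM_W = 0.5
MEM_ACCESS_NJ = 35.0


def energy_nj(opt: L2Option) -> float:
    """Total energy in nJ per 1000 instructions (assumed additive stall model)."""
    cycles = 1000 + L1_MPKI * opt.latency_cycles + opt.mpki * MEM_LATENCY
    seconds = cycles / FREQ_HZ
    # Background power of core, L2, and memory over the whole run time.
    background = (CORE_W + opt.background_w + MEM_W) * seconds * 1e9
    # Per-access energy: one L2 access per L1 miss, one memory access per L2 miss.
    dynamic = L1_MPKI * opt.access_nj + opt.mpki * MEM_ACCESS_NJ
    return background + dynamic


options = [
    L2Option("256 KB", 10, 20, 0.2, 0.5),
    L2Option("1 MB", 20, 10, 0.8, 0.7),
]
for opt in options:
    print(f"{opt.name}: {energy_nj(opt):.0f} nJ")

best = min(options, key=energy_nj)
print("Lowest energy:", best.name)
```

Changing the constants or adding another `L2Option` makes it easy to see how sensitive the result is to background power versus miss rate.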

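Note that the perfect L1 hit rate only defines the baseline CPI of 1. The actual L1 MPKI is 100, and L2 misses do go on to memory, so neither the L2 accesses nor the memory accesses can be ignored. The model is still a simplification: it assumes every miss stalls the core for the full latency with no overlap, and it uses constant background power and fixed per-access energy. A full system-level energy model would treat dynamic and leakage power in more detail.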

User Satya Attili
by
8.6k points