199k views
4 votes
Consider a memory system with a single-cycle cache and 200-cycle-latency DRAM, with the processor and DRAM both operating at 1 GHz. Assume that the processor has four multiply-add units and is capable of executing eight instructions in each cycle of 1 ns. Consider the problem of computing the dot product of two vectors on such a platform. A dot-product computation performs one multiply-add (2 FLOPs) on a single pair of vector elements, i.e., each floating-point operation requires one data fetch.

User Bobojam
by
7.9k points

1 Answer

6 votes

Answer:

To compute the dot product of two vectors on this platform, the processor must fetch every vector element from DRAM and perform one multiply-add per pair of elements. The running sum can stay in a register, so only the final scalar needs to be written back. The peak rate is set by the four multiply-add units: each completes one multiply-add (2 FLOPs) per cycle, so the peak is 4 × 2 = 8 FLOPs per cycle, or 8 GFLOPS at 1 GHz. The eight-instruction issue width does not multiply this figure; it only leaves room to issue the loads alongside the multiply-adds. The vectors are streamed with no reuse, so the single-cycle cache does not help (assuming a cache line holds one word), and every element costs a full 200-cycle DRAM access. If the processor also waits for each fetch before issuing the next, a pair of elements costs 2 × 200 = 400 cycles for 2 FLOPs, i.e., one floating-point operation every 200 ns, or about 5 MFLOPS, a small fraction of the peak. For vectors of N elements the computation therefore takes roughly 2N × 200 = 400N cycles (400N ns), so the run time is dominated by memory latency rather than by arithmetic.
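The arithmetic can be checked with a small back-of-the-envelope model. This is only a sketch under assumptions that are not stated in the question: one-word cache lines, no overlap between memory requests, and the accumulator kept in a register. The function name `dot_product_estimate` and its parameters are made up for illustration.

```python
def dot_product_estimate(n, clock_hz=1_000_000_000, fma_units=4,
                         dram_latency_cycles=200):
    """Rough latency-bound estimate for an n-element dot product.

    Assumptions (not from the question): one-word cache lines, no
    overlapping memory requests, accumulator held in a register.
    """
    # Peak rate: each multiply-add unit retires 2 FLOPs per cycle.
    peak_flops_per_s = fma_units * 2 * clock_hz

    # Each of the n element pairs needs 2 fetches and yields 2 FLOPs.
    fetches = 2 * n
    flops = 2 * n

    # With serialized fetches, memory latency sets the total cycle count.
    total_cycles = fetches * dram_latency_cycles

    # Sustained rate = FLOPs / elapsed time, where time = cycles / clock.
    sustained_flops_per_s = flops * clock_hz / total_cycles
    return peak_flops_per_s, sustained_flops_per_s, total_cycles


if __name__ == "__main__":
    peak, sustained, cycles = dot_product_estimate(1000)
    print(f"peak:      {peak / 1e9:.1f} GFLOPS")
    print(f"sustained: {sustained / 1e6:.1f} MFLOPS")
    print(f"cycles:    {cycles}")
```

Under these assumptions the sustained rate does not depend on N: it is fixed by the latency per fetch, which is why adding multiply-add units would not speed up this computation.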

Hope this helps!

User Chirag Satapara
by
8.1k points