57.7k views
5 votes
Programmers should be familiar with the Message Passing Interface (MPI) and know how to distinguish between distributed memory and message passing systems. The point-to-point communication time of a specific system can be found using the ping-pong method. One process, say P0, sends a message to another process, say P1. Immediately upon receiving the message, P1 sends a message back to P0. The time taken by this exchange is recorded at P0, then divided by two to obtain an estimate of the one-way communication time. Write a program in OpenMP or CUDA that explores the message passing interface and how a distributed memory system would also improve the ping-pong method. Refer to the "CST-550 Sample MPI Program," located within the Topic Resources. Measure the communication times. You can time a ping-pong program using the C clock function on your system. Then answer the following: How long does the code have to run before the clock function gives a nonzero run-time? How do the times you got with the clock function compare to the times taken with the OpenMP or CUDA timing API? Explain your answer.

1 Answer

4 votes

Final answer:

The ping-pong method is used in distributed memory systems to measure point-to-point communication time. Timing accuracy depends mainly on the resolution of the timer and on whether the timer counts CPU time or wall-clock time. Because the C clock function and the OpenMP or CUDA timing APIs differ on both points, they can report noticeably different values for the same run.

Step-by-step explanation:

Programmers using the Message Passing Interface (MPI) on distributed memory systems often use the ping-pong method to measure point-to-point communication time. One process (P0) sends a message to another (P1), which immediately sends a message back. P0 records the round-trip time and halves it to estimate the one-way communication time. A single round trip is usually shorter than the resolution of the timer, so the exchange is normally repeated many times inside the timed region and the total is divided by twice the number of round trips. The timer should also measure wall-clock (elapsed) time, because a process that is blocked waiting for a message uses little or no CPU time.

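How long the code must run before the C clock function returns a nonzero run-time depends on the resolution of the clock. clock reports processor time in units of 1/CLOCKS_PER_SEC seconds, but the value may be updated in coarser steps than that, and the step size is implementation-dependent. A run shorter than one step, which can include a single ping-pong exchange, is reported as zero, so the exchange has to be repeated until the total exceeds at least one step. The results also differ from those of the OpenMP or CUDA timing APIs because they measure different things. On typical POSIX systems clock counts CPU time used by the process, so time spent blocked waiting for a message may not be counted, and in a multithreaded OpenMP program the CPU time of all threads is added together. omp_get_wtime returns elapsed wall-clock time in seconds (its resolution is available from omp_get_wtick), and CUDA events report elapsed time in milliseconds through cudaEventElapsedTime. These specialized timers generally have finer resolution than clock and are the better fit for communication timing.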

by User Brown (7.1k points)