2 votes
Out-of-Gap-Average: The average of a finite set of numbers is defined as the sum of all
numbers in the set divided by the size of the set. Suppose that we are given a (large) set of
numbers on a file in HDFS, with one number per line. The objective is to determine the average
value of the numbers that are closer to the extreme values in the input, without considering the
impact of the numbers that are in some gap "in-between". Write the MR pseudo-code of an
efficient algorithm to compute the OGA (Out-of-Gap-Average) of a given set of input values –
i.e., the average of all the input values which are *not* inside a given interval
[gap_lower_bound, gap_upper_bound]

1 Answer

2 votes

Final answer:

The task is to write MapReduce pseudo-code that computes the Out-of-Gap-Average (OGA): the average of all input values that lie outside the interval [gap_lower_bound, gap_upper_bound]. The Map phase keeps only the numbers outside the gap, and the Reduce phase computes the average of those numbers. This high-level outline assumes the numbers are stored in a file on HDFS, one per line.

Step-by-step explanation:

The Out-of-Gap-Average (OGA) is the average of only those input values that fall outside a specified gap, i.e. values x with x < gap_lower_bound or x > gap_upper_bound. In the Map phase, each input number is checked against the gap boundaries and emitted only if it falls outside them. In the Reduce phase, the average is calculated by summing the emitted numbers and dividing that sum by their count. The assumptions are that the input set is large and is stored in a file on HDFS, with one number per line.
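The Map and Reduce phases described above can be sketched in Python as a small local simulation. This is only an illustration under stated assumptions, not a real Hadoop job: the function names (`oga_map`, `oga_combine`, `oga_reduce`, `run_oga`) and the single key `"oga"` are hypothetical, and each inner list passed to `run_oga` stands in for one HDFS input split. The combiner step is an addition to the description above. Because an average of averages is generally not the overall average, partial results are carried as (sum, count) pairs, so each mapper sends a single pair to the reducer.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

KEY = "oga"  # hypothetical single key, so all partial results meet in one reducer


def oga_map(line: str, lower: float, upper: float) -> Iterator[Tuple[str, Tuple[float, int]]]:
    """Map: emit (key, (value, 1)) only for values outside the closed gap."""
    text = line.strip()
    if not text:
        return  # skip blank lines
    x = float(text)
    if x < lower or x > upper:
        yield KEY, (x, 1)


def oga_combine(pairs: Iterable[Tuple[float, int]]) -> Tuple[float, int]:
    """Combine: add up (sum, count) pairs into a single (sum, count) pair."""
    total, count = 0.0, 0
    for s, c in pairs:
        total += s
        count += c
    return total, count


def oga_reduce(partials: Iterable[Tuple[float, int]]) -> Optional[float]:
    """Reduce: merge all partial pairs and divide once at the very end."""
    total, count = oga_combine(partials)
    return total / count if count else None  # None if every value is in the gap


def run_oga(splits: Iterable[Iterable[str]], lower: float, upper: float) -> Optional[float]:
    """Local stand-in for the job: map and combine per split, then reduce."""
    partials: List[Tuple[float, int]] = []
    for split in splits:
        mapped = [pair for line in split for _, pair in oga_map(line, lower, upper)]
        partials.append(oga_combine(mapped))
    return oga_reduce(partials)


if __name__ == "__main__":
    print(run_oga([["1", "5", "6"], ["10", "4"]], lower=4, upper=7))
```

For the inputs 1, 5, 6, 10, 4 with the gap [4, 7], only 1 and 10 pass the filter (4 sits on the boundary of the closed interval, so it counts as inside the gap), and the result is (1 + 10) / 2 = 5.5.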

Note that the actual implementation would need environment-specific details and optimizations, which this high-level pseudo-code does not capture.

by User Danbardo (7.3k points)