Answer:
The algorithm is very similar to that of counting inversions. The only change is that here we separate the counting of significant inversions from the merge-sort process.
Step-by-step explanation:
Algorithm:
Let A = (a1, a2, . . . , an).
Function CountSigInv(A[1...n])
if n = 1 return 0; //base case
Let L := A[1...floor(n/2)]; // Get the first half of A
Let R := A[floor(n/2)+1...n]; // Get the second half of A
//Recurse on L. Return B, the sorted L,
//and x, the number of significant inversions in $L$
Let B, x := CountSigInv(L);
Let C, y := CountSigInv(R); //Do the counting of significant split inversions
Let i := 1;
Let j := 1;
Let z := 0;
// to count the number of significant split inversions while(i <= length(B) and j <= length(C)) if(B[i] > 2*C[j]) z += length(B)-i+1; j += 1; else i += 1;
//the normal merge-sort process i := 1; j := 1;
//the sorted A to be output Let D[1...n] be an array of length n, and every entry is initialized with 0; for k = 1 to n if B[i] < C[j] D[k] = B[i]; i += 1; else D[k] = C[j]; j += 1; return D, (x + y + z);
Runtime Analysis: At each level, both the counting of significant split inversions and the normal merge-sort process take O(n) time, because we take a linear scan in both cases. Also, at each level, we break the problem into two subproblems and the size of each subproblem is n/2. Hence, the recurrence relation is T(n) = 2T(n/2) + O(n). So in total, the time complexity is O(n log n).
Correctness Proof: We will prove a stronger statement that on any input A our algorithm produces correctly the sorted version of A and the number of significant inversions in A. And, we will prove it by induction. Base Case: When n = 1, the algorithm returns 0, which is obviously correct.
Inductive Case: Suppose that the algorithm works correctly for 1, 2, . . . , n β 1. We have to show that the algorithm also works for n. By the induction hypothesis, B, x, C, y are correct. We need to show that the counting of significant split inversions and the normal merge-sort process are both correct. In other words, we are to show that z is the correct number of significant split inversions, and that D is correctly sorted.
First, we prove that z is correct. Due to the fact that B and C are both sorted by the induction hypothesis, for every C[j], where j = 1, . . . , length(C), we can always find the smallest i such that B[i] > 2C[j]. Therefore, B[1], B[2], . . . , B[i β 1] are all smaller than or equal to 2C[j], and B[i], B[i + 1], . . . , B[length(B)] are all greater than 2C[j]. Thus, length(B) β i + 1 is the correct significant split inversion count for pairs involving C[j]. After counting the number of correct significant split inversions for all C[j], where j = 1, . . . , length(C), z is the correct number of significant split inversions. Second, we show that D is the sorted version of A.
This is actually the proof of merge-sort. Each time, we append one element from B or C to D. At each iteration k, since B and C are both sorted by the induction hypothesis, min(B[i], C[j]), the one to be appended to D, is also the smallest number among all the numbers that have not been added to D yet.
As a result, every time we append the smallest remaining element (of B and C) to D, which makes D a sorted array