12_CPSC 335: Algorithm Engineering - Divide and Conquer: Merge Sort & Binary Search

Divide and Conquer Pattern: Merge Sort

  • The best-known example of a divide-and-conquer algorithm.
  • Invented by John von Neumann in 1945.
  • Input: A list U of n comparable elements, U[0], …, U[n-1].
  • Output: A list S containing the elements of U, in non-decreasing order.
  • Follows the divide-and-conquer paradigm for sorting:
    1. Divide: Split U down the middle into two subsequences, each of size roughly n/2.
    2. Conquer: Sort each subsequence (by calling merge_sort recursively on each).
    3. Combine: Merge the two sorted subsequences into a single sorted list (by calling merge).

Merge Sort General Pseudocode

  • General Pseudocode:

    def merge_sort(V):
        if <THIS IS A BASE CASE>:
            return <BASE CASE SOLUTION>
        else:
            L, R = <DIVIDE V INTO TWO INSTANCES OF ROUGHLY EQUAL SIZE>
            sorted_L = merge_sort(L)
            sorted_R = merge_sort(R)
            sorted_V = <COMBINE sorted_L WITH sorted_R>
            return sorted_V
    
  • The base case is when the vector V contains at most one element (len(V) <= 1); such a vector is already sorted.

  • The key operation is <COMBINE sorted_L WITH sorted_R>, which merges together two sorted lists into a single sorted list. Simply combining sorted_L with sorted_R gives the wrong output.
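  • A quick illustration of why plain concatenation is not enough (Python; the call to the built-in sorted() stands in for a proper merge step):

```python
# Two sorted halves, as produced by the recursive calls.
sorted_L = [1, 3, 5]
sorted_R = [2, 4, 6]

# Simple concatenation interleaves nothing: the result is not sorted.
concatenated = sorted_L + sorted_R
print(concatenated)          # [1, 3, 5, 2, 4, 6] -- wrong output

# A real merge step must interleave the two halves.
print(sorted(concatenated))  # [1, 2, 3, 4, 5, 6]
```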

  • The merge-sort procedure:

    def merge_sort(V):
        if len(V) <= 1:
            return V
        else:
            half = len(V) // 2
            L = V[:half]
            R = V[half:]
            sorted_L = merge_sort(L)
            sorted_R = merge_sort(R)
            sorted_V = merge(sorted_L, sorted_R)
            return sorted_V
    
    def merge(L, R):
        return <COMBINE L WITH R>
    
  • Merging the two sorted sub-arrays efficiently is the non-trivial step.

  • A sub-algorithm called merge is needed.

Merging the Sub-arrays

  • As in Sequential Search, the <FIND THE LEAST UNSORTED ELEMENT IN L AND R> step can be done in O(n) time by scanning.

  • But elements in L and R are already sorted. Can it be done in O(1) time?

  • Procedure:

    def mergeProcedure(L, R):
        S = Vector()
        while <UNSORTED ELEMENTS REMAIN IN L OR R>:
            x = <FIND THE LEAST UNSORTED ELEMENT IN L AND R>
            <REMOVE x FROM L/R>
            S.add_back(x)
        return S
    
  • Where might the least element among L and R be located?

    if L[0] <= R[0]:
        x = L[0]
    else:
        x = R[0]
    
  • Implementation:

    def mergeProcedure(L, R):
        S = Vector()
        li = ri = 0
        while li < len(L) and ri < len(R):
            if L[li] <= R[ri]:
                S.add_back(L[li])
                li += 1
            else:
                S.add_back(R[ri])
                ri += 1
        for i in range(li, len(L)):
            S.add_back(L[i])
        for i in range(ri, len(R)):
            S.add_back(R[i])
        return S
    
  • One of the for loops will never iterate.

  • The while loop ends when one of L or R is depleted of elements; the other still has elements that need to go into S.

  • The worst-case time complexity of the mergeProcedure is O(n).
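  • Putting merge and merge_sort together, a runnable Python version of the pseudocode above (using a plain list in place of the Vector class, and append in place of add_back) might look like:

```python
def merge(L, R):
    """Merge two sorted lists into one sorted list in O(len(L) + len(R)) time."""
    S = []
    li = ri = 0
    # Repeatedly take the smaller front element; using <= keeps the merge stable.
    while li < len(L) and ri < len(R):
        if L[li] <= R[ri]:
            S.append(L[li])
            li += 1
        else:
            S.append(R[ri])
            ri += 1
    # At most one of these extends S: the other list is already exhausted.
    S.extend(L[li:])
    S.extend(R[ri:])
    return S

def merge_sort(V):
    """Sort V by divide and conquer; returns a new sorted list."""
    if len(V) <= 1:          # base case: empty or single-element list
        return list(V)
    half = len(V) // 2       # divide down the middle
    return merge(merge_sort(V[:half]), merge_sort(V[half:]))

print(merge_sort([7, 5, 2, 4, 1, 6, 3, 0]))  # [0, 1, 2, 3, 4, 5, 6, 7]
```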

Merge Sort Example

  • Illustrative example: an unsorted array is repeatedly divided, the halves are sorted recursively, and the sorted halves are merged to produce a sorted array.
  • Example of the merge procedure:
  • [7, 5, 2, 4, 1, 6, 3, 0] is split into [7, 5, 2, 4] and [1, 6, 3, 0]; each half is further split, sorted, and merged to obtain [2, 4, 5, 7] and [0, 1, 3, 6], which are finally merged into [0, 1, 2, 3, 4, 5, 6, 7].
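  • The recursion tree of such an example can be visualized with a small traced version of the algorithm (a sketch; the print format is my own):

```python
def merge_sort_trace(V, depth=0):
    """Merge sort that prints each split and merge to show the recursion tree."""
    indent = "  " * depth
    if len(V) <= 1:
        return list(V)
    half = len(V) // 2
    print(f"{indent}split  {V} -> {V[:half]} | {V[half:]}")
    L = merge_sort_trace(V[:half], depth + 1)
    R = merge_sort_trace(V[half:], depth + 1)
    # Standard stable merge of the two sorted halves.
    S = []
    li = ri = 0
    while li < len(L) and ri < len(R):
        if L[li] <= R[ri]:
            S.append(L[li])
            li += 1
        else:
            S.append(R[ri])
            ri += 1
    S.extend(L[li:])
    S.extend(R[ri:])
    print(f"{indent}merge  {L} + {R} -> {S}")
    return S

merge_sort_trace([7, 5, 2, 4, 1, 6, 3, 0])
```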

Analysis of Merge Sort

  • The time complexity of the Merge procedure is O(n).
  • The time complexity of the merge_sort algorithm on an input V of n elements is O(n \log n).
  • Proof: We can use the master theorem.
  • Let T(n) be the running time of the merge_sort procedure on an input of n elements:
    • For n \le 1, T(n) = 2.
    • For n > 1, T(n) = 1 + 1 + 2n + T(\lfloor n/2 \rfloor) + T(\lceil n/2 \rceil) \le 2 \cdot T(\lceil n/2 \rceil) + 2n + 2.
  • The recursive case spends:
    • 1 step on the if,
    • 1 step computing half,
    • O(n) time copying V into two smaller vectors L and R,
    • O(n) time in merge,
    • finally makes one recursive call on each of L and R.
  • Therefore, the time complexity of the recursive case is T(n) = 1 + 1 + n + n + T(\lfloor n/2 \rfloor) + T(\lceil n/2 \rceil) \le 2 \cdot T(\lceil n/2 \rceil) + 2n + 2
  • Thus the function T(n) is a master-form recurrence:
    T(n) \le \begin{cases} 2 & n < 2 \\ 2T(n/2) + 2n + 2 & n \ge 2 \end{cases}
  • 2n + 2 \in O(n), thus k = 1, r = 2, d = 2.
  • All r, d, k \in O(1).
  • Since r = d^k (case 2), we apply the master theorem and obtain: O(n^k \log n) = O(n \log n).
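  • As a sanity check, the upper-bound recurrence can be evaluated numerically and compared against n \log_2 n (a sketch using the constants from the analysis above):

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    # Upper-bound recurrence from the analysis:
    # T(n) = 2 for n <= 1, and T(n) <= 2*T(ceil(n/2)) + 2n + 2 otherwise.
    if n < 2:
        return 2
    return 2 * T(math.ceil(n / 2)) + 2 * n + 2

# The ratio T(n) / (n log2 n) stays bounded (it tends toward a constant),
# which is consistent with T(n) in O(n log n).
for n in [16, 256, 4096, 65536]:
    print(n, T(n) / (n * math.log2(n)))
```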

Merge Sort is a Stable Sorting Algorithm

  • When a sorting algorithm has the property that equal items appear in the final sorted list in the same relative order that they appeared in the initial input, it is called a stable sorting algorithm.
  • Why stable sorting algorithms are preferred:
    • Records to be sorted often have several attributes (name, SSN, grade, etc.). The list may already be sorted on one attribute (say, name).
    • If we then sort on a second attribute (say, grade), individuals with the same grade remain sorted by name.
  • By favoring elements from the left sublist over the right, we preserve the relative order of elements.
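  • A small demonstration of this property, using Python's built-in sorted() (which is stable) and made-up sample records:

```python
# Records already in alphabetical order by name (the sample data is invented).
records = [("Avila", "B"), ("Baker", "A"), ("Chen", "B"), ("Diaz", "A")]

# A stable sort on grade keeps equal-grade records in their original
# (alphabetical-by-name) order.
by_grade = sorted(records, key=lambda r: r[1])
print(by_grade)
# [('Baker', 'A'), ('Diaz', 'A'), ('Avila', 'B'), ('Chen', 'B')]
```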

Binary Search

  • Binary search algorithm searches for an element within an ordered list of comparable elements.
  • The binary search problem:
    • Input: a vector V of n comparable elements in non-decreasing order, and a query value q.
    • Output: an index i such that V[i] = q, or None if no such index exists.
  • Related to sequential search: the input is a query value q and a vector V that may or may not contain q.
  • Applied to lists that are already sorted.
    • Allows for an algorithm that is faster than sequential search:
      • Sequential search runs in O(n) time
      • The time complexity of binary search on a vector of n elements is O(\log n)
  • Follows the divide-and-conquer paradigm
    • Divide: split the n-element vector V[start .. end] (initially start = 0 and end = n-1) into two subsequences of roughly n/2 elements each, V[start .. m] and V[m+1 .. end], where m = \lfloor (start + end)/2 \rfloor
    • Conquer: compare the middle element V[m] with q; if they are not equal, recursively binary-search the one subsequence that could contain q
    • Combine: nothing

Binary Search Algorithm


  • Implementation:

    def binary_search(V, q):
        return binary_search_range(V, q, 0, len(V))
    
    def binary_search_range(V, q, s, e):
        if s == e:
            return None
        else:
            half = (s + e) // 2
            if q < V[half]:
                return binary_search_range(V, q, s, half)
            elif q == V[half]:
                return half
            else:
                return binary_search_range(V, q, half+1, e)
    

Analysis of Binary Search

  • For n \le 1, T(n) = 2.
  • For n > 1, T(n) \le T(\frac{n}{2}) + 5.
  • Thus the function T(n) is a master-form recurrence:
    T(n) \le \begin{cases} 2 & n < 2 \\ T(n/2) + 5 & n \ge 2 \end{cases}
  • 5 \in O(1), thus k = 0, r = 1, d = 2.
  • All r, d, k \in O(1).
  • Since r = d^k (case 2), we apply the master theorem and obtain O(n^k \log n) = O(\log n).
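  • The O(\log n) bound can also be observed directly by counting probes (inspections of a middle element); an iterative variant written for counting, as a sketch:

```python
import math

def binary_search_count(V, q):
    # Returns (index or None, number of probes), where a probe is one
    # inspection of a middle element V[half].
    s, e, probes = 0, len(V), 0
    while s < e:
        half = (s + e) // 2
        probes += 1
        if q < V[half]:
            e = half
        elif q == V[half]:
            return half, probes
        else:
            s = half + 1
    return None, probes

# Even a failing search on n elements needs at most floor(log2 n) + 1 probes.
n = 1024
V = list(range(n))
_, probes = binary_search_count(V, -1)
print(probes, math.floor(math.log2(n)) + 1)  # 11 11
```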

Indivisible Problems

  • The decrease-by-a-fraction pattern cannot be used to solve all problems.
  • The viability of the pattern depends on two properties of the problem at hand:
    1. It must always be possible to divide any instance into two sub-instances of roughly equal size; and
    2. It must always be possible to combine the solutions to those sub-instances into a correct output for the entire original instance.

Example of Indivisible Problems

  • Traveling Salesman Problem: It is not always possible to divide a graph with n vertices and m edges into two smaller subgraphs with ≈ n/2 vertices and ≈ m/2 edges,
    • since edges need not be distributed evenly throughout the graph, and some of the graph’s edges may straddle the boundary between the two subgraphs.
    • There are, however, some algorithms for finding separators of graphs, but they are beyond the scope of this course.
  • A problem’s structure may make it impossible to compute decrease_by_half(L) and decrease_by_half(R) independently of one another.
    • Some problems involve “global” interdependencies that necessitate processing an entire instance all at once rather than piecemeal.
    • A good example is the circuit satisfaction problem.