CPSC 335: Algorithm Engineering - Divide and Conquer: Merge Sort & Binary Search
Divide and Conquer Pattern: Merge Sort
- The best-known example of a divide-and-conquer algorithm.
- Invented by John von Neumann in 1945.
- Input: A list U of n comparable elements, U[0], …, U[n-1].
- Output: A list S containing the elements of U, in non-decreasing order.
- Follows the divide-and-conquer paradigm for sorting:
- Divide: Split U down the middle into two subsequences, each of size roughly n/2.
- Conquer: Sort each subsequence (by calling merge_sort recursively on each).
- Combine: Merge the two sorted subsequences into a single sorted list (by calling merge).
Merge Sort General Pseudocode
General Pseudocode:
def merge_sort(V):
    if <THIS IS A BASE CASE>:
        return <BASE CASE SOLUTION>
    else:
        L, R = <DIVIDE V INTO TWO INSTANCES OF ROUGHLY EQUAL SIZE>
        sorted_L = merge_sort(L)
        sorted_R = merge_sort(R)
        sorted_V = <COMBINE sorted_L WITH sorted_R>
        return sorted_V

The base case is when the vector V contains at most one element (len(V) <= 1).
The key operation is <COMBINE sorted_L WITH sorted_R>, which merges two sorted lists into a single sorted list. Simply concatenating sorted_L with sorted_R gives the wrong output. The merge-sort procedure:
def merge_sort(V):
    if len(V) <= 1:
        return V
    else:
        half = len(V) // 2
        L = V[0 : half]
        R = V[half : len(V)]
        sorted_L = merge_sort(L)
        sorted_R = merge_sort(R)
        sorted_V = merge(sorted_L, sorted_R)
        return sorted_V

def merge(L, R):
    return <COMBINE L WITH R>

Merging the two sorted sub-arrays is the challenging step. A sub-algorithm called merge is needed.
Merging the Sub-arrays
As in sequential search, the <FIND THE LEAST UNSORTED ELEMENT IN L AND R> step can be done in O(n) time. But the elements in L and R are already sorted. Can it be done in O(1) time?
Procedure:
def mergeProcedure(L, R):
    S = Vector()
    while <UNSORTED ELEMENTS REMAIN IN L OR R>:
        x = <FIND THE LEAST UNSORTED ELEMENT IN L AND R>
        <REMOVE x FROM L/R>
        S.add_back(x)
    return S

Where might the least element among L and R be located?

    if L[0] <= R[0]:
        x = L[0]
    else:
        x = R[0]

Implementation:
def mergeProcedure(L, R):
    S = Vector()
    li = ri = 0
    while li < len(L) and ri < len(R):
        if L[li] <= R[ri]:
            S.add_back(L[li])
            li += 1
        else:
            S.add_back(R[ri])
            ri += 1
    for i in range(li, len(L)):
        S.add_back(L[i])
    for i in range(ri, len(R)):
        S.add_back(R[i])
    return S

One of the two trailing for loops will never iterate.
The while loop ends when one of L or R is depleted of elements; the other still has elements that need to go into S.
The worst-case time complexity of the mergeProcedure is O(n).
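The pseudocode above translates almost directly into runnable Python; the following is a sketch using plain lists and append/extend in place of the course's Vector type and add_back:

```python
def merge(L, R):
    """Merge two sorted lists into one sorted list in O(len(L) + len(R)) time."""
    S = []
    li = ri = 0
    # Repeatedly take the smaller front element; <= keeps the merge stable.
    while li < len(L) and ri < len(R):
        if L[li] <= R[ri]:
            S.append(L[li])
            li += 1
        else:
            S.append(R[ri])
            ri += 1
    # At most one of these extends copies anything: the other list is depleted.
    S.extend(L[li:])
    S.extend(R[ri:])
    return S

def merge_sort(V):
    if len(V) <= 1:       # base case: empty or single-element list is sorted
        return V
    half = len(V) // 2
    return merge(merge_sort(V[:half]), merge_sort(V[half:]))
```

For example, merge_sort([7, 5, 2, 4, 1, 6, 3, 0]) returns [0, 1, 2, 3, 4, 5, 6, 7].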
Merge Sort Example
- Illustrative example of merge sort: an unsorted array is repeatedly divided, sorted, and merged to produce a sorted array.
- Example of the merge procedure:
- 75241630 is split into 7524 and 1630; each half is recursively split and merged to get 2457 and 0136, and the two halves are finally merged into 01234567.
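This kind of trace can be reproduced by instrumenting merge sort to log every merge it performs. The sketch below is illustrative; merge_trace and merge_sort_trace are hypothetical helper names, not part of the course code:

```python
def merge_trace(L, R, log):
    # Standard merge, but record every (L, R, result) triple performed.
    S = []
    li = ri = 0
    while li < len(L) and ri < len(R):
        if L[li] <= R[ri]:
            S.append(L[li])
            li += 1
        else:
            S.append(R[ri])
            ri += 1
    S.extend(L[li:])
    S.extend(R[ri:])
    log.append((L, R, S))
    return S

def merge_sort_trace(V, log):
    if len(V) <= 1:
        return V
    half = len(V) // 2
    return merge_trace(merge_sort_trace(V[:half], log),
                       merge_sort_trace(V[half:], log), log)

log = []
result = merge_sort_trace([7, 5, 2, 4, 1, 6, 3, 0], log)
# The final entry of log is the top-level merge, which combines the two
# sorted halves [2, 4, 5, 7] and [0, 1, 3, 6] into the fully sorted list.
```

Sorting 8 elements performs 7 merges in total: 4 on single elements, 2 on pairs, and 1 on the two halves.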
Analysis of Merge Sort
- The time complexity of the Merge procedure is O(n).
- The time complexity of the merge_sort algorithm on an input of n elements is O(n \log n).
- Proof: We can use the master theorem.
- Let T(n) be the running time of the merge_sort procedure on an input of n elements:
- If n \le 1, then T(n) = 2.
- If n > 1, then T(n) = 1 + 1 + n + n + T(\lfloor n/2 \rfloor) + T(\lceil n/2 \rceil) \le 2 \cdot T(\lceil n/2 \rceil) + 2n + 2.
- The recursive case spends:
- 1 step on the if,
- 1 step computing half,
- O(n) time copying V into two smaller vectors L and R,
- O(n) time in merge,
- and finally makes one recursive call on each of L and R.
- Therefore, the time complexity of the recursive case is T(n) = 1 + 1 + n + n + T(\lfloor n/2 \rfloor) + T(\lceil n/2 \rceil) \le 2 \cdot T(\lceil n/2 \rceil) + 2n + 2
- Thus the function T(n) is a master-form recurrence:

    T(n) \le \begin{cases} 2 & n < 2 \\ 2T(n/2) + 2n + 2 & n \ge 2 \end{cases}

- Since 2n + 2 \in O(n), we have k = 1, r = 2, and d = 2.
- All r, d, k \in O(1).
- Since r = d^k (case 2), we apply the master theorem and obtain O(n^k \log n) = O(n \log n).
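The O(n \log n) bound can be sanity-checked empirically by counting element comparisons. This is a rough experimental sketch, not part of the proof; merge_sort_count is a hypothetical helper:

```python
import math
import random

def merge_sort_count(V):
    """Return (sorted list, number of element comparisons performed)."""
    if len(V) <= 1:
        return V, 0
    half = len(V) // 2
    L, cl = merge_sort_count(V[:half])
    R, cr = merge_sort_count(V[half:])
    S, li, ri = [], 0, 0
    c = cl + cr                      # comparisons made by the recursive calls
    while li < len(L) and ri < len(R):
        c += 1                       # one comparison per while-loop iteration
        if L[li] <= R[ri]:
            S.append(L[li])
            li += 1
        else:
            S.append(R[ri])
            ri += 1
    S.extend(L[li:])
    S.extend(R[ri:])
    return S, c

n = 1024
S, comparisons = merge_sort_count([random.random() for _ in range(n)])
# For n a power of two, merge sort never exceeds n * log2(n) comparisons.
assert comparisons <= n * math.log2(n)
```

Running this for several values of n shows the comparison count growing proportionally to n \log n, matching the master-theorem result.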
Merge Sort is a Stable Sorting Algorithm
- When a sorting algorithm has the property that equal items appear in the final sorted list in the same relative order that they appeared in the initial input, it is called a stable sorting algorithm.
- Why stable sorting algorithms are preferred:
- Some files for sorting have many attributes (name, SSN, grade, etc.). The list may already have been sorted on one attribute (say, name).
- If we then sort on a second attribute (say, grade), individuals with the same grade remain sorted by name.
- By favoring elements from the left sublist over the right on ties, merge preserves the relative order of equal elements.
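Stability can be observed directly by sorting records on a second attribute. The sketch below uses hypothetical (name, grade) pairs and a merge sort that compares on grade only:

```python
def merge_sort_by(V, key):
    """Merge sort comparing elements by key(x); stable because <= favors the left sublist."""
    if len(V) <= 1:
        return V
    half = len(V) // 2
    L = merge_sort_by(V[:half], key)
    R = merge_sort_by(V[half:], key)
    S, li, ri = [], 0, 0
    while li < len(L) and ri < len(R):
        # On a tie, <= takes the left element first, preserving input order.
        if key(L[li]) <= key(R[ri]):
            S.append(L[li])
            li += 1
        else:
            S.append(R[ri])
            ri += 1
    return S + L[li:] + R[ri:]

# Records already sorted by name; now sort by grade.
records = [("Ana", "B"), ("Bo", "A"), ("Cy", "B"), ("Di", "A")]
by_grade = merge_sort_by(records, key=lambda r: r[1])
# Within each grade, the names stay in alphabetical order:
# [("Bo", "A"), ("Di", "A"), ("Ana", "B"), ("Cy", "B")]
```

Replacing `<=` with `<` in the merge would still sort correctly but would destroy this property.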
Binary Search
- Binary search algorithm searches for an element within an ordered list of comparable elements.
- The binary search problem:
- Input: a vector V of n comparable elements in non-decreasing order, and a query value q.
- Output: an index i such that V[i] = q, or None if no such index exists.
- Related to sequential search: the input is a query value q and a vector V that may or may not contain q.
- Applied to lists that are already sorted.
- Allows for an algorithm that is faster than sequential search:
- Sequential search runs in O(n) time
- The time complexity of binary_search(V, q) is O(\log n)
- Follows the divide-and-conquer paradigm
- Divide: split the n-element vector V[start .. end] (initially start = 0 and end = n-1) into two subsequences of roughly n/2 elements each, V[start .. m] and V[m+1 .. end], where m = \lfloor (start + end)/2 \rfloor
- Conquer: compare the middle element of the sequence with q and search the correct subsequence out of the two subsequences using binary-search
- Combine: nothing
Binary Search Algorithm
Implementation:
def binary_search(V, q):
    return binary_search_range(V, q, 0, len(V))

def binary_search_range(V, q, s, e):
    # searches the half-open range V[s .. e-1]
    if s == e:
        return None
    else:
        half = (s + e) // 2
        if q < V[half]:
            return binary_search_range(V, q, s, half)
        elif q == V[half]:
            return half
        else:
            return binary_search_range(V, q, half + 1, e)
Analysis of Binary Search
- Let T(n) be the running time of binary search on n elements:
- If n \le 1, then T(n) = 2.
- If n > 1, then T(n) \le T(\frac{n}{2}) + 5.
- Thus the function T(n) is a master-form recurrence:

    T(n) \le \begin{cases} 2 & n < 2 \\ T(n/2) + 5 & n \ge 2 \end{cases}

- Since 5 \in O(1), we have k = 0, r = 1, and d = 2.
- All r, d, k \in O(1).
- Since r = d^k (case 2), we apply the master theorem and obtain O(n^k \log n) = O(\log n).
Indivisible Problems
- The decrease-by-a-fraction pattern cannot be used to solve all problems.
- The viability of the pattern depends on two properties of the problem at hand:
- It must always be possible to divide any instance into two sub-instances of roughly equal size; and
- It must always be possible to combine the solutions to those sub-instances into a correct output for the entire original instance.
Example of Indivisible Problems
- Traveling Salesman Problem: It is not always possible to divide a graph with n vertices and m edges into two smaller subgraphs with ≈ n/2 vertices and ≈ m/2 edges,
- since edges need not be distributed evenly throughout the graph, and some of the graph’s edges may straddle the boundary between the two subgraphs.
- There are, however, algorithms for finding graph separators, but they are beyond the scope of this course.
- A problem’s structure may make it impossible to compute decrease_by_half(L) and decrease_by_half(R) independently of one another.
- Some problems involve “global” interdependencies that necessitate processing an entire instance all at once rather than piecemeal.
- A good example is the circuit satisfiability problem.