Sorting

COSC220 Sorting – Part A

1. Time Complexity Examples

  • Popping an item off a stack containing N elements. (Constant)

  • Returning the last element of an array of N integers. (Constant)

  • Calculating the largest element of an array of N integers. (Linear)

  • Performing a bubble sort on an array of N integers, in the worst case. (Quadratic)

  • Returning the smallest element of a sorted array of N integers. (Constant)

  • Displaying all N elements in a sorted linked list. (Linear)

  • Performing a binary search of a sorted array of N integers. (Logarithmic)

  • Using binary recursive function to calculate the Nth Fibonacci number. (Exponential)

  • Returning the total of all elements in a two-dimensional array of size NxN. (Quadratic)

2. The Sorting Problem

  • Input: A sequence of n numbers a1, a2, …, an

  • Output: A permutation (reordering) a1’, a2’, …, an’ of the input sequence such that a1’ ≤ a2’ ≤ … ≤ an’

3. Structure of Data

4. Some Definitions

  • Internal Sort: The data to be sorted is all stored in the computer’s main memory.

  • External Sort: Some of the data to be sorted might be stored in an external, slower device.

  • In Place Sort: The amount of extra space required to sort the data is constant with the input size.

5. Insertion Sort

  • Idea: Similar to sorting a hand of playing cards.

    • Start with an empty left hand and the cards facing down on the table.

    • Remove one card at a time from the table, and insert it into the correct position in the left hand by comparing it with each of the cards already in the hand, from right to left.

    • The cards in the left hand are sorted; these cards were originally the top cards of the pile on the table.

Example of Insertion Sort
  • To insert 12, we need to make room for it by moving first 36 and then 24.

  • Visualization:

    • Initial array: [6, 10, 24, 36, 12]

    • Move 36 -> [6, 10, 24, blank, 12]

    • Move 24 -> [6, 10, blank, 36, 12]

    • Place 12 -> [6, 10, 12, 24, 36]

6. Insertion Sort Algorithm

  • Algorithm: INSERTION-SORT(A)

    1. for j ← 2 to n do

    2. key ← A[j]
      Insert A[j] into the sorted sequence A[1 . . j -1]

    3. i ← j - 1

    4. while i > 0 and A[i] > key do

    5. A[i + 1] ← A[i]

    6. i ← i − 1

    7. A[i + 1] ← key

Loop Invariant for Insertion Sort
  • Invariant: At the start of each iteration of the for loop, the elements in A[1 . . j-1] are in sorted order.

7. Analysis of Insertion Sort

  • Cost Times:

    • c1n + c2(n-1) + c4(n-1) + c5(n-1) + c6(n-1) + c7(n-1) + c8*(n-1)

    • The overall time complexity is denoted as T(n). It can also be expressed with respect to the number of times the while statement is executed:

    • tj=extnumberoftimeswhileconditionischeckedatiterationjt_j = ext{number of times while condition is checked at iteration j}

Best Case Analysis
  • Best Case: The array is already sorted.

  • In this case, A[i] ≤ key during the first iteration (when i = j -1), so:

    • tj=1t_j = 1

    • Overall time complexity:
      T(n)=an+b=O(n)T(n) = an + b = O(n)

Worst Case Analysis
  • Worst Case: The array is in reverse sorted order, resulting in:

    • tj=jt_j = j

    • Overall complexity becomes quadratic:
      T(n)=O(n2)T(n) = O(n^2)

Comparisons and Exchanges in Insertion Sort
  • Insertions:

    • Generally lead to approximately n2/2n^2/2 comparisons and exchanges in the average case.

8. Insertion Sort – Summary

  • Advantages: Good running time for “almost sorted” arrays, complexity: O(n)O(n).

  • Disadvantages: Worst and average-case complexity: O(n2)O(n^2).

9. Bubble Sort

  • Idea: Repeatedly pass through the array and swap adjacent elements that are out of order.

  • Ease of Implementation: Easier to implement than Insertion Sort, but generally slower.

Example of Bubble Sort
  • Starting state: [8, 4, 6, 9, 2, 3, 1]

    • The algorithm works by comparing adjacent pairs and swapping them as needed.

    • Progress results in a final sorted order.

10. Bubble Sort Algorithm

  • Algorithm: BUBBLESORT(A)

    1. for i ← 1 to length[A] do

    2. for j ← length[A] downto i + 1 do

    3. if A[j] < A[j -1] then exchange A[j] <-> A[j-1]

Running Time of Bubble Sort
  • Time Complexity: Generally O(n2)O(n^2) across all cases.

11. Selection Sort

  • Idea: Find the smallest element in the array and exchange it with the element at the current position, repeat for all positions.

Example of Selection Sort
  • Initial array: [8, 4, 6, 9, 2, 3, 1]

    • First, find 1 and position it in the front, then find successive smallest elements and position them.

12. Selection Sort Algorithm

  • Algorithm: SELECTION-SORT(A)

    1. n ← length[A]

    2. for j ← 1 to n - 1 do

    3. smallest ← j

    4. for i ← j + 1 to n do

    5. if A[i] < A[smallest] then smallest ← i

    6. exchange A[j] ↔ A[smallest]

Analysis of Selection Sort
  • Comparison Count: Generally results in n2/2n^2/2 comparisons, and each swap is costly.

  • Overall time complexity is O(n2)O(n^2) for all cases without regard to the order of input data.

13. Summary of Sorting Algorithms

  • Insertion Sort:

    • Design approach: Sorts in place.

    • Best Case: O(n)O(n)

    • Worst Case: O(n2)O(n^2)

  • Bubble Sort:

    • Design approach: Sorts in place.

    • Running time: O(n2)O(n^2)

  • Selection Sort:

    • Design approach: Sorts in place.

    • Running time: O(n2)O(n^2)

14. Merge Sort

  • Design Approach: Sorts by dividing the array into smaller subarrays recursively, sorts them, and then merges them back.

  • Sorting Model: Merge Sort utilizes a divide-and-conquer approach.

Merge Sort Steps
  • Divide: Splits into two half-sized subarrays.

  • Conquer: Sort these subarrays recursively.

  • Combine: Merge the two sorted halves.

15. Merge Sort Algorithm

  • Algorithm: MERGE-SORT(A, p, r)

    1. if p < r then (check base case)

    2. q ← ⌊(p + r) / 2⌋ (come up with midpoint)

    3. MERGE-SORT(A, p, q) (recursive call for left half)

    4. MERGE-SORT(A, q + 1, r) (recursive call for right half)

    5. MERGE(A, p, q, r) (combine the results)

Example of Merge Sort
  • Case: n is a power of 2; show multiple iterations of sorting through merging.

  • Splitting continues until single elements remain that can be trivially merged back into order.

16. Merging Process

  • Merging Components:

  • Inputs: An array A and subarrays defined by indices p, q, r

  • Output: A single, fully sorted array combining both halves.

Pseudocode for Merge
  • Algorithm: MERGE(A, p, q, r)

    1. Compute sizes of the two subarrays.

    2. Copy values into temporary arrays for sorting.

    3. Merge the elements from the temporary arrays back into the primary array in sorted order.

Merge Running Time Analysis
  • Overall time complexity for the merge operation remains linear: O(n)O(n).

17. Analyzing Divide-and-Conquer Algorithms

  • The running time can be expressed through recurrence relations:

    • T(n)=aT(n/b)+D(n)+C(n)T(n) = a T(n/b) + D(n) + C(n)

  • Here, D(n) is the cost of dividing the problem, C(n) is the cost to combine the solutions, and T(n) is the running time on the problem of size n.

18. Merge Sort Running Time

  • Merge sort divides the problem into two halves, each of size n/2. The recurrence relation reflects this:

    • T(n)=2T(n/2)+O(n)T(n) = 2T(n/2) + O(n)

  • This leads to the conclusion via the Master Theorem that overall running time is: O(nextlogn)O(n ext{log} n).

19. Quicksort

  • Design Approach: Sort an array by partitioning it into two halves based around a pivot.

  • Performance Dependence: Heavily dependent on how the pivot is chosen.

Quicksort Steps
  1. Partition the array: Choose a pivot and rearrange elements.

  2. Recursively sort the divided subarrays formed post-partition.

  3. Since sorting is in place, combining is trivial as they are already sorted upon return.

Quicksort Algorithm
  • Algorithm: QUICKSORT(A, p, r)

    1. if p < r then

    2. q ← PARTITION(A, p, r)

    3. QUICKSORT(A, p, q)

    4. QUICKSORT(A, q + 1, r)

Partitioning the Array
  • Various strategies for pivot selection such as the Hoare partition scheme.

Performance Considerations of Quicksort
  • Analyzing recursive calls based on partitioning yields varying time complexities ranging from best-case O(nextlogn)O(n ext{log} n), average case O(nextlogn)O(n ext{log} n), and worst-case O(n2)O(n^2).

20. Summary and Discussion of All Sort Algorithms

  • Each algorithm shows varying strengths and weaknesses based on data characteristics, implementation time, and space complexity. Merge sort stands out with its reliable performance while Quicksort offers efficient average-case performance but is sensitive to pivot selection.