Chapter 9 Study Notes – Fundamental Sorting Algorithms

Overview of Sorting Algorithms

  • Sorting = systematic rearrangement of a sequence according to an ordering relation (numerical ↑/↓, lexicographical, etc.)
  • Cornerstone of algorithm design; exposes ideas such as divide and conquer, data-structure interaction and randomisation.
  • Typical scenario: records contain many fields but we sort only on a key (e.g. Name, Class, Examscore → key = Examscore).
  • Real-world uses:
    • Electronic mail clients – sort by date, subject, sender…
    • Telephone directories – alphabetical names for fast lookup.
    • Dictionaries – lexicographic ordering of words.

Fundamental Operations

  • Every comparison-based sort repeatedly performs two primitives:
    • Compare – determines relative order of two items.
    • Swap – exchanges misplaced items, implemented with a temporary variable:
    • Step 1: temp ← A; Step 2: A ← B; Step 3: B ← temp – the temporary variable prevents loss of the value overwritten in Step 2.
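The three-step swap can be sketched directly in Python (the helper name swap and the list-plus-indices form are illustrative):

```python
def swap(A, i, j):
    """Exchange A[i] and A[j] via a temporary variable."""
    temp = A[i]    # Step 1: save the value about to be overwritten
    A[i] = A[j]    # Step 2: overwrite A[i]
    A[j] = temp    # Step 3: place the saved value into A[j]

values = [7, 4]
swap(values, 0, 1)    # values is now [4, 7]
```

(Python also allows the idiomatic one-liner A[i], A[j] = A[j], A[i], which performs the same save-and-restore implicitly.)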

Bubble Sort

  • Concept: adjacent-pair comparisons; larger values “bubble” to the end.
  • Descriptive algorithm:
    • For each pass i = 0 … N-2:
    • Compare A[j] with A[j+1] for j = 0 … N-2-i, swap if out of order.
  • Worked example – [4, 7, 3, 1, 6] turns into [1, 3, 4, 6, 7] after 4 passes (detailed pair-by-pair trace included in transcript).
  • Python (basic) implementation:
def bubble_sort(A):
    n = len(A)
    for i in range(n - 1):              # pass i fixes the i-th largest element
        for j in range(n - 1 - i):      # last i elements already in place
            if A[j] > A[j + 1]:
                A[j], A[j + 1] = A[j + 1], A[j]
  • Optimisations
    • Keep a flag swapFlag; if a complete pass produces no swap → already sorted → terminate early.
    • The last i elements are fixed after pass i, therefore the inner loop shortens by i.
  • Worst-case passes: after N-1 passes the remaining element must also be in place because all N-1 others are.
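Both optimisations can be combined in one sketch (the function name bubble_sort_better is mine, mirroring the bubbleSortBetter pseudocode mentioned in the exercises):

```python
def bubble_sort_better(A):
    n = len(A)
    swapped = True
    while swapped and n > 1:
        swapped = False               # swapFlag: cleared before each pass
        for j in range(n - 1):
            if A[j] > A[j + 1]:
                A[j], A[j + 1] = A[j + 1], A[j]
                swapped = True
        n -= 1                        # largest remaining element is now fixed
    return A
```

On already-sorted input the first pass makes no swap, so the loop exits after one pass – the O(n) best case.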

Insertion Sort

  • Analogy: insert new playing card into already ordered hand.
  • Maintains two partitions:
    1. Left part sorted (initially first element).
    2. Right part unsorted.
  • For each element in unsorted part:
    • Scan sorted part from right to left until correct position found; shift larger elements rightward; insert.
  • Iterative pseudocode provided (variables current, i, j loops) and equivalent recursive version.
  • Trace example on [40, 35, 22, 16, 37] shows progressive growth of sorted prefix.
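The scan-and-shift insertion above can be sketched in Python (variable names current and j follow the pseudocode):

```python
def insertion_sort(A):
    for i in range(1, len(A)):        # A[:i] is the sorted prefix
        current = A[i]
        j = i - 1
        while j >= 0 and A[j] > current:
            A[j + 1] = A[j]           # shift larger elements rightward
            j -= 1
        A[j + 1] = current            # insert into the gap
    return A
```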

Memory Management: In-place vs Not-in-place

  • In-place sort: uses only O(1) auxiliary space (a few temporary variables). E.g. bubble, insertion, quick.
  • External / not-in-place: requires extra arrays or files; used when data don’t all fit in RAM. Classic example: merge sort.
  • Comparative table:
    • In-place advantages – memory efficiency, good for nearly sorted data.
    • In-place drawbacks – side effects (input mutated), possible instability, poor big-data performance for O(n²) variants.
    • Not-in-place advantages – stability (merge sort), often simpler implementation.
    • Not-in-place drawbacks – extra memory, overhead for small inputs.

Stability in Sorting

  • Stable ⇒ equal keys keep original relative order.
  • Bubble and insertion sorts are stable; quicksort (standard forms) often not, merge sort stable.
  • Importance: multi-key sorting (e.g. keep earlier order while resorting on another key) and reporting word-frequency counts while maintaining alphabetic order among equal counts.
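The duplicate-key test can be sketched with (key, id) records; Python's built-in sorted is documented as stable, so ties must keep their original order:

```python
# Records tagged with unique IDs so we can see whether ties move.
records = [(2, 'a'), (1, 'b'), (2, 'c'), (1, 'd')]

# Sort on the key alone; a stable sort keeps 'a' before 'c'
# and 'b' before 'd'.
by_key = sorted(records, key=lambda r: r[0])
print(by_key)   # [(1, 'b'), (1, 'd'), (2, 'a'), (2, 'c')]
```

The same tagging trick, applied to any sort under test, reveals whether that implementation is stable.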

Quick Sort (Partition-Exchange)

  • Invented by Tony Hoare (1959). Divide-and-conquer, very fast on average.
  • High-level steps:
    1. Choose a pivot p.
    2. Partition array into < pivot and > pivot (elements equal to pivot can be placed either side or grouped).
    3. Recursively quicksort left and right parts.
    4. Base case = subarray of size ≤ 1.
  • Pivot selection strategies: first, last, middle, median, random – choice affects performance.
  • Partitioning (first-element pivot) pseudocode given – indices i (pivot position) and j scan.
  • Full trace on array [9, 5, 8, 2, -6, 23] provided (chronological recursive call list and array states).
  • Stability test: introduce duplicate keys with unique IDs to see if order retained.
  • Alternate quicksort: choose middle element pivot; scans inward from both ends using one recursive subroutine.
  • Not-in-place version (Python) builds two new sublists (less-than, greater-than) and concatenates recursively – simpler but needs O(n) extra space.
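The not-in-place variant can be sketched as follows (this version groups elements equal to the pivot, one of the options mentioned above):

```python
def quicksort(A):
    """Not-in-place quicksort: builds new sublists, O(n) extra space."""
    if len(A) <= 1:                       # base case: size <= 1
        return A
    pivot = A[0]                          # first-element pivot strategy
    less    = [x for x in A if x < pivot]
    equal   = [x for x in A if x == pivot]
    greater = [x for x in A if x > pivot]
    return quicksort(less) + equal + quicksort(greater)
```

Swapping in a different pivot strategy (last, middle, random) only changes the line that sets pivot.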

Merge Sort

  • Efficient stable sort using divide-and-conquer; repeatedly halves list until singletons, then merges back.
  • Process:
    • Divide: recursively split array until size 1.
    • Conquer: merge two sorted halves into larger sorted list.
  • Illustration on [38, 16, 27, 39, 12, 27] and on [38, 27, 3, 9, 82, 7, 24] with tree diagrams.
  • Merging rule: repeatedly compare first elements of two queues and append smaller to output list; when one queue empty, append remainder of other.
  • Not-in-place merge sort pseudocode:
    • mergesort() returns new array; merge() builds target array c with three while-loops for leftover items.
  • Exercises require Python translation, tree tracing, discussion of not-in-place nature (creation of new arrays list1, list2, c).
  • In-place merge sort (advanced): keep smallest elements in first subarray by swapping with B[0] and re-inserting B[0] into correct place of B – repeatedly produces fully sorted A + B without external buffer.
    • Example walk-through starting A=[9,11,12], B=[1,2,3,4] results in combined [1,2,3,4,9,11,12].
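The standard not-in-place merge sort can be sketched in Python (mergesort returns a new list; merge implements the queue-comparison rule with the three while/leftover steps):

```python
def merge(left, right):
    """Merge two sorted lists into a new sorted list c."""
    c = []
    i = j = 0
    while i < len(left) and j < len(right):   # compare front elements
        if left[i] <= right[j]:               # <= keeps equal keys in order (stable)
            c.append(left[i]); i += 1
        else:
            c.append(right[j]); j += 1
    c.extend(left[i:])                        # leftovers: at most one of these
    c.extend(right[j:])                       # two lists is non-empty
    return c

def mergesort(A):
    if len(A) <= 1:                           # singleton: already sorted
        return A
    mid = len(A) // 2
    return merge(mergesort(A[:mid]), mergesort(A[mid:]))
```

The slices A[:mid] and A[mid:] create the new sublists that make this version not-in-place.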

Complexity & Pass Counts (Quick Reference)

  • Bubble sort: worst/average O(n²) time, best O(n) if early exit is used; space O(1).
  • Insertion sort: worst/average O(n²), best O(n) when nearly sorted; space O(1).
  • Quicksort: average O(n log n), worst O(n²) (bad pivots); space O(log n) average for the recursion stack in-place, O(n) not-in-place.
  • Merge sort: always O(n log n) time; space O(n) (standard) or O(1) (in-place variant).

Coding & Exercise Highlights

  • Improved bubbleSortBetter pseudocode uses REPEAT-UNTIL with swapFlag and decrements n each pass.
  • Python exercise on array A = [5, 3, 1, 9, 8, 2, 4, 7] – output the number of comparisons performed.
  • Recursive bubbleSortRecur template provided – correct call should swap arr[j], arr[j+1].
  • Bubble-sort pseudocode fill-in (swap lines; UNTIL Posn = ArraySize - Pass).
  • Merge sort worksheet tasks: create task11(list) returning sorted list; task12(infile,outfile) reading admissions, writing sorted list, then searching.
  • Top-down mergeSort(ar,low,high) trace: list all function calls until size 1 segments; answer counts self-invocations.
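One possible completion of the bubbleSortRecur idea (the exact template is not reproduced in these notes; this sketch performs one full pass, then recurses on the shortened prefix):

```python
def bubble_sort_recur(arr, n=None):
    if n is None:
        n = len(arr)
    if n <= 1:                        # base case: nothing left to bubble
        return arr
    for j in range(n - 1):            # one pass bubbles the max to arr[n-1]
        if arr[j] > arr[j + 1]:
            arr[j], arr[j + 1] = arr[j + 1], arr[j]
    return bubble_sort_recur(arr, n - 1)
```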

Ethical / Practical Notes

  • Swapping changes original memory; programmers must beware of inadvertent data loss when stability or previous order is semantically relevant.
  • Choice between in-place and not-in-place influenced by available RAM, data volume, and requirement for stability.
  • For huge datasets that exceed RAM (external sorting), use merge sort with disk files (k-way merges), MapReduce frameworks, etc.