Chapter 9 Study Notes – Fundamental Sorting Algorithms

Sorting = systematic rearrangement of a sequence according to an ordering relation (numerical ↑/↓, lexicographical, etc.)
Corner-stone of algorithm design; exposes ideas such as divide-and-conquer, data-structures interaction and randomisation.
Typical scenario: records contain many fields but we sort only on a key (e.g. Name, Class, Examscore → key = Examscore).
Real-world uses:
- Electronic mail clients – sort by date, subject, sender…
- Telephone directories – alphabetical names for fast lookup.
- Dictionaries – lexicographic ordering of words.

Every comparison-based sort repeatedly performs two primitives:
- Compare – determines relative order of two items.
- Swap – exchanges misplaced items, implemented with a temporary variable:
- Step 1 $temp \leftarrow A$ , Step 2 $A \leftarrow B$ , Step 3 $B \leftarrow temp$ – prevents loss of the overwritten original value.

Concept: adjacent-pair comparisons; larger values “bubble” to the end.
Descriptive algorithm:
- For each pass $i=0\dots N-2$
- Compare $A[j]$ with $A[j+1]$ for $j=0\dots N-2-i$ , swap if out of order.
Worked example – $[4,7,3,1,6]$ turns into $[1,3,4,6,7]$ after 4 passes (detailed pair-by-pair trace included in transcript).
Python (basic) implementation:

for i in range(n-1):
    for j in range(n-1-i):
        if A[j] > A[j+1]:
            A[j],A[j+1] = A[j+1],A[j]

Optimisations
- Keep flag swapFlag; if a complete pass produces no swap → already sorted → terminate early.
- Last $i$ elements are fixed after pass $i$ , therefore inner loop length shortens by $i$ .
Worst-case passes: after $N-1$ passes the remaining element must also be in place because all $N-1$ others are.

Analogy: insert new playing card into already ordered hand.
Maintains two partitions:
1. Left part sorted (initially first element).
2. Right part unsorted.
For each element in unsorted part:
- Scan sorted part from right to left until correct position found; shift larger elements rightward; insert.
Iterative pseudocode provided (variables current, i, j loops) and equivalent recursive version.
Trace example on $[40,35,22,16,37]$ shows progressive growth of sorted prefix.

In-place sort: uses only $O(1)$ auxiliary space (apart from small constants). E.g. bubble, insertion, quick.
External / not-in-place: requires extra arrays or files; used when data don’t all fit in RAM. Classic example: merge sort.
Comparative table:
- In-place advantages – memory efficiency, good for nearly sorted data.
- In-place drawbacks – side effects, possible instability, poor big-data performance for $O(n^2)$ variants.
- Not-in-place advantages – stability (merge sort), often simpler implementation.
- Not-in-place drawbacks – extra memory, overhead for small inputs.

Stable ⇒ equal keys keep original relative order.
Bubble and insertion sorts are stable; quicksort (standard forms) often not, merge sort stable.
Importance: multi-key sorting (e.g. keep earlier order while resorting on another key) and reporting word-frequency counts while maintaining alphabetic order among equal counts.

Invented by Tony Hoare (1959). Divide-and-conquer, very fast on average.
High-level steps:
1. Choose pivot $p_i$ .
2. Partition array into < pivot and > pivot (elements equal to pivot can be placed either side or grouped).
3. Recursively quicksort left and right parts.
4. Base case = subarray size $\le 1$ .
Pivot selection strategies: first, last, middle, median, random – choice affects performance.
Partitioning (first-element pivot) pseudocode given – indices i (pivot position) and j scan.
Full trace on array $[9,5,8,2,-6,23]$ provided (chronological recursive call list and array states).
Stability test: introduce duplicate keys with unique IDs to see if order retained.
Alternate quicksort: choose middle element pivot; scans inward from both ends using one recursive subroutine.
Not-in-place version (Python) builds two new sublists (less-than, greater-than) and concatenates recursively – simpler but $O(n)$ extra space.

Efficient stable sort using divide-and-conquer; repeatedly halves list until singletons, then merges back.
Process:
- Divide: recursively split array until size $1$ .
- Conquer: merge two sorted halves into larger sorted list.
Illustration on $[38,16,27,39,12,27]$ and on $[38,27,3,9,82,7,24]$ with tree diagrams.
Merging rule: repeatedly compare first elements of two queues and append smaller to output list; when one queue empty, append remainder of other.
Not-in-place merge sort pseudocode:
- mergesort() returns new array; merge() builds target array $c$ with three while-loops for leftover items.
Exercises require Python translation, tree tracing, discussion of not-in-place nature (creation of new arrays list1, list2, c).
In-place merge sort (advanced): keep smallest elements in first subarray by swapping with B[0] and re-inserting B[0] into correct place of B – repeatedly produces fully sorted A + B without external buffer.
- Example walk-through starting A=[9,11,12], B=[1,2,3,4] results in combined [1,2,3,4,9,11,12].

Bubble sort: worst/average $O(n^2)$ time, best $O(n)$ if early-exit used; space $O(1)$ .
Insertion sort: worst/average $O(n^2)$ , best $O(n)$ when nearly sorted; space $O(1)$ .
Quicksort: average $O(n\log n)$ , worst $O(n^2)$ (bad pivots); space $O(1)$ extra in-place, $O(n)$ not-in-place.
Merge sort: always $O(n\log n)$ time; space $O(n)$ (standard) or $O(1)$ (in-place variant).

Improved bubbleSortBetter pseudocode uses REPEAT-UNTIL with swapFlag and decrements n each pass.
Python exercise array A=[5,3,1,9,8,2,4,7] – number of comparisons required as output.
Recursive bubbleSortRecur template provided – correct call should swap arr[j], arr[j+1].
Bubble-sort pseudocode fill-in ( swap lines; UNTIL Posn = ArraySize-Pass).
Merge sort worksheet tasks: create task11(list) returning sorted list; task12(infile,outfile) reading admissions, writing sorted list, then searching.
Top-down mergeSort(ar,low,high) trace: list all function calls until size 1 segments; answer counts self-invocations.

Swapping changes original memory; programmers must beware of inadvertent data loss when stability or previous order is semantically relevant.
Choice between in-place and not-in-place influenced by available RAM, data volume, and requirement for stability.
For huge datasets external ‑> use merge sort with disk files (k-way merges), map-reduce frameworks, etc.