Heap Sort & Priority-Queue-Based Sorting
Algorithm X (Generic Priority-Queue–Based Sorting)
- Definition: A family of comparison–based sorting algorithms that repeatedly
- insert all n input keys into a priority queue (PQ)
- repeatedly remove the extreme key (min or max) and append it to the output.
- Running-time & space
- Worst-case depends entirely on the concrete PQ implementation.
- With a binary heap as the PQ:
• Each insert costs Θ(logn).
• Each removeMax (or removeMin) costs Θ(logn).
• Total 2n operations ⇒ nlogn+nlogn=2nlogn=Θ(nlogn).
• Space is the heap itself → O(n). - With a Fibonacci heap the theoretical bound drops to O(n+nloglogn) for insert + nlogn for delete-max, but constant factors are larger.
- Key point: The label “Algorithm X” merely refers to PQ-driven sorting; the lecture instantiates it with a heap to obtain Heapsort.
Heap-based Priority Queue ➜ Heap Sort
- When the PQ is a binary max-heap, Algorithm X specializes to Heapsort.
- Performance summary (binary heap):
- Time: O(nlogn) in all cases (best, average, worst).
- Extra space: O(1) beyond the input array when performed in-place.
- Relative speed (empirical observation):
- "Much faster than" elementary quadratic sorts (Insertion & Selection) once n≳30!–!50.
In-place Heapsort – High-level Plan
- View the input array as a complete binary tree. Indexing starts at 1 for mathematical convenience.
- Two distinct phases:
- Heap Construction (Heapify)
- Convert the unordered array into a max-heap.
- Bottom-up method: run sink (a.k.a. down-heap or heapify) on every internal node k=⌊n/2⌋,…,1.
- Runs in O(n) time (not O(nlogn)) because the lower levels are cheap.
- Sortdown
- Repeat while heap size k>1: exchange the root (max) with item k, shrink the heap boundary, and sink the new root.
- Each iteration costs Θ(logk), summing to Θ(nlogn) overall.
Detailed Algorithmic Operations
- sink(pq, k, n)
- While the current node k has at least one child (index 2k≤n):
- Let j=2k (left child).
- If right child exists and \text{pq}[j] < \text{pq}[j+1], set j=j+1.
- If the parent pq[k]≥pq[j], stop (heap order restored).
- Else exchange pq[k] and pq[j], set k=j, and continue.
- exch(pq, i, j) – constant-time swap routine.
Worked Example – Sorting the String “SORTEXAMPLE”
- Initial array (indexed 1..11):
SORTEXAMPLE - Heapify Trace (selected snapshots)
- After bottom-up sinks: XTSPLRAMOEE (valid max-heap).
- Sortdown Trace (selected snapshots)
- Exchange root with last element, sink, repeat → final array: AEELMOPRSTX.
- Visual slides 4–25 & 26–80 show every intermediate heap and array state; the key pedagogical takeaway is heap property before the unsorted prefix and sorted suffix growing right-to-left.
Java-like Reference Implementation (Lines 57-70, 78-89)
public static void sort(Comparable[] pq) {
int n = pq.length;
// heapify phase
for (int k = n/2; k >= 1; k--) // indices 1..n assumed
sink(pq, k, n);
// sortdown phase
for (int k = n; k > 1; k--) {
exch(pq, 1, k);
sink(pq, 1, k-1);
}
}
private static void sink(Comparable[] pq, int k, int n) {
while (2*k <= n) {
int j = 2*k;
if (j < n && less(pq, j, j+1)) j++;
if (!less(pq, k, j)) break;
exch(pq, k, j);
k = j;
}
}
- Uses 1-based indexing; adapt easily to 0-based arrays by arithmetic offsets.
Comparison with Selection & Insertion Sort
- Decision table (slides 85–88):
- Selection Sort: O(n2) time, O(1) space, in-place ✓, not adaptive ✗, not stable ✗.
- Insertion Sort: O(n2) time, O(1) space, in-place ✓, adaptive ✓, stable ✓.
- Heapsort: O(nlogn) time, O(1) extra space, in-place ✓, not adaptive ✗, unstable ✗ (equal keys can be permuted).
Algorithm-by-Algorithm “State After k Iterations” (slide 92)
- Selection Sort: first k positions are final & minimal so far.
- Insertion Sort: first k positions are sorted but may still shift right later.
- Heap Sort: last k positions (right end) hold the k largest keys in final order; prefix 1..n−k forms a valid max-heap.
- Merge Sort (recursive): every subarray produced by recursive calls is already sorted.
- Merge Sort (bottom-up): all runs of length 2k are sorted after pass k.
- Quick Sort (standard or 3-way): pivot element A[j] is fixed; left side < pivot, right side > pivot; 3-way adds contiguous block of = pivot.
Practical & Philosophical Notes
- In-place yet not stable: The constant-space virtue comes at the expense of re-ordering equal keys.
- Not adaptive: Even presorted input costs Θ(nlogn) because heapify already runs O(n) and every subsequent sink Θ(logn).
- Cache behavior: Poorer locality than quicksort or insertion sort; modern CPUs often find Heapsort slower than well-implemented quicksort on practical inputs.
However, it offers a worst-case guarantee without extra memory, making it valuable in resource-bounded systems. - Security relevance: Because Heapsort’s time bound is deterministic O(nlogn), it avoids the adversarial O(n2) patterns that can plague naive quicksort, useful in untrusted-input contexts.
Key Equations & Numbers to Memorize
- Worst-case running time: T(n)=nlogn+O(n)=Θ(nlogn).
- Extra memory: S(n)=O(1) (just a few index variables).
- Bottom-up heap construction cost: h=0∑⌊logn⌋2h+1n⋅h=O(n).
- Heapsort vs. quadratic sorts: limn→∞n2nlogn=0 ⇒ asymptotically faster.
Implementation Gotchas & Tips
- When using 0-based arrays in C/Java/Python, children of index i are 2i+1 and 2i+2; parent is ⌊(i−1)/2⌋.
- Always exchange root with last element in heap, not necessarily with array index n; the heap size shrinks each iteration.
- For identical keys, Heapsort’s lack of stability may or may not matter; wrap (key,value) pairs to preserve order if needed.
Summary
- Heapsort is an in-place, comparison-based sort with deterministic Θ(nlogn) runtime and O(1) extra space.
- Achieves this by: (1) linear-time bottom-up heap construction, (2) n successive delete-max operations, (3) exploiting the input array both as heap storage and as the final sorted container.
- While theoretically elegant and worst-case optimal, real-world performance can lag behind quicksort/merge sort because of cache effects and constant factors.