Fundamental Algorithms – Order of Growth

Learning Objectives & Overview

  • Understand why analysing algorithm performance matters when processing complex problems or large datasets.

  • Be able to:

    • Define and interpret “order of growth” for algorithms.

    • Express time-complexity using Big-O notation.

    • Recognise, derive and compare common growth-rate classes.

    • Analyse best-, average- and worst-case complexities for classic search/sort algorithms.

    • Apply complexity knowledge to algorithm choice, optimisation and exam questions.

The Need for Order of Growth

  • Even slow algorithms finish quickly on very small inputs; the real concern is performance as input size n becomes large.

  • Running different algorithms on the same device can lead to markedly different resource consumption (time, memory).

  • Order-of-growth analysis allows:

    • Prediction of scalability.

    • Fair comparison of competing solutions on identical hardware.

    • Identification of bottlenecks before deployment.

Order of Growth & Complexity Measures

  • A rough measure of resource consumption as n → ∞.

  • Two primary resources:

    • Time complexity – number of primitive operations / comparisons / steps.

    • Space complexity – memory required (excluded from the syllabus here).

  • We generally analyse worst-case unless otherwise stated; it provides an upper bound for all inputs.

Big-O Notation

  • Mathematical convention: O(f(n)) is the set of functions that grow no faster (within a constant factor) than f(n) for sufficiently large n.

  • Ignore constants and lower-order terms; only the highest-order term dictates growth rate.

  • Example interpretations:

    • O(1) constant time (independent of n).

    • O(n) linear growth (proportional to n).

    • O(n log₂ n) linear-log (log-linear).

    • O(n²) quadratic.

    • O(2ⁿ) exponential.

Common Growth-Rate Functions

  • Ordered from slowest growth to fastest:

    • Constant O(1)

    • Logarithmic O(log₂ n)

    • Linear O(n)

    • Linear-log O(n log₂ n)

    • Quadratic O(n²)

    • Polynomial O(nᵏ) (k > 2)

    • Exponential O(2ⁿ)

Numerical Illustration (selected values)

  • Table excerpt (operations required vs nn):

    • n = 10: log₂ n ≈ 3.3, n log₂ n ≈ 33.2, n² = 100, 2ⁿ = 1024.

    • n = 50: log₂ n ≈ 5.6, n log₂ n ≈ 282.2, n² = 2500, 2ⁿ ≈ 1.13×10¹⁵.

  • Key takeaway: exponential functions explode rapidly; algorithms in that class are infeasible for large nn.
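
These table values can be reproduced in a few lines of Python (a sketch using only the standard math module):

```python
import math

# Reproduce the illustrative table rows for the selected input sizes.
for n in (10, 50):
    print(f"n={n}: log2(n)={math.log2(n):.1f}, "
          f"n*log2(n)={n * math.log2(n):.1f}, "
          f"n^2={n ** 2}, 2^n={2 ** n:.3g}")
```

Running this confirms how 2ⁿ dwarfs every polynomial term well before n = 50.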

Worked Exercise – Counting Comparisons in a While Loop

word = input("Enter a word: ")
posn = 0
while posn < len(word):
    print(word[posn])
    posn += 1
  • Each iteration performs one comparison (posn < len(word)); the loop ends with one final comparison that evaluates to false.

  • For input of length n: total comparisons = n + 1 ⇒ O(n).

  • Specific cases:

    • "magic" (5 chars): 6 comparisons.

    • "hi there" (8 chars): 9 comparisons.
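
The hand count can be checked by instrumenting the loop guard; this sketch (a hypothetical count_comparisons helper, not part of the original exercise) counts the same test explicitly:

```python
def count_comparisons(word):
    """Count loop-guard comparisons made by the while loop above."""
    posn = 0
    comparisons = 0
    while True:
        comparisons += 1               # the test: posn < len(word)
        if not (posn < len(word)):
            break                      # the final comparison that fails
        posn += 1
    return comparisons

# n characters => n + 1 comparisons (the last one fails and exits the loop)
```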

Example: is_prime Function

FOR i = 2 TO n-1
    IF n MOD i = 0 THEN RETURN FALSE
RETURN TRUE
  • Worst case: n is prime ⇒ the loop executes n - 2 iterations.

  • If each iteration costs a constant C, total T = (n - 2)·C.

  • Discarding constants: T = O(n) (linear).
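
A runnable Python version of the pseudocode might look like this (an illustrative sketch; trial division up to n - 1, matching the loop bounds above):

```python
def is_prime(n):
    """Trial division: worst case (n prime) runs n - 2 iterations => O(n)."""
    if n < 2:
        return False
    for i in range(2, n):        # i = 2 .. n-1
        if n % i == 0:
            return False
    return True
```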

Linear Search Complexity

  • Best case: first element matches ⇒ 1 comparison ⇒ O(1).

  • Worst case: item at last position or absent ⇒ n comparisons ⇒ O(n).
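
A minimal linear search in Python, for reference (names are illustrative):

```python
def linear_search(items, target):
    """Return the index of target, or -1 if absent.
    Best case O(1) (first element), worst case O(n) (last or missing)."""
    for i, value in enumerate(items):
        if value == target:
            return i
    return -1
```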

Binary Search Complexity

  • Works on sorted list by halving the search interval.

  • Best case: target equals middle element ⇒ O(1).

  • Worst case derivation:

    • Array length after k halvings: n/2ᵏ = 1 ⇒ k = log₂ n.

    • Comparisons per level: 1.

    • Total = log₂ n ⇒ O(log₂ n).

  • Note: the comparison count uses the base-2 logarithm; log₁₀ n gives a different value, although any base yields the same O-class (bases differ only by a constant factor).
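
The halving strategy can be sketched in Python (an illustrative implementation; names are assumptions, not from the original notes):

```python
def binary_search(sorted_items, target):
    """Halve the interval each step: at most about log2(n) + 1 probes."""
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return mid
        elif sorted_items[mid] < target:
            low = mid + 1          # discard the lower half
        else:
            high = mid - 1         # discard the upper half
    return -1
```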

Bubble Sort Complexity

  • Worst case (reverse order):

    • Passes: N - 1, each with N - 1 comparisons (a simplified count).

    • Total comparisons = (N - 1)² = N² - 2N + 1 ⇒ O(N²).

  • Best case (already sorted, with an early-exit check):

    • Single pass, N - 1 comparisons ⇒ O(N).
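
A Python sketch with the early-exit flag that gives the O(N) best case (illustrative; this version also shrinks the inner loop each pass, a common refinement over the fixed (N - 1)-comparison count used above):

```python
def bubble_sort(items):
    """In-place bubble sort; early exit gives O(N) on sorted input."""
    n = len(items)
    for p in range(n - 1):                 # at most N - 1 passes
        swapped = False
        for i in range(n - 1 - p):         # largest item bubbles to the end
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
        if not swapped:                    # no swaps => already sorted
            break
    return items
```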

Insertion Sort Complexity

  • Worst case (descending input when ascending desired):

    • Comparisons + swaps: 1 + 2 + … + (N - 1) = N(N - 1)/2 ⇒ O(N²).

  • Best case (already sorted):

    • Only N - 1 comparisons ⇒ O(N).
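
A Python sketch of insertion sort (illustrative; it shifts larger elements right rather than swapping repeatedly):

```python
def insertion_sort(items):
    """In-place insertion sort; only N - 1 comparisons on sorted input."""
    for j in range(1, len(items)):
        key = items[j]
        i = j - 1
        while i >= 0 and items[i] > key:   # shift larger elements right
            items[i + 1] = items[i]
            i -= 1
        items[i + 1] = key                 # drop key into its slot
    return items
```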

Quick Sort Complexity

  • Worst case (pivot always min or max):

    • Comparisons: n + (n - 1) + … + 1 = n(n + 1)/2 ⇒ O(n²).

  • Best / average case (pivot = median or chosen randomly):

    • Levels: log₂ n.

    • Comparisons per level: n.

    • Total: n log₂ n ⇒ O(n log₂ n).

  • Pivot selection notes:

    • True median yields optimal split but costs linear time to find.

    • A random pivot greatly reduces the probability of the worst case; expected O(n log₂ n).

    • For nearly-sorted data, insertion sort may outperform quicksort (best case O(n) vs O(n log n)).
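
A random-pivot quicksort can be sketched as below (illustrative; this list-building version favours clarity over the in-place partitioning usually used in practice):

```python
import random

def quicksort(items):
    """Random-pivot quicksort: expected O(n log n), worst case O(n^2)."""
    if len(items) <= 1:
        return items
    pivot = random.choice(items)            # random pivot avoids the sorted-input worst case
    less    = [x for x in items if x < pivot]
    equal   = [x for x in items if x == pivot]
    greater = [x for x in items if x > pivot]
    return quicksort(less) + equal + quicksort(greater)
```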

Merge Sort Complexity

  • Divide list until size 11, then merge.

  • Recurrence: T(n) = 2T(n/2) + cn.

  • Solving: T(n) = cn log₂ n + n ⇒ O(n log₂ n).

  • Characteristic features:

    • Same complexity for best, average, worst cases because splitting always proceeds.

    • Performance insensitive to initial order.

    • Not in-place: requires extra memory for merging.
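
The divide-and-merge process can be sketched in Python (illustrative; the <= in the merge step is what keeps the sort stable):

```python
def merge_sort(items):
    """Split to size 1, then merge; O(n log n) in every case, not in place."""
    if len(items) <= 1:
        return items
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:            # <= preserves stability
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]   # append whichever half remains
```

Note the extra lists built during merging: this is the additional memory cost listed above.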

Advantages & Disadvantages of Sorting Algorithms

  • Bubble Sort

    • + Simple; good for testing whether a list is already sorted (best case O(N)).

    • - Very slow on large or random data (O(N²)); rarely used in practice.

  • Insertion Sort

    • + Simple, efficient for very small or nearly-sorted datasets, stable.

    • - O(N²) worst case; beaten by more advanced algorithms on large data.

  • Quick Sort

    • + Fastest general-purpose in-place sort; low memory use; average O(n log n).

    • - O(n²) worst case; unstable; not ideal for tiny arrays.

  • Merge Sort

    • + Guaranteed O(n log n); stable; optimal worst case.

    • - Needs extra space; copying overhead may hurt performance.

Choosing a Sorting Method – Practical Criteria

  • Number of records.

  • Record length (long records may favour methods that move pointers rather than data).

  • Availability of main memory (internal vs external sort).

  • Existing order: nearly-sorted arrays benefit most from insertion or bubble sort variants.

Summary Table

  Algorithm        Best           Worst
  Bubble Sort      O(n)           O(n²)
  Insertion Sort   O(n)           O(n²)
  Quick Sort       O(n log₂ n)    O(n²)
  Merge Sort       O(n log₂ n)    O(n log₂ n)

Additional qualitative analysis:

  • Bubble: best for almost sorted.

  • Insertion: generally preferable to Bubble, but still a special-case choice.

  • Quick: fastest on large random data if stability not required.

  • Merge: best theoretical guarantees but uses extra memory.

Worksheet / Exam-Style Problems

1. Recursive Fibonacci
  • Code:

IF n <= 1 RETURN n
ELSE RETURN fib(n-1)+fib(n-2)
  • (a) Time complexity: O(φⁿ), where φ ≈ 1.618 (the golden ratio); commonly bounded as O(2ⁿ) (exponential).

  • (b) Explanation: each call spawns two further calls except at the base cases; the call tree resembles a binary tree of height n ⇒ roughly 2ⁿ calls in total.

  • (c) Optimisation: memoisation / dynamic programming stores intermediate Fibonacci numbers, reducing complexity to O(n) time and O(n) space (or O(1) space with an iterative solution).
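
The memoised and iterative fixes can be sketched as follows (illustrative; using functools.lru_cache for the memoisation):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    """Memoised Fibonacci: each value computed once => O(n) time, O(n) space."""
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)

def fib_iter(n):
    """Iterative Fibonacci: O(n) time, O(1) space."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a
```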

2. Searching 1,000,000 Sorted Emails
  • (i) Binary search preferred: log₂ 10⁶ ≈ 20 comparisons vs up to 10⁶ for linear search.

  • (ii) If new emails are appended unsorted at the end, the list becomes partially ordered. Options:

    • Maintain full sort then use binary search (requires periodic re-sort).

    • Keep as hybrid: binary search on sorted prefix, linear search remaining tail.

    • If disorder grows, revert to full linear search or re-sort.

3. Practical Timing Task (A-Level P2)
  • Implement functions:

    • task2_2 – insertion sort.

    • task2_3 – quicksort.

  • Use Python timeit to measure on three text files.

  • Expected empirical orders of growth:

    • Insertion sort runtime ∝ N² (doubling the input roughly quadruples the time).

    • Quicksort runtime ∝ N log N (doubling the input slightly more than doubles the time).

  • Discuss anomalies: quicksort can be slower on tiny or nearly-sorted files, where insertion sort shines.
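
A sketch of the timing setup, substituting random lists for the three text files (the time_sort helper, the sizes, and the use of the built-in sorted as a stand-in for task2_2/task2_3 are all assumptions, not the actual task code):

```python
import random
import timeit

def time_sort(sort_fn, n, repeats=3):
    """Time sort_fn on a fresh random list of n integers (hypothetical helper)."""
    data = random.sample(range(n * 10), n)
    return timeit.timeit(lambda: sort_fn(data.copy()), number=repeats)

# Compare one sort at two sizes: for an N log N sort, doubling n
# should slightly more than double the measured time.
t1 = time_sort(sorted, 1000)
t2 = time_sort(sorted, 2000)
print(f"n=1000: {t1:.5f}s   n=2000: {t2:.5f}s")
```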


These bullet-point notes encapsulate all definitions, derivations, numeric examples, algorithm analyses, comparisons, and worksheet prompts required to replace the original 18-page transcript.