Study Notes for Chapter 2: Algorithms from Introduction to Algorithms

Introduction to Algorithms

  • Introduction to the foundational textbook Introduction to Algorithms authored by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein.

  • Notable points:

    • Over 1 million copies sold worldwide.

    • Fourth edition published.

    • Contributor: Manijeh Keshtgari, School of Computing, UGA.

Chapter 2 Overview

Goals of the Chapter

  • Start using frameworks for describing and analyzing algorithms.

  • Examine two algorithms for sorting: insertion sort and merge sort.

  • Learn to describe algorithms in pseudocode.

  • Begin employing asymptotic notation for running-time analysis.

  • Understand the technique of divide and conquer in the context of merge sort.

Algorithms Defined

  • Algorithm: A step-by-step computational procedure characterized by:

    • Defined input and output.

    • Solves well-specified problems in a finite amount of time.

    • Can be implemented through code, written procedures, or hardware circuits.

    • Expected to halt and yield correct results on all valid inputs.

Importance of Algorithms

  • Essential for solving complex problems efficiently.

  • More crucial than hardware speed for long-term performance gains.

  • Efficient algorithms can outperform brute-force methods even on slower machines.

  • Example comparison:

    • Merge sort with time complexity O(n ext{ log } n)

    • Insertion sort with time complexity O(n^2)

The Sorting Problem

Definition

  • Input: Sequence 1, a2, au, a_n> of numbers.

  • Output: A permutation <a'1, a'2, au, a'n> such that a'1 < a'2 < au < a'n.

Example of Sorting
  • Input: 8 2 4 9 3 6

  • Output: 2 3 4 6 8 9

Insertion Sort

Description

  • A simple sorting algorithm suited for small number of elements.

  • Analogy: Sorting a hand of playing cards:

    • Start with an empty left hand; cards face down.

    • Remove one card at a time from the table and insert it into the correct position in the left hand.

    • To find the correct position for a card, compare it with cards already in hand from right to left.

    • The left hand always contains sorted cards.

Example of Insertion Sort
  • Input: 8 2 4 9 3 6

  • Process:

    • Insert 2, modifying the sequence towards the sorted form step-by-step:

    • Initial list 8 2 4 9 3 6

    • After inserting 2: 2 8 4 9 3 6

    • Continuing until sorted: 2 3 4 6 8 9

Pseudocode for Insertion Sort
INSERTION-SORT (A, n)
1 for i = 2 to n
2     key = A[i]
3     // Insert A[i] into the sorted subarray A[1: i − 1].
4     j = i - 1
5     while j > 0 and A[j] > key
6           A[j + 1] = A[j]
7           j = j - 1
8     A[j + 1] = key

Loop Invariants and Correctness

  • Loop Invariant: Conditions that hold true at the beginning of each iteration regarding the algorithm's correctness.

  • Three properties:

    • Initialization: True prior to first iteration.

    • Maintenance: Remains true at the end of each iteration if true at the start.

    • Termination: Contributes to demonstrating the algorithm’s correctness upon completion.

  • Using loop invariants is analogous to mathematical induction.

Principle of Induction

  • Mathematical Induction: To show that a property P(n) is true for all positive integers n:

    • Basis Step: Show that P(1) is true.

    • Inductive Step: Show that P(k)
      ightarrow P(k + 1) for all positive integers k by assuming the hypothesis holds for an arbitrary integer k and proving for k + 1.

Example of Induction
  • Prove: extstyle orall n, rac{n(n+1)}{2} is true.

    • Basis Step: If n=1, then true since 1(1+1)/2 = 1.

    • Inductive Step: Assuming true for k.

Analyzing Running Time of Insertion Sort

  • Based on input size n and operations performed.

  • Utilizes the RAM model (Random-Access Machine).

  • Types of Analysis:

    • Best-case.

    • Worst-case (most commonly analyzed).

    • Average-case.

  • Focus is on the order of growth, particularly the leading term.

  • Uses Theta-notation to express tight asymptotic bounds.

Best Case Analysis
  • Occurs when the array is already sorted; thus, the while loop does not run.

Worst Case Analysis
  • Occurs when the array is sorted in reverse, necessitating all comparisons in the while loop.

Worst Case Running Time Expression

  • T(n) = c1 n^2 + c2 n + c_4 (n - 1), where breakdown shows contributions of operations:

    • Analyzed as a quadratic function of n.

    • Can express as T(n) = a n^2 + b n + c for constants a, b, c.

Designing Algorithms

Incremental Approach

  • Insertion sort initiates with one element (considered sorted) and inserts subsequent elements correctly.

Divide and Conquer Approach

  • Divide: Break the problem into smaller subproblems of the same type.

  • Conquer: Solve the subproblems recursively. For smaller sizes, solve directly.

  • Combine: Merge solutions of subproblems into the overall solution.

Example: Merge Sort Algorithm
  • Merge Sort Process:

    • Divide: Split the n-element sequence into two halves.

    • Conquer: Recursively sort both subsequences using merge sort.

    • Combine: Merge sorted halves to produce the final sorted sequence.

    • Merge sort is typically faster than insertion sort for larger input sizes.

Merge Sort Pseudocode
MERGE-SORT (A, p, r)
1 if p ≥ r
2     return
3 q =     extstylerac{(p+r)}{2}
4         MERGE-SORT (A, p, q)     // Sort left half
5         MERGE-SORT (A, q + 1, r) // Sort right half
6         MERGE (A, p, q, r)       // Merge sorted halves

Analysis of Merge Sort

  • Dividing Step: Computes the midpoint; takes constant time ( ext{Θ}(1)).

  • Conquering Step: Recursively solves two subproblems each of size rac{n}{2} contributing 2Tig( rac{n}{2}ig).

  • Combining Step: Merge process takes ext{Θ}(n).

Recurrence Relation

  • For merge sort:

    • Base case time: c_1 (const for size 1)

    • Recursion for larger sizes: T(n) = 2Tig( rac{n}{2}ig) + ext{Θ}(n).

Recurrence Tree Analysis

  • For the main problem, the cost is split across levels:

    • Tree expansion until sizes reach 1.

    • Total costs dictate the overall running time, typically leading to the conclusion of ext{Θ}(n ext{ log } n).

Total Cost Summary

  • Total cost derived through level costs:

    • Costs across levels culminate to c2 n ext{ log } n + c1 n.

    • Ignore lower-order terms leading to a complexity of O(n ext{ log } n).

  • The entirety of these algorithms emphasizes efficiency, correctness, and the importance of the right choice of algorithm based on input size and nature, solidifying the importance of algorithm design in computer science.