Search and Sort Algorithms

Introduction

  • Searching is a fundamental computer science problem with applications in various fields, including databases and AI.
  • Donald Knuth's "The Art of Computer Programming" dedicates an entire volume to search and sorting algorithms.
  • Search can involve looking up data in a data structure or searching for a plan in a problem space.

Search in Lists

  • Implementation of search in lists using Haskell.
  • Requires the Eq type class for equality checks.
  • Equality tests can be expensive, but we assume constant time for simplicity.
  • The complexity is measured in terms of comparison steps.
  • Assumes that the entire data structure is already in memory.
elem :: Eq a => a -> [a] -> Bool
elem _ [] = False
elem x (y:ys) = x == y || elem x ys

Complexity Analysis of List Search

  • Best Case: The element is found at the head of the list.
    • Constant time complexity: O(1)
  • Worst Case: The element is not in the list or is at the end.
    • Linear time complexity: O(n)
  • Average Case: The element is somewhere in the middle of the list.
    • Linear time complexity: O(n)

Search in Binary Search Trees

  • Uses the Ord type class for comparisons (less than and equals).
data Tree a = Leaf | Node a (Tree a) (Tree a) deriving (Eq, Ord, Show)

elem' :: Ord a => a -> Tree a -> Bool
elem' _ Leaf = False
elem' x (Node a left right)
  | x < a     = elem' x left
  | x > a     = elem' x right
  | otherwise = True

Complexity Analysis of Binary Search Tree Search

  • Best Case: The element is found at the root node.
    • Constant time complexity: O(1)
  • Assumptions:
    • The tree is correctly ordered.
    • The tree is balanced.
  • Worst Case: The element is at the deepest level of the tree.
    • Logarithmic time complexity: O(log n), where n is the number of nodes.
  • Logarithms:
    • If b^c = a, then log_b(a) = c
    • Height of a balanced binary search tree is approximately log of the size of the tree.
  • Balanced Tree Properties:
    • A fully balanced tree of height h has 2^h - 1 elements.

Comparison

  • Binary search trees offer a significant improvement over lists for searching, especially for large datasets.
  • For 20,000,000 items, the number of steps is dramatically reduced using a binary search tree compared to a list.

Sorting

Introduction to Sorting

  • Sorting is the process of arranging items in a specific order.
  • Linear arrangement analogous to arranging books on library shelves.
  • Michael's experience of arranging books on a trolley.

Insertion Sort

  • Simple and obvious algorithm.
  • Analogy to shelving a single book in a library.

Inserting into a Sorted List

insert :: Ord a => a -> [a] -> [a]
insert x [] = [x]
insert x (y:ys)
  | x <= y    = x : y : ys
  | otherwise = y : insert x ys

Complexity Analysis of Insertion into Sorted List

  • Best Case: The element is inserted at the beginning.
    • Constant time complexity: O(1)
  • Worst Case: The element is inserted at the end.
    • Linear time complexity: O(n)

Insertion Sort Implementation

  • Uses foldr to repeatedly insert elements into a sorted list.
isort :: Ord a => [a] -> [a]
isort = foldr insert []

Complexity Analysis of Insertion Sort

  • Best Case: The list is already sorted.
    • Linear time complexity: O(n)
  • Worst Case: The list is in reverse order.
    • Quadratic time complexity: O(n^2)