Search and Sort Algorithms

Introduction
  • Searching is a fundamental computer science problem with applications in various fields, including databases and AI.
  • Donald Knuth's "The Art of Computer Programming" dedicates an entire volume to search and sorting algorithms.
  • Search can involve looking up data in a data structure or searching for a plan in a problem space.
Search in Lists
  • Implementation of search in lists using Haskell.
  • Requires the Eq type class for equality checks.
  • Equality tests can be expensive, but we assume constant time for simplicity.
  • The complexity is measured in terms of comparison steps.
  • Assumes that the entire data structure is already in memory.
elem :: Eq a => a -> [a] -> Bool
elem _ [] = False
elem x (y:ys) = x == y || elem x ys
Complexity Analysis of List Search
  • Best Case: The element is found at the head of the list.
    • Constant time complexity: O(1)O(1)
  • Worst Case: The element is not in the list or is at the end.
    • Linear time complexity: O(n)O(n)
  • Average Case: The element is somewhere in the middle of the list.
    • Linear time complexity: O(n)O(n)
Search in Binary Search Trees
  • Uses the Ord type class for comparisons (less than and equals).
data Tree a = Leaf | Node a (Tree a) (Tree a) deriving (Eq, Ord, Show)

elem' :: Ord a => a -> Tree a -> Bool
elem' _ Leaf = False
elem' x (Node a left right)
  | x < a     = elem' x left
  | x > a     = elem' x right
  | otherwise = True
Complexity Analysis of Binary Search Tree Search
  • Best Case: The element is found at the root node.
    • Constant time complexity: O(1)O(1)
  • Assumptions:
    • The tree is correctly ordered.
    • The tree is balanced.
  • Worst Case: The element is at the deepest level of the tree.
    • Logarithmic time complexity: O(logn)O(log n), where nn is the number of nodes.
  • Logarithms:
    • If bc=ab^c = a, then logb(a)=clog_b(a) = c
    • Height of a balanced binary search tree is approximately loglog of the size of the tree.
  • Balanced Tree Properties:
    • A fully balanced tree of height hh has 2h12^h - 1 elements.
Comparison
  • Binary search trees offer a significant improvement over lists for searching, especially for large datasets.
  • For 20,000,000 items, the number of steps is dramatically reduced using a binary search tree compared to a list.

Sorting

Introduction to Sorting
  • Sorting is the process of arranging items in a specific order.
  • Linear arrangement analogous to arranging books on library shelves.
  • Michael's experience of arranging books on a trolley.
Insertion Sort
  • Simple and obvious algorithm.
  • Analogy to shelving a single book in a library.
Inserting into a Sorted List
insert :: Ord a => a -> [a] -> [a]
insert x [] = [x]
insert x (y:ys)
  | x <= y    = x : y : ys
  | otherwise = y : insert x ys
Complexity Analysis of Insertion into Sorted List
  • Best Case: The element is inserted at the beginning.
    • Constant time complexity: O(1)O(1)
  • Worst Case: The element is inserted at the end.
    • Linear time complexity: O(n)O(n)
Insertion Sort Implementation
  • Uses foldr to repeatedly insert elements into a sorted list.
isort :: Ord a => [a] -> [a]
isort = foldr insert []
Complexity Analysis of Insertion Sort
  • Best Case: The list is already sorted.
    • Linear time complexity: O(n)O(n)
  • Worst Case: The list is in reverse order.
    • Quadratic time complexity: O(n2)O(n^2)