Complexity Analysis

Week 11: Complexity

This week will cover more advanced topics on complexity, focusing on time complexity and its mathematical underpinnings. Assignment 2 is due next week.

Time Complexity

Time complexity is the focus, particularly the mathematics of measuring it. While space complexity and Kolmogorov complexity exist, they are not the primary focus.

Empirical Testing

Last week involved testing queues, comparing the performance of a list-based and a two-list queue implementation. The list-based implementation performed poorly as queue size increased, while the two-list version scaled linearly: doubling the input size doubled the computation time.

It's important to test different behaviors to determine the worst-case performance. Handling the worst-case scenario is crucial in computer engineering. Testing reveals the presence of bugs but doesn't guarantee their absence.

Functional Correctness vs. Performance

Functional correctness means the function returns the correct results. A queue implemented with lists can be functionally correct but have poor time complexity, which can be considered a bug. Space complexity refers to the amount of memory consumed. The space complexity of the list and double-list queues was similar, but the time complexity varied significantly.
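As a sketch of the two designs (the names here are illustrative, not the course's actual code), the list-based queue pays for every enqueue, while the two-list queue makes enqueue cheap and only occasionally pays to reverse the back list:

```haskell
-- Hypothetical sketch of the two queue designs discussed above.

-- Naive: a single list, enqueuing at the end. Functionally correct,
-- but each enqueue rebuilds the whole list: O(n) per call.
type ListQueue a = [a]

enqueueL :: a -> ListQueue a -> ListQueue a
enqueueL x q = q ++ [x]          -- O(n): copies every existing cell

dequeueL :: ListQueue a -> Maybe (a, ListQueue a)
dequeueL []       = Nothing
dequeueL (x : xs) = Just (x, xs) -- O(1)

-- Two-list version: a front list for dequeuing, a back list for enqueuing.
data Queue a = Queue [a] [a]

enqueue :: a -> Queue a -> Queue a
enqueue x (Queue front back) = Queue front (x : back)  -- O(1)

dequeue :: Queue a -> Maybe (a, Queue a)
dequeue (Queue [] [])        = Nothing
dequeue (Queue [] back)      = dequeue (Queue (reverse back) [])  -- occasional O(n)
dequeue (Queue (x : f) back) = Just (x, Queue f back)             -- usually O(1)
```

Both versions return the same results, which is the point: they are equally correct, but their time complexities differ.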

Complexity refers to the cost, either in time or space. Time is the duration of computation, and space is the memory consumed.

Limitations of Testing

Testing is limited because it may not cover all important cases and is specific to the computer being used. The focus should be on the fundamentals of the algorithm, not the system's overall time complexity.

The goal is to abstract away from hardware specifics and GHC's translation to machine code.

Testing on real computer systems is challenging due to the uncertain environment (virtual machines, shared resources).

Timing code is tricky, so the focus shifts to analyzing the algorithm at the Haskell level and developing a language to describe code performance.

Algorithmic Time Complexity

Algorithmic time complexity is measured by the number of steps required to execute an algorithm for an input of size n. The goal is to determine the worst-case complexity.

Consider a function that takes 2n^2 + 4n + 13 steps for an input of size n. As n gets larger, the quadratic term dominates the cost. Therefore, the quadratic part is the most important aspect of the complexity.

Big O Notation

Big O notation describes the order of a function. If f(n) = 2n^2 + 4n + 13, then f is of order n^2, written as O(n^2).

Formally, f is in O(g) if there exist positive integers m and d such that for all n \geq m, f(n) \leq d g(n).

For f(n) = 2n^2 + 4n + 13, we can choose g(n) = n^2. We need to find d and m such that 2n^2 + 4n + 13 \leq d n^2 for all n \geq m.

Taking d = 100 and m = 1, we have

2n^2 + 4n + 13 \leq 100n^2 for all n \geq 1.

Thus, f is in O(n^2).
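The chosen constants can be sanity-checked by evaluating both sides of the inequality for a range of n (a quick illustration, not a proof, since it only covers finitely many values):

```haskell
-- f is the step-count polynomial from the example; g is d * n^2 with d = 100.
f, g :: Integer -> Integer
f n = 2 * n ^ 2 + 4 * n + 13
g n = 100 * n ^ 2

-- Check f(n) <= 100 n^2 for n = 1 .. 1000 (m = 1).
boundHolds :: Bool
boundHolds = all (\n -> f n <= g n) [1 .. 1000]
```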

f is also in O(n^3), since any O(n^2) function is bounded by n^3 as well, but the tighter bound is more informative: you want the slowest-growing function that bounds f.

Time Complexity and Space Complexity

Time complexity is described in terms of a polynomial function, indicating the number of steps for an input size n. Space complexity is similarly described but focuses on the amount of memory consumed.

After deriving these formulas, Big O notation is used for abstraction. For polynomials, only the highest degree term is significant.

Common Time Complexities
  • O(1): Constant time

  • O(\log n): Logarithmic time

  • O(n): Linear time

  • O(n \log n): Linearithmic time

  • O(n^2): Quadratic time

  • O(2^n): Exponential time

Logarithmic time is highly efficient, often seen in binary search trees and database searches, where not all data is examined. Constant time, denoted as O(1), is ideal for atomic operations like cons in Haskell or integer addition.

Exponential time, or O(2^n), is the worst-case scenario. An example of an exponential time algorithm is the naive recursive implementation of the Fibonacci sequence.

Graphs illustrate the relationships between these complexities.

Importance of Order Information

Order information is crucial for understanding the behavior of algorithms as input sizes grow. Quadratic time algorithms are dangerous for large inputs, while linear time is generally desirable.

It's important to communicate the time and space complexity of your code to users.

Worst-Case Scenarios

Always consider the worst-case behavior of your code. The naive queue implementation is O(n^2) in the worst case, which can be problematic for large queues.

Example: Fibonacci Sequence

A naive recursive implementation of the Fibonacci sequence exhibits exponential time complexity.

fib :: Int -> Int
fib n
  | n <= 0    = 1
  | n == 1    = 1
  | otherwise = fib (n - 1) + fib (n - 2)

Each call to fib n results in two more calls to fib, leading to an exponential explosion of computations.
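For contrast, the same function can be computed in linear time by carrying the two previous values along as accumulators (a standard rewrite, shown here as a sketch rather than part of the lecture code):

```haskell
-- Linear-time Fibonacci: each value is computed once, so O(n) steps.
-- Agrees with the naive version above (fib 0 = 1, fib 1 = 1, ...).
fibLinear :: Int -> Int
fibLinear n = go n 1 1
  where
    go k a b
      | k <= 0    = a
      | otherwise = go (k - 1) b (a + b)
```

The exponential blowup in the naive version comes from recomputing the same subproblems; the accumulator version never revisits one.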

List Append

The ++ operator (list append) has a time complexity of O(n), where n is the length of the first list. Appending to a list involves creating n new cons cells.
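The standard Prelude definition of ++ makes this cost visible: it walks the first list, allocating one new cons cell per element, and never touches the second list. A paraphrase (renamed append to avoid clashing with the Prelude):

```haskell
-- Paraphrase of the Prelude's (++): linear in the first argument.
append :: [a] -> [a] -> [a]
append []       ys = ys
append (x : xs) ys = x : append xs ys  -- one new cons cell per element of xs
```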

Enqueue Operation

Repeated calls add up: each enqueue on the list-based queue appends to the end of the list, so enqueuing 100 elements calls append a hundred times, each time on a longer list than the last.
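The total cost of those repeated appends can be summed directly: the i-th enqueue copies the i - 1 elements already in the list, so n enqueues copy 0 + 1 + ... + (n - 1) = n(n - 1)/2 cells in total, which is O(n^2). A small check of that arithmetic:

```haskell
-- Total cells copied by n successive single-element appends to a growing
-- list: 0 + 1 + ... + (n - 1) = n (n - 1) / 2, i.e. O(n^2) overall.
copiedCells :: Integer -> Integer
copiedCells n = sum [0 .. n - 1]
```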