big o notation

algorithm efficiency

  • think about how to design efficient algorithms

  • need a universal standard for measuring algorithm efficiency

  • consider 2 pieces of code to compare

    • one runs very fast on a well-maintained computer

    • the other runs on an old, slow computer

standard measure

  • the relationship between the increase in the size of a problem and the increase in the running time is platform independent

  • no matter what platform it is run on, the same relationship will emerge

  • because the platform isn't important, this relationship is useful for defining algorithm efficiency

  • knowing the relationship is very useful for predicting how long an algorithm will take to run on a particular problem

big o notation

  • we use big o notation to describe this relationship

  • example:

    • 100ms on a laptop, 10ms on a supercomputer

  • we want a way to describe the rate with which the running time of the algorithm increases compared to the rate at which the size of the problem (n) increases

  • running time relative to problem size

  • big o is concerned with worst case time requirement

o(n) - linear time

  • running time increases in proportion to the rate at which the problem size increases

  • example: print all the values in a list

  • printing 10 items takes 10 operations, printing 100 items takes 100 operations, etc.
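
A minimal sketch of this in Python (the function name and list values are mine, for illustration); it returns the operation count so the linear growth is visible:

```python
def print_all(items):
    # one print per element: n elements -> n operations, so O(n)
    count = 0
    for item in items:
        print(item)
        count += 1
    return count
```

Doubling the list length doubles the returned count.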

o(n²) - quadratic time

  • running time increases in proportion to the square of the input size

  • example: bubble sort

  • input size of 2 is 4 operations, size 4 is 16 operations, size 8 is 64 operations
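
A sketch of bubble sort in Python (one common variant; the notes don't give code):

```python
def bubble_sort(items):
    """Sort a list in place; nested passes over the data give O(n^2) comparisons."""
    n = len(items)
    for i in range(n):
        # each pass bubbles the largest remaining value to the end
        for j in range(n - 1 - i):
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]
    return items
```

The nested loops over the data are what make this quadratic.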

o(1) - constant time

  • running time does not depend on the size of the input

  • example: print the first item in an array

o(log n) - logarithmic time

  • running time increases logarithmically with the size of the input

  • example: binary search

insertion in an unordered array - o(1)

  • the running time of insertion into an unordered array doesn't depend on the size of the array - we just stick the element on at the end

  • example:

    • time (T) = some constant time (K) which won't change —> T = K

    • K can depend on factors such as the speed of the computer, the amount of RAM, etc.

    • we don't care what K actually is - big o notation is only concerned with describing the relationship between the running time and the size of the problem

    • we just say the algorithm is o(1) - running time is unaffected by n
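
Using a Python list to stand in for the array (a sketch; Python's `list.append` is amortized constant time, which matches the argument here):

```python
def insert_unordered(arr, value):
    # stick the element on at the end: the cost doesn't depend on len(arr), so O(1)
    arr.append(value)
    return arr
```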

linear search

  • we have to search through all the elements in an array

  • on average we check half of them

  • so, T = K * (n/2)

  • because K is a constant, K/2 is still a constant (its value doesn't depend on n)

  • absorbing the 1/2 into the constant, T = K * n

  • this algorithm is o(n)
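
A sketch of linear search in Python (returning the index, or -1 if absent; the notes don't specify a return convention):

```python
def linear_search(arr, target):
    # in the worst case every element is checked, so T grows linearly with n: O(n)
    for i, value in enumerate(arr):
        if value == target:
            return i
    return -1
```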

binary search

  • already shown that for a binary search, iterations = log₂(size)

  • therefore T = K * log₂(n)

  • consider the identity:

    • log₂ X = 3.32 * log₁₀ X

  • incorporating the 3.32 into the K, we get T = 3.32K * log₁₀(n)

  • 3.32K is just a constant, which is irrelevant to big o notation

  • this algorithm is o(log n)
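
A sketch of binary search on a sorted list in Python (the notes don't give code; -1 signals "not found"):

```python
def binary_search(arr, target):
    # each iteration halves the search range, so iterations ~ log2(n): O(log n)
    low, high = 0, len(arr) - 1
    while low <= high:
        mid = (low + high) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            low = mid + 1   # discard the lower half
        else:
            high = mid - 1  # discard the upper half
    return -1
```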

a log relationship

  • each step halves the size, so the number of iterations needed to search through an array using a binary search is the number of times the size of the array can be halved

size = 2^iterations

  • the opposite of raising something to a power is to take its log: iterations = log₂(size)

  • number of steps required increases very slowly compared to increases in size - logarithmically as opposed to linearly

  • we express this log-type relationship between array size and number of steps required by saying that the complexity of binary search is o(log n)
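
The halving count can be checked directly; a small Python sketch (the function name is mine):

```python
def halvings(size):
    # count how many times size can be halved before reaching 1
    steps = 0
    while size > 1:
        size //= 2
        steps += 1
    return steps

# for a power-of-two size, steps == log2(size): size = 2^iterations
```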

operations in an ordered array

  • ordered arrays are handy because we can use binary search on them, and this is o(log n)

  • if we want to insert or delete we have to make space / remove a space

  • on average, we will have to move half of the items up or down —> T = K * (n/2)

  • therefore these operations are o(n)
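
A sketch of ordered insertion in Python (the linear scan for the insertion point is one option; even with binary search to find the spot, the shifting still makes the whole operation O(n)):

```python
def insert_ordered(arr, value):
    # find the insertion point, then shift the rest up by one: O(n) overall
    i = 0
    while i < len(arr) and arr[i] < value:
        i += 1
    arr.insert(i, value)  # list.insert moves the following items along
    return arr
```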

running times in big o notation

algorithm — running time

linear search — o(n)

binary search — o(log n)

insertion in unordered array — o(1)

insertion in ordered array — o(n)

deletion in unordered array — o(n)

deletion in ordered array — o(n)

expressing iterations in terms of n

  • we can look at a piece of code and derive a function f(n) which describes the number of loop steps in it
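
As an illustration (this exact code is not from the notes), a Python sketch whose loop structure gives f(n) = n² + n:

```python
def count_steps(n):
    # count the loop iterations so we can read off f(n)
    steps = 0
    for i in range(n):          # outer loop: n iterations
        for j in range(n):      # inner loop: n iterations each time -> n^2
            steps += 1
    for k in range(n):          # a second, separate loop: n more
        steps += 1
    return steps                # f(n) = n^2 + n, which is o(n^2)
```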

formalities

  • a formal mathematical definition of big o

  • a function f(n) = o(g(n)) if

    • a positive real number c and a positive integer n0 exist such that f(n) <= c*g(n) for all n >= n0

interpretation

  • describe how the size of a function f(n) (which describes the running time of a program) increases as n gets really huge

  • the biggest power of n will always dominate

  • accordingly, we pick this as the big o complexity g(n)

  • we don't care about constants

  • to justify that this pick is a good description of f(n), we show that f(n) is always bounded by the big o complexity g(n) (multiplied by some constant, which doesn't matter since we don't care about constants) as long as n is bigger than some value n0

  • in other words, to show g(n) provides a good description of f(n), we show that f(n) <= c*g(n) for all n >= n0

usage

  • always use the most concise formula for the o notation

  • we write: 3n²+2n+5 = o(n²)

how do we know the order of the function

  1. in order to figure out what the order of a function is, look at the highest power of n

  • if the highest order is an n² term, then the formula is o(n²)

  • always put g(n) equal to this power

    • g(n) = n²

  2. now choose c so that it equals the sum of all the coefficients in the function

  • if f(n) = 3n²+2n+5, then choose c to be 10 (3+2+5)

  • this makes it easy to show that 3n²+2n+5 <= 3n²+2n²+5n² = 10n²

  3. finally, figure out what value n0 needs to have in order to make the statement f(n) <= c*g(n) true (here n0 = 1 works, since 2n <= 2n² and 5 <= 5n² for all n >= 1)
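
The worked example can be checked numerically; a Python sketch:

```python
def f(n):
    # the running-time function from the example
    return 3 * n**2 + 2 * n + 5

def bound(n, c=10):
    # c * g(n), with g(n) = n^2 and c = 10 (the sum of the coefficients)
    return c * n**2

# f(n) <= 10 * n^2 holds for every n >= n0 = 1
assert all(f(n) <= bound(n) for n in range(1, 1000))
```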

formalities

  • the following identities hold for big o notation:

  1. o(k * f(n)) = o(f(n))

  • if an algorithm's complexity is multiplied by a constant, it still has the same big o notation

  2. o(f(n)) + o(g(n)) = o(f(n) + g(n))

  • if we run one algorithm after the other, the complexity is added

  • however, if algorithm 1 is o(n²) and algorithm 2 is o(n), then o(n² + n) can be more concisely described as o(n²)

  3. o(f(n)) * o(g(n)) = o(f(n) * g(n))

  • if algorithm 1 is o(n²) and algorithm 2 is o(n), and one algorithm is run inside the other as a loop, then the big o notation is o(n³)

getting big o of a program

  • when trying to determine the big o notation of a computer program, look at the loop structure

  • statements that are run the same number of times regardless of the size of the problem are just constants

  • all you're interested in is how increasing the size of n increases the number of iterations of the loops

  • increasing the size of n will only have an effect if there is a loop structure which depends on n

    • a single loop running n times indicates o(n)

    • a nested loop each running n times indicates o(n²)
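
A sketch contrasting the two cases (function names are mine); counting iterations instead of timing keeps the comparison platform independent, as the notes argue:

```python
def single_loop_ops(n):
    # one loop over n -> O(n)
    ops = 0
    for _ in range(n):
        ops += 1
    return ops

def nested_loop_ops(n):
    # a loop inside a loop, each over n -> O(n^2)
    ops = 0
    for _ in range(n):
        for _ in range(n):
            ops += 1
    return ops

# doubling n doubles the single-loop count but quadruples the nested count
```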