big o notation
algorithm efficiency
think about how to design efficient algorithms
need a universal standard for measuring algorithm efficiency
consider comparing 2 pieces of code
one is run very fast on a well maintained computer
another is run on an old, slow computer
standard measure
the relationship between the increase in the size of a problem and the increase in the running time is platform independent
no matter what platform it is run on, the same relationship will emerge
because the platform isn't important, this relationship is useful for defining algorithm efficiency
knowing the relationship is very useful for predicting how long an algorithm will take to run on a particular problem
big o notation
we use big o notation to describe this relationship
example:
100ms on a laptop, 10ms on a supercomputer - the absolute times differ, but the growth relationship is the same
we want a way to describe the rate with which the running time of the algorithm increases compared to the rate at which the size of the problem (n) increases
running time relative to problem size
big o is concerned with the worst-case time requirement
o(n) - linear time
running time increases proportional to the rate at which the problem size increases
example: print all the values in a list
printing 10 times will be 10 operations etc
o(n²) - quadratic time
running time increases proportional to the square of the input size
example: bubble sort
input size of 2 is 4 operations, size 4 is 16 operations, size 8 is 64 operations
o(1) - constant time
running time does not depend on the size of the input
example: print the first item in an array
o(log n) - logarithmic time
running time increases logarithmically to the rate at which the size increases
example: binary search
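a quick python sketch of each class (illustrative functions written for these notes, not taken from any library):

```python
def print_all(items):
    # o(n): one print per element - work grows linearly with the list
    for x in items:
        print(x)

def bubble_sort(items):
    # o(n²): nested passes over the list - roughly n * n comparisons
    a = list(items)
    for i in range(len(a)):
        for j in range(len(a) - 1 - i):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
    return a

def first_item(items):
    # o(1): one step regardless of how long the list is
    return items[0]

def binary_search(items, target):
    # o(log n): each step halves the remaining search range
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if items[mid] == target:
            return mid
        elif items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1
```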
insertion in an unordered array - o(1)
the running time of the insertion into an unordered array doesn't depend on the size of the array - we just stick the element on at the end
example:
time (T) = some constant time (K) which won't change —> T = K
K can depend on factors such as the speed of the computer, the amount of RAM etc.
we don't care what K actually is - big o notation is only concerned with describing the relationship between the running time and the size of the problem
we just say the algorithm is o(1) - running time is unaffected by n
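a minimal sketch of this, using a python list to stand in for the array (and ignoring any occasional reallocation a real dynamic array might do):

```python
def insert_unordered(arr, value):
    # unordered array: just stick the element on the end - o(1),
    # the same single step whatever the length of arr
    arr.append(value)

nums = [4, 9, 2]
insert_unordered(nums, 7)
# nums is now [4, 9, 2, 7]
```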
linear search
we have to search through all the elements in an array
on average we check half of them
so, T = K * (n/2) = (K/2) * n
because K is a constant, K/2 is still a constant (its value doesn't depend on n)
so we can just rename that constant and write T = K * n
this algorithm is o(n)
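a sketch that counts comparisons to make the T = K * n relationship concrete (the counter is purely for illustration):

```python
def linear_search(items, target):
    # scan every element until we find the target; count the checks
    checks = 0
    for i, x in enumerate(items):
        checks += 1
        if x == target:
            return i, checks
    return -1, checks

# worst case (target absent): we check all n elements,
# so doubling n doubles the number of checks
_, c1 = linear_search(list(range(100)), -1)
_, c2 = linear_search(list(range(200)), -1)
# c2 == 2 * c1
```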
binary search
already shown that for a binary search, iterations = log₂(size)
therefore T = K * log₂(n)
consider the identity:
log₂(X) = 3.32 * log₁₀(X)
incorporating the 3.32 into the K, we get T = 3.32K * log₁₀(n)
3.32K is just a constant, which is irrelevant to big o notation
this algorithm is o(log n)
a log relationship
each step halves the size, so the number of iterations needed to search through an array using a binary search is the number of times the size of the array can be halved
size = 2^iterations
the opposite of raising something to a power is to take its log, so iterations = log₂(size)
number of steps required increases very slowly compared to increases in size - logarithmically as opposed to linearly
we express this log-type relationship between array size and number of steps required by saying that the complexity of binary search is o(log n)
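counting the halvings directly shows the relationship (a small illustrative helper, exact for power-of-two sizes):

```python
import math

def halvings(size):
    # number of times size can be halved before reaching 1 -
    # i.e. the number of binary search steps on an array of that size
    count = 0
    while size > 1:
        size //= 2
        count += 1
    return count

# for a power-of-two size this equals log2(size):
# halvings(1024) == 10, and math.log2(1024) == 10.0
```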
operations in an ordered array
ordered arrays are handy because we can use binary search on them, and this is o(log n)
if we want to insert or delete we have to make space / remove a space
on average, we will have to move half of the items up or down —> T = K * (n/2)
therefore these operations are o(n)
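a sketch of ordered insertion, shifting elements up to make space (a python list stands in for the array here):

```python
def insert_ordered(arr, value):
    # walk back from the end, shifting every larger element up one
    # slot, then drop the value into the gap - on average about n/2
    # moves, so the operation is o(n)
    arr.append(None)              # make room at the end
    i = len(arr) - 2
    while i >= 0 and arr[i] > value:
        arr[i + 1] = arr[i]       # shift up
        i -= 1
    arr[i + 1] = value

nums = [1, 3, 5, 7]
insert_ordered(nums, 4)
# nums is now [1, 3, 4, 5, 7]
```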
running times in big o notation
algorithm — running time
linear search — o(n)
binary search — o(log n)
insertion in unordered array — o(1)
insertion in ordered array — o(n)
deletion in unordered array — o(n)
deletion in ordered array — o(n)
expressing iterations in terms of n
we can look at a piece of code and derive a function f(n) which describes the number of loop steps in it
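for example, for this made-up snippet we can count the loop steps explicitly:

```python
def count_steps(n):
    # hypothetical program whose only work is its loops
    steps = 0
    for i in range(n):          # outer loop: n iterations
        for j in range(n):      # inner loop: n iterations each -> n² steps
            steps += 1
    for k in range(n):          # a second, separate loop -> n more steps
        steps += 1
    return steps                # f(n) = n² + n, so the code is o(n²)

# count_steps(10) == 110   (100 + 10)
```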
formalities
a formal mathematical definition of big o
a function f(n) = o(g(n)) if
a positive real number c and positive integer n0 exist such that f(n) <= c*g(n) for all n>= n0
interpretation
describe how the size of a function f(n) (which describes the running time of a program) increases as n gets really huge
the biggest power of n will always dominate
accordingly, we pick this as the big o complexity g(n)
we don't care about constants
to justify that this pick is a good description of f(n), we show that f(n) is always bounded by the big o complexity g(n) (multiplied by some constant, which doesn't matter as we don't care about constants) as long as n is bigger than some value n0
in other words, to show g(n) provides a good description of f(n), we show that f(n) <= c*g(n) for all n >= n0
usage
always use the most concise formula for the o notation
we write: 3n²+2n+5 = o(n²)
how do we know the order of the function
to figure out the order of a function, look at the highest power of n
if the highest order is an n² term, then the formula is o(n²)
always put g(n) equal to this power
g(n) = n²
now choose c to be the sum of all the coefficients in the function
if f(n) = 3n²+2n+5, then choose c to be 10 (3+2+5)
this makes it easy to show that 3n²+2n+5 <= 3n²+2n²+5n² = 10n² for n >= 1
finally, figure out what value n0 needs to have in order to make the statement f(n) <= c*g(n) true (here n0 = 1 works)
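the bound can be checked numerically for f(n) = 3n²+2n+5 with c = 10 and g(n) = n²:

```python
def f(n):
    # the running-time function from the worked example
    return 3 * n**2 + 2 * n + 5

def bound(n):
    # c * g(n) with c = 10 (sum of the coefficients) and g(n) = n²
    return 10 * n**2

# at n0 = 1 the two sides are equal (f(1) == bound(1) == 10),
# and f(n) <= bound(n) holds for every n >= 1 after that
```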
identities
the following identities hold for big o notation:
o(k * f(n)) = o(f(n))
if an algorithm's running time is multiplied by a constant k, it still has the same big o complexity
o(f(n)) + o(g(n)) = o(f(n) + g(n))
if we run one algorithm after the other, the complexity is added
however, if algorithm 1 is o(n²) and algorithm 2 is o(n), then o(n² + n) can be more concisely described as o(n²)
o(f(n)) o(g(n)) = o(f(n) * g(n))
if algorithm 1 is o(n²) and algorithm 2 is o(n), and one algorithm is run inside the other as a loop, then the overall complexity is o(n² * n) = o(n³)
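a sketch of the multiplication rule, composing two made-up routines (counters included only to make the step counts visible):

```python
def inner(n):
    # an o(n) routine: n steps
    return sum(1 for _ in range(n))

def outer(n):
    # an o(n²) double loop that calls the o(n) routine on every
    # iteration -> n² * n = n³ steps, i.e. o(n³) overall
    steps = 0
    for i in range(n):
        for j in range(n):
            steps += inner(n)
    return steps

# outer(5) == 125   (5² iterations, each doing 5 steps)
```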
getting big o of a program
when trying to determine the big o notation of a computer program, look at the loop structure
statements that are run the same number of times regardless of the size of the problem are just constants
all you're interested in is how increasing the size of n increases the number of iterations of the loops
increasing the size of n will only have an effect if there is a loop structure which depends on n
a single loop running n times indicates o(n)
two nested loops, each running n times, indicate o(n²)
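these two shapes can be verified by counting iterations (hypothetical skeleton programs, stripped down to just their loop structure):

```python
def single_loop(n):
    # one loop whose trip count depends on n -> o(n)
    ops = 0
    for i in range(n):
        ops += 1
    return ops

def nested_loop(n):
    # two nested loops, each running n times -> o(n²)
    ops = 0
    for i in range(n):
        for j in range(n):
            ops += 1
    return ops

# doubling n doubles single_loop's count but quadruples nested_loop's:
# single_loop(8) == 8, nested_loop(8) == 64, nested_loop(16) == 256
```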