Lecture Notes on Laufzeitanalyse (Runtime Analysis)

Administrative Reminders

  • Mandatory Registration for Exercises:

    • Registration for exercise groups is mandatory in Moodle. Registration in Tukan is not sufficient.
    • Failure to register in Moodle results in the inability to submit homework, earn points, and potentially pass the course.
    • The original deadline was April 25th, but some students have still not registered.
    • Do not change exercise groups in Tukan, as it overrides the Moodle selection, potentially unregistering you.
    • Verify Moodle registration to ensure you are in an exercise group.
  • Homework Deadline:

    • The deadline for the first theoretical homework is May 9th at 23:59.
    • Late submissions will not be accepted.
  • Prioritization of Inquiries:

    • Due to numerous inquiries regarding exercise group registration, the focus will now be on content-related support.
    • Requests for exercise group enrollment after Wednesday, May 7th, may not be addressed until after the first homework deadline.

Laufzeitanalyse (Runtime Analysis)

  • Motivation:

    • Quantify the number of steps an algorithm takes as a function of its input.
    • The input size (n) measures the size or magnitude of the input.
    • Focus on how the algorithm scales with increasing input size.
  • Worst-Case Laufzeit (Worst-Case Runtime):

    • Typically, the worst-case scenario is considered, where the input requires the longest processing time.
    • The worst-case runtime gives a guarantee that holds for every possible input.
    • It represents the maximum number of steps for a given input size (n).
  • Laufzeit von Insertion Sort (Runtime of Insertion Sort):

    • Each line of code is assigned a cost $c_i$.
    • The goal is to determine how often each line of code is executed.
    • The outer loop executes $n-1$ times (or $n$ times if the final, terminating condition check is counted).
  • Inner Loop Analysis (Lines 4-6):

    • The worst-case scenario occurs when each value must be moved to the beginning of the sorted portion.
    • The inner loop (lines 5 and 6) can execute a maximum of $\frac{n(n-1)}{2}$ times in total.
    • Line 4 executes $n-1$ additional times due to the loop condition check.
  • Total Runtime Calculation:

    • $T(n) = c_1 \cdot n + (c_2 + c_3 + c_7) \cdot (n-1) + c_4 \cdot \left(\frac{n(n-1)}{2} + n - 1\right) + (c_5 + c_6) \cdot \frac{n(n-1)}{2}$
    • Simplification involves summing up similar terms based on the number of times they are called.
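To make the counting concrete, here is a minimal Python sketch (function and variable names are my own) that counts the inner-loop executions of insertion sort, using the unit-cost model from the notes:

```python
def insertion_sort_steps(a):
    """Insertion sort that additionally counts inner-loop executions."""
    a = list(a)
    inner_steps = 0
    for j in range(1, len(a)):          # outer loop: runs n-1 times
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:    # inner loop: at most n(n-1)/2 times total
            a[i + 1] = a[i]             # shift the larger element one slot right
            i -= 1
            inner_steps += 1
        a[i + 1] = key
    return a, inner_steps

# Worst case: reverse-sorted input forces the maximum n(n-1)/2 shifts.
n = 10
assert insertion_sort_steps(range(n, 0, -1))[1] == n * (n - 1) // 2   # 45
# Best case: already sorted input needs no shifts at all.
assert insertion_sort_steps(range(n))[1] == 0
```

The assertions illustrate the two extremes used later in the notes: quadratic in the worst case, linear (no shifts) when the input is already sorted.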
  • Berechnungsmodell (Computational Model):

    • A standard CPU with Random Access Memory (RAM) is assumed.
    • All elementary operations (lines of code) are assumed to take approximately the same time (cost of 1).
  • Dominante Terme (Dominant Terms):

    • The term that depends quadratically on n is the most relevant for large n.
    • Linear terms become negligible compared to quadratic terms as n increases.
  • Vereinfachung (Simplification):

    • The constant factor is also ignored, making the runtime approximately $n^2$.
    • This approximation is independent of the specific hardware and focuses on the scaling behavior.
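The dominance of the quadratic term can be checked numerically. A small sketch with invented coefficients:

```python
# Illustrative runtime with a quadratic and a linear part (coefficients invented).
def T(n):
    return 3 * n * n + 50 * n + 100

# The quadratic term dominates: T(n) / n^2 approaches the constant factor 3,
# so for large n the lower-order terms contribute essentially nothing.
for n in (10, 100, 10_000):
    print(n, T(n) / n**2)
```

As $n$ grows, the printed ratio tends to 3, i.e., $T(n)$ scales like $n^2$ up to a constant factor.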

Landau-Symbole / O-Notation (Landau Symbols / Big O Notation)

  • Definition:

    • Used to classify the complexity of functions.
    • Complexity refers to how the runtime scales with input size (n).
  • Theta Notation (Θ(g(n))):

    • A function $f(n)$ is in $\Theta(g(n))$ if it is bounded both from above and below by a multiple of $g(n)$ for sufficiently large $n$.
    • Formally: there exist positive constants $c_1$, $c_2$, and $n_0$ such that $c_1 \cdot g(n) \le f(n) \le c_2 \cdot g(n)$ for all $n \ge n_0$.
    • Intuitively: $f(n)$ and $g(n)$ grow at the same rate.
  • Notation:

    • Technically, $f(n) \in \Theta(g(n))$, indicating membership in the set $\Theta(g(n))$.
    • Often written as $f(n) = \Theta(g(n))$ in computer science literature, which means the same as $f(n) \in \Theta(g(n))$.
  • Graphical Interpretation:

    • For large $n$ (greater than some threshold $n_0$), $f(n)$ always lies between $c_1 \cdot g(n)$ and $c_2 \cdot g(n)$.
  • Asymptotically Tight Bound:

    • $\Theta(g(n))$ provides a tight bound on the growth rate of $f(n)$, both from above and below.
  • Example: Proving Insertion Sort is Θ(n²):

    • Given $T(n) = 4n + 4(n-1) + 3 \cdot \frac{n(n-1)}{2}$ (a specific runtime),
    • Claim: $T(n) \in \Theta(n^2)$.
  • Lower Bound:

    • Choose $n_0 = 2$ and $c_1 = \frac{3}{2}$.
    • Simplify $T(n)$ (e.g., $T(n) \ge 5(n-1) + \frac{3}{2}(n^2 - n)$).
    • Show that $T(n) \ge \frac{3}{2} n^2$ for all $n \ge 2$.
  • Upper Bound:

    • Choose $n_0 = 2$ and $c_2 = 7$.
    • Simplify $T(n)$ (e.g., $T(n) \le 5n + 3 \cdot \frac{n^2}{2}$ …)
    • Show that $T(n) \le 7 n^2$ for all $n \ge 2$.
  • Conclusion:

    • Since $T(n)$ is bounded both above and below by a constant multiple of $n^2$ for $n \ge 2$, $T(n) \in \Theta(n^2)$.
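The claimed constants can be sanity-checked numerically. This sketch checks $\frac{3}{2} n^2 \le T(n) \le 7 n^2$ for a range of $n$; of course, checking finitely many values illustrates the bound but does not replace the algebraic proof:

```python
def T(n):
    # The specific runtime from the example: T(n) = 4n + 4(n-1) + 3*n(n-1)/2
    return 4 * n + 4 * (n - 1) + 3 * n * (n - 1) // 2

# Check the witnesses c1 = 3/2, c2 = 7, n0 = 2 on a range of inputs.
for n in range(2, 1000):
    assert 1.5 * n * n <= T(n) <= 7 * n * n
```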

Fragen und Antworten (Questions and Answers)

  • How to find $c_1$ and $c_2$?

    • One strategy for $c_2$ (the upper bound) is to overestimate each term and pick a $c_2$ that covers the result.
    • For $c_1$ (the lower bound), guess a candidate, then verify that the inequality actually holds for all $n \ge n_0$.
  • Why upper and lower bounds?

    • An upper bound like $O(n^2)$ tells us how fast the runtime can grow at most; a matching lower bound, i.e., $\Theta(n^2)$, pins down the exact growth rate, which is what lets us compare the efficiency of algorithms.

O-Notation (Big O Notation)

  • Definition

    • Written with a capital ("big") O: $O(g)$.
    • Represents an asymptotic upper bound.
    • The set $O(g)$ is the set of all functions $f$ for which there exist positive constants $c$ and $n_0$ such that $f(n) \le c \cdot g(n)$ for all $n \ge n_0$.
    • The actual runtime may grow strictly slower than $g$; Big O only bounds it from above.
    • Unlike $\Theta$, only one constant ($c$) has to be found, since there is no lower bound.
  • Big O graphic interpretation

    • The same as for $\Theta$, but only one bounding line is drawn above $f$ for values greater than $n_0$.
  • O Notation importance

    • If a function is in $\Theta(g)$, then it is automatically in $O(g)$.
    • The converse is not true.
    • $\Theta(g)$ is a proper subset of (not equal to) $O(g)$.
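A small numeric illustration of the definition (the function $f$ and the witnesses $c$, $n_0$ are my own invented example):

```python
# Hypothetical example: f(n) = 3n + 10 is in O(n).
# Witnesses: c = 13, n0 = 1, since 3n + 10 <= 13n for all n >= 1.
def f(n):
    return 3 * n + 10

c, n0 = 13, 1
assert all(f(n) <= c * n for n in range(n0, 10_000))

# f is also in O(n^2) -- a Big O bound need not be tight ...
assert all(f(n) <= c * n * n for n in range(1, 10_000))
# ... but f is NOT in Theta(n^2): f(n)/n^2 tends to 0, so no positive
# lower-bound constant c1 with c1*n^2 <= f(n) can exist for large n.
assert f(10_000) / 10_000**2 < 0.001
```

This is exactly the asymmetry from the notes: $\Theta(g)$ membership implies $O(g)$ membership, but not the other way around.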

Rules for the Big O Notation

  • If a function $f$ is just a constant $a$, with $a$ positive and real, then $f$ is automatically in $O(1)$. This expresses that the runtime does not grow asymptotically.
  • If $f \in O(g)$, then the scalar multiple $a \cdot f$ is also in $O(g)$ for any constant $a > 0$: the factor $a$ can be absorbed into the constant of the bounding line.
  • Sum rule: if one part of an algorithm has runtime $f_1 \in O(g_1)$ and another part, run in sequence, has $f_2 \in O(g_2)$, then the total runtime $f_1 + f_2$ is in $O(\max(g_1, g_2))$, where $\max$ denotes the pointwise maximum of the two functions.
    • Example: if $f_1 \in O(n)$ and $f_2 \in O(n^2)$, then $f_1 + f_2 \in O(n^2)$.
  • Product rule: if a piece of code with runtime $f_2 \in O(g_2)$ is executed $f_1 \in O(g_1)$ times (e.g., a loop body inside a loop), then the total runtime $f_1 \cdot f_2$ is in $O(g_1 \cdot g_2)$.
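The sum and product rules can be seen directly in code. A minimal Python sketch (function names are my own):

```python
# Sum rule: two phases in sequence -> O(max(g1, g2)).
def phase_linear(xs):            # O(n): one pass over the data
    total = 0
    for x in xs:
        total += x
    return total

def phase_quadratic(xs):         # O(n^2): all ordered pairs
    count = 0
    for x in xs:
        for y in xs:
            if x < y:
                count += 1
    return count

def combined(xs):
    # O(n) + O(n^2) = O(max(n, n^2)) = O(n^2)
    return phase_linear(xs), phase_quadratic(xs)

# Product rule: an O(n) body executed n times -> O(n * n) = O(n^2).
def nested(xs):
    return [phase_linear(xs) for _ in xs]
```

For example, `combined([1, 2, 3])` returns `(6, 3)`: the linear phase sums the list, the quadratic phase counts the 3 increasing pairs, and the whole call is dominated by the quadratic part.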

Questions about the Big O Notation

  • Can Big O be arbitrarily generous? Technically yes: any $f \in O(n)$ is also in $O(n^2)$. By convention, however, one states the smallest (slowest-growing) function $g$ for which $f \in O(g)$.
  • The reason is that algorithm designers aim for efficient algorithms, and only a tight bound makes the statement informative.

Ω-Notation (Big Omega)

  • This is the analog for a lower asymptotic bound (German: untere Schranke).
  • A function $f$ is in $\Omega(g)$ if there exist constants $c > 0$ and $n_0$ such that $f(n) \ge c \cdot g(n)$ for all $n \ge n_0$.
  • Big O defines an upper bound, while Big Omega defines a lower one.
  • It must be noted: $\Omega(g)$ does NOT state the time required to run a particular function!
  • It simply means that the runtime grows asymptotically at least as fast as the function $g$.
  • Visually, the line $c \cdot g(n)$ stays below the function being measured as $n$ grows, after crossing it at most once (at $n_0$).
  • If $f \in \Theta(g)$, then $f$ is also in $\Omega(g)$: $f$ grows no slower than $g$.

Calculation Rules in Omega Notation

  • The calculation rules that hold for Big O carry over unchanged to $\Omega$-notation.

Important fact.

  • If $g$ bounds $f$ from above, i.e., $f \in O(g)$, then automatically $f$ bounds $g$ from below, i.e., $g \in \Omega(f)$.
  • Lower bounds may be loosened: if $f \in \Omega(n)$, then also $f \in \Omega(1)$.

Important consideration with Big O/Omega Notation.

  • The set membership matters, and the notation $f = O(g)$ is read from left to right (not right to left): it abbreviates $f \in O(g)$ and is not a symmetric equation.
  • So $f$ must be a member of the set $O(g)$; only then can it be compared with $g$.
  • When replacing a function by a bound in an estimate, the direction of the estimate matters: $O$ rounds up, $\Omega$ rounds down.
  • For example, the worst-case runtime of insertion sort is quadratic, $\Theta(n^2)$.
  • When the input is already correctly sorted, however, insertion sort asymptotically scales only as $O(n)$.

General Considerations

  • $\Theta(1)$: constant runtime, e.g., a single array access. The runtime does not depend on the input size at all.
  • $\Theta(\log n)$: e.g., binary search, which repeatedly splits the search interval at the midpoint. Very fast.
  • $\Theta(n)$: linear, e.g., going through all data/elements with a single loop.
  • $\Theta(n^2)$: e.g., matrix addition, which adds two elements at each of the $n^2$ positions using two nested loops.
  • $\Theta(n \log n)$: only slightly above linear and still fast; typical for good comparison-based array sorting.
  • There are also much larger classes, such as exponential ($2^n$) or factorial ($n!$) runtimes.
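The $\Theta(\log n)$ entry above can be illustrated with a standard binary search sketch (not taken from the lecture's code; names are my own). Each iteration halves the remaining interval, so at most about $\log_2 n$ iterations are needed:

```python
def binary_search(sorted_xs, target):
    """Return an index of target in sorted_xs, or -1 if absent. O(log n)."""
    lo, hi = 0, len(sorted_xs) - 1
    while lo <= hi:
        mid = (lo + hi) // 2        # split the interval at the midpoint
        if sorted_xs[mid] == target:
            return mid
        if sorted_xs[mid] < target:
            lo = mid + 1            # target can only be in the right half
        else:
            hi = mid - 1            # target can only be in the left half
    return -1

xs = list(range(0, 100, 2))         # binary search requires sorted input
assert binary_search(xs, 42) == 21
assert binary_search(xs, 43) == -1
```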

Little-o and Little-Omega (Strict Variants of Big O / Big Omega)

  • There are also strict variants of these notations; they use similar but strictly smaller (or larger) bounding functions, and the definitions remain valid.
  • Little-o: $f \in o(g)$ if for every constant $c > 0$ you can find an $n_0$ such that $f(n) < c \cdot g(n)$ for all $n \ge n_0$, i.e., $f$ grows strictly slower than $g$.
  • Little-omega is the mirror image: for every constant $c > 0$, eventually $f(n) > c \cdot g(n)$. These variants are not used very often, but you must know them!

Important points to the runtime considerations / Questions

  • Bubble sort: iterate over the indexes from 0 to length minus one; for each pass, compare adjacent elements and swap (vertauschen) them if they are out of order; keep making passes until the list is sorted.
  • The algorithm is based on two nested loops, so it is simply quadratic, $\Theta(n^2)$.
  • In this basic form, even the best case is quadratic: the algorithm always loops through everything, regardless of the input.
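A minimal Python sketch of this basic variant (names are my own), counting comparisons to show the best case is just as expensive as the worst:

```python
def bubble_sort(a):
    """Basic bubble sort without early exit: always Theta(n^2) comparisons."""
    a = list(a)
    n = len(a)
    comparisons = 0
    for i in range(n - 1):              # outer loop: n-1 passes
        for j in range(n - 1 - i):      # inner loop: compare neighbours
            comparisons += 1
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]   # swap (vertauschen)
    return a, comparisons

# Even already-sorted input costs the full n(n-1)/2 comparisons,
# because this variant has no check that would let it stop early.
assert bubble_sort([1, 2, 3, 4, 5]) == ([1, 2, 3, 4, 5], 10)
assert bubble_sort([5, 1, 4, 2, 3])[0] == [1, 2, 3, 4, 5]
```

(A common optimization adds a "did anything get swapped?" flag to exit after a clean pass, which makes the best case linear; the variant described in these notes does not do that.)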