Amortized Analysis Notes

Amortized Analysis

Material from Chapter 11 of DSAA.
Splay trees.
Data structure of storing disjoint sets.
Incomplete.
Linear time suffix tree construction.
Chapter 17 of CLRS 2nd Edition.
Accounting method.
Aggregate method.
List update problem based on Erik Demaine’s lectures as an example of the potential function method.

Analyzing Calls to a Data Structure

All algorithms involve repeated calls to one or more data structures.
When analyzing the running time of an algorithm, one needs to sum up the time spent in all the calls to the data structure.
Problem: If different calls take different times, how can we accurately calculate the total time over the request sequence of calls?

Example - Max-Heap Recall

A max-heap is a nearly complete binary tree (i.e., all levels except the deepest level are completely filled, and the last level is filled from the left) satisfying the heap property: if B is a child of a node A, then key[A] >= key[B].

Adding an Element to a Heap

An element can be added to the heap as follows:

Add the element on the bottom level of the heap.
Compare the added element with its parent; if they are in the correct order, stop.
If not, swap the element with its parent and return to the previous step.

Constructing a Heap

Let us form a heap of n elements from scratch.

First idea:
- Use n times add to form the heap.
- Each addition to the heap operates on a heap with at most n elements.
- Adding to a heap with n elements takes $O(log n)$ time
- Total time spent doing the n insertions is $O(n log n)$ time

Constructing a Heap (2)

Two questions arise:

Does our analysis overestimate the time?
The different insertions take different amounts of time, and many are on smaller heaps. (=> leads to amortized analysis)
Is this the optimal way to create a heap?
Perhaps simply adding n times is not the best way to form a heap.

Constructing a Heap (3)

Second idea:
- Place elements in an array, interpret as a binary tree.
- Look at subtrees at height h (measured from the lowest level).
- If these trees have been heapified, then subtrees at height h+1 can be heapified by sending their roots down.
- Initially, the trees at height 0 are all heapified.

Constructing a Heap (4)

Array of length n.
Number of nodes at height h is at most $floor(n/2^{h+1})$ .
Cost to heapify a tree at height h+1 if all subtrees have been heapified: $O(h)$ swaps.
Total cost:

Amortized Analysis

Purpose is to accurately compute the total time spent in executing a sequence of operations on a data structure.
Three different approaches:
- Aggregate method: Combinatorial like the heap analysis.
- Accounting method: Assign costs to each operation so that it is easy to sum them up while still ensuring that the result is accurate.
- Potential method: A sophisticated version of the accounting method.

Example: Augmented Stack S

Operations are:
- Push(S,x)
- Pop(S)
- Multipop(S,k) - pop the top k elements
Implement with either array or linked list
- time for Push is $O(1)$
- time for Pop is $O(1)$
- time for Multipop is $O(min(|S|,k))$

Example: k-Bit Counter A

Operation:
- increment(A) - add 1 (initially 0)
Implementation:
- k-element binary array
- use school ripple-carry algorithm

Aggregate Method

Show that a sequence of n operations takes $T(n)$ time
We can then say that the amortized cost per operation is $T(n)/n$
Makes no distinction between operation types

Augmented Stack: A Simple Worst Case Argument

In a sequence of n operations, the stack never holds more than n elements.
Thus, the cost of a multipop is $O(n)$
Therefore, the worst-case cost of any sequence of n operations is $O(n^2)$ .
But this is an over-estimate!

Aggregate Method for Augmented Stack

Key idea: total number of pops (or multipops) in the entire sequence of operations is at most the total number of pushes
Suppose that the maximum number of Push operations in the sequence is n.
So time for the entire sequence is $O(n)$ .
Amortized cost per operation: $O(n)/n = O(1)$ .

Aggregate Method for k-Bit Counter

Worst-case time for an increment is O(k). This occurs when all k bits are flipped
But in a sequence of n operations, not all of them will cause all k bits to flip:
- bit 0 flips with every increment
- bit 1 flips with every 2nd increment
- bit 2 flips with every 4th increment
- …
- bit k flips with every $2^{k-}$ th increment

Aggregate Method for k-Bit Counter

Total number of bit flips in n increment operations is
n + n/2 + n/4 + … + n/2^k < n(1/(1-1/2))= 2n
So the total cost of the sequence is O(n).
Amortized cost per operation is $O(n)/n = O(1)$ .

Accounting Method

Assign a cost, called the "amortized cost", to each operation.
Assignment must ensure that the sum of all the amortized costs in a sequence is at least the sum of all the actual costs.
Remember, we want an upper bound on the total cost of the sequence H thi t?

Accounting Method

For each operation in the sequence:
- if amortized cost > actual cost then store extra as a credit with an object in the data structure.
- if amortized cost < actual cost then use the stored credits to make up the difference.
Must have enough credit saved up to pay for any future underestimates.

Accounting Method vs. Aggregate Method

Aggregate method:
- first analyze entire sequence
- then calculate amortized cost per operation
Accounting method:
- first assign amortized cost per operation
- check that they are valid
- then compute cost of entire sequence of operations

Accounting Method for Augmented Stack

Assign these amortized costs:
- Push - 2
- Pop - 0
- Multipop - 0
For Push, the actual cost is 1. Store the extra 1 as a credit, associated with the pushed element.
Pay for each popped element (either from Pop or Multipop) using the associated credit.

Accounting Method for Augmented Stack

There is always enough credit to pay for each operation.
Each amortized cost is O(1)
So the cost of the entire sequence of n operations is O(n).

Accounting Method for k-Bit Counter

Assign the amortized cost for increment operation to be 2.
The actual cost is the number of bits flipped:
- a series of 1's are reset to 0
- then a 0 is set to 1
Idea: 1 is used to pay for flipping a 0 to 1. The extra 1 is stored with the bit to pay for the next change (when it

Accounting Method for k-Bit Counter

All changes from 1 to 0 are paid for with previously stored credit
Amortized time per operation is O(1)
total cost of sequence is O(n)

Conclusions

Amortized Analysis allows one to estimate the cost of a sequence of operations on data structures. The method is typically more accurate than worst-case analysis when the data structure is dynamically changing.

Self-organizing lists

List L of n elements.
The operation ACCESS(x) costs rankL(x) = distance of x from the head of L.
L can be reordered by transposing adjacent elements at a cost of 1.

Self-organizing lists

List L of n elements.
The operation ACCESS(x) costs rankL(x) = distance of x from the head of L.
L can be reordered by transposing adjacent elements at a cost of 1.

Example:

Accessing the element with key 14 costs 4.

Example:

Self-organizing lists- application of amortized analysis

List L of n elements
The operation ACCESS(x) costs rankL(x) = distance of x from the head of L.
L can be reordered by transposing adjacent elements at a cost of 1.
Example:
Transposing 3 and 50 costs 1.

Worst-case analysis of self- organizing lists

Let us think of S being generated by an adversary. An adversary always accesses the tail (nth) element of L. Then, for any on-line algorithm A, we have $C_A(S) = \Omega(|S| \cdot n)$

in the worst case. But what would our OPT do on S?

Average-case analysis of self- organizing lists

Suppose that element x is accessed with probability p(x). Then, we have, which is minimized when L is sorted in decreasing order with respect to p.
Heuristic: Keep a count of the number of times each element is accessed, and maintain L in order of decreasing count.

Application of amortized analysis

Can we use an algorithm to organize the lists that are as good as possible on all update sequences?

But we will say "We don’t know what opt will do."
We aim to design an algorithm for which the cost of the update on each sequence is guaranteed to be competitive with the best update for that sequence
We quantify the performance against an algorithm that is given the sequence first and told that it has to service it optimally.

On-line and off-line problems

Definition. A sequence S of operations is provided one at a time. For each operation, an on-line algorithm A must execute the operation immediately without any knowledge of future operations (e.g., Tetris).

An off-line algorithm may see the whole sequence S in advance.
Goal: Minimize the total cost CA(S).
The game of Tetris

On-line and off-line problems

An off-line algorithm may see the whole sequence S in advance. This algorithm is well-defined. On each input sequence, it explores all the possible reorganizations and their costs, and selects the best one to service the sequence. Cost of OPT: denoted it by COPT(S).

The game of Tetris

The move-to-front heuristic

Practice: Implementers discovered that the move-to-front (MTF) heuristic empirically yields good results.

IDEA: After accessing x, move x to the head of L using transposes: cost = $2 \cdot rankL(x)$ .
The MTF heuristic responds well to locality in the access sequence S.

Competitive analysis

How competitive is MTF in comparison to the OPT algorithm?
Definition. An on-line algorithm A is α-competitive if there exists a constant k such that for any sequence S of operations, $CA(S) \le \alpha \cdot C{OPT}(S) + k$ , where OPT is the optimal off-line algorithm (“God’s algorithm”).

MTF is O(1)-competitive

Theorem. MTF is 4-competitive for self- organizing lists.

MTF is O(1)-competitive

Theorem. MTF is 4-competitive for self- organizing lists.

How useful is this from a data structure point of view?
For each sequence S, the performance of this very simple idea is only 4 times worse than the list maintenance algorithm that aims to minimize the access cost into the list.
After this, we will take the same approach for BSTs Fix the request sequence S during the analysis

MTF is O(1)-competitive

Theorem. MTF is 4-competitive for self- organizing lists.

Proof. Let Li be MTF’s list after the ith access, and let Li* be OPT’s list after the ith access.

Let ci = MTF’s cost for the ith operation = $2 \cdot rank{Li–1}(x)$ if it accesses x; ci* = OPT’s cost for the ith operation = $rank{Li–1*}(x) + t_i$ , where ti is the number of transposes that OPT performs.

What happens on an access?

On access of element x what do the lists look like

MTF has an access cost of |A| + |B| + 1
OPT has an access cost of |A| + |C| + 1
What is the difference between |B| and |C|?
How can we analyze this?
Note that B and C are “inverted” Let us remember that OPT is reorganizing as per a fixed strategy which is the best for this sequence.
- $A \cup B x C \cup D$
- $A \cup C x B \cup D$
- $L_{i–1}$
- $L_{i–1*}$

Potential function

Define the potential function $\Phi:{Li} \rightarrow R$ by \Phi(Li)= 2 \cdot |{(x, y) : x Li y \text{ and } y Li* x}| = 2 \cdot # \text{inversions}.

Observe that $L_i^*$ is a well-defined list for the sequence S that MTF is compared against
We define the amortized cost using the potential function

Potential function – how different are the two lists

Define the potential function $\Phi:{Li} \rightarrow R$ by \Phi(Li)= 2 \cdot |{(x, y) : x Li y \text{ and } y Li* x}| = 2 \cdot # \text{inversions}.

Example.
* $Li \quad E C A D B$ * $Li^* \quad C A B D E$

Potential function

Define the potential function $\Phi:{Li} \rightarrow R$ by \Phi(Li)= 2 \cdot |{(x, y) : x Li y \text{ and } y Li* x}| = 2 \cdot # \text{inversions}.

Example.
* $Li \quad E C A D B$ * $Li^* \quad C A B D E$
* $\Phi(L_i)= 2 \cdot |{\dots}|$

Potential function

Define the potential function $\Phi:{Li} \rightarrow R$ by \Phi(Li)= 2 \cdot |{(x, y) : x Li y \text{ and } y Li* x}| = 2 \cdot # \text{inversions}.

Example.
* $Li \quad E C A D B$ * $Li^* \quad C A B D E$
* $\Phi(L_i)= 2 \cdot |{(E,C), \dots}|$

Potential function

Define the potential function $\Phi:{Li} \rightarrow R$ by \Phi(Li)= 2 \cdot |{(x, y) : x Li y \text{ and } y Li* x}| = 2 \cdot # \text{inversions}.

Example.
* $Li \quad E C A D B$ * $Li^* \quad C A B D E$
* $\Phi(L_i)= 2 \cdot |{(E,C), (E,A), \dots}|$

Potential function

Define the potential function $\Phi:{Li} \rightarrow R$ by \Phi(Li)= 2 \cdot |{(x, y) : x Li y \text{ and } y Li* x}| = 2 \cdot # \text{inversions}.

Example.
* $Li \quad E C A D B$ * $Li^* \quad C A B D E$
* $\Phi(L_i)= 2 \cdot |{(E,C), (E,A), (E,D), \dots}|$

Potential function

Define the potential function $\Phi:{Li} \rightarrow R$ by \Phi(Li)= 2 \cdot |{(x, y) : x Li y \text{ and } y Li* x}| = 2 \cdot # \text{inversions}.

Example.
* $Li \quad E C A D B$ * $Li^* \quad C A B D E$
* $\Phi(L_i)= 2 \cdot |{(E,C), (E,A), (E,D), (E,B), \dots}|$

Potential function

Define the potential function $\Phi:{Li} \rightarrow R$ by \Phi(Li)= 2 \cdot |{(x, y) : x Li y \text{ and } y Li* x}| = 2 \cdot # \text{inversions}.

Example.
* $Li \quad E C A D B$ * $Li^* \quad C A B D E$
* $\Phi(L_i)= 2 \cdot |{(E,C), (E,A), (E,D), (E,B), (D,B)}|$

Potential function

Define the potential function $\Phi:{Li} \rightarrow R$ by \Phi(Li) = 2 \cdot |{(x, y) : x Li y \text{ and } y Li* x}| = 2 \cdot # \text{inversions}. Example.
- $L_i \quad E C A D B$
- $L_i^* \quad C A B D E$
- $\Phi(L_i) = 2 \cdot |{(E,C), (E,A), (E,D), (E,B), (D,B)}| = 10 .$

Potential function

Define the potential function $\Phi:{Li} \rightarrow R$ by \Phi(Li) = 2 \cdot |{(x, y) : x Li y \text{ and } y Li* x}| = 2 \cdot # \text{inversions}.

Potential function

Define the potential function $\Phi:{Li} \rightarrow R$ by \Phi(Li) = 2 \cdot |{(x, y) : x Li y \text{ and } y Li* x}| = 2 \cdot # \text{inversions}. Note that
- $\Phi(L_i) \ge 0$ for i = 0, 1, …,,
- \Phi(L_0)  0 if MTF and OPT start with the same list.

Potential function

How much does $\Phi$ change from 1 transpose?
- A transpose creates/destroys 1 inversion.
- \Delta\Phi \pm 2
  - $A \cup B x C \cup D$
  - $A \cup C x B \cup D$
  - $L_{i–1}$
  - $L_{i–1*}$

What happens on an access?

Suppose that operation i accesses element x, and define

$A = {y \in L{i–1} : y L{i–1}x \text{ and } y L_{i–1*} x}$ ,
$B = {y \in L{i–1} : y L{i–1}x \text{ and } y L_{i–1*} x}$ ,
$C = {y \in L{i–1} : y L{i–1}x \text{ and } y L_{i–1*} x}$ ,
$D = {y \in L{i–1} : y L{i–1}x \text{ and } y L_{i–1*} x}$ .
- $A \cup B x C \cup D$
- $A \cup C x B \cup D$
- $L_{i–1}$
- $L_{i–1*}$

What happens on an access?

r = $rank_{Li–1}(x)$
r= $rank_{Li–1}(x)$
We have r = |A| + |B| + 1 and r*  |A| + |C| + 1.

What happens on an access?

We have r = |A| + |B| + 1 and r*  |A| + |C| + 1.
r = $rank_{Li–1}(x)$
r= $rank_{Li–1}(x)$
- When MTF moves x to the front, it creates |A| inversions and destroys |B| inversions.
- Each transpose by OPT creates $\le$ 1 inversion.
- Thus, we have $\Phi(Li) – \Phi(L{i–1}) \le 2(|A| – |B| + t_i)$ .

Amortized cost

$\hat{c}i = ci + \Phi(Li) – \Phi(L{i–1})$
The amortized cost for the i-th operation of MTF with respect to $\Phi$ is

Amortized cost

$\hat{c}i = ci + \Phi(Li) – \Phi(L{i–1}) \le 2r + 2(|A| – |B| + t_i)$
The amortized cost for the i-th operation of MTF with respect to $\Phi$ is

Amortized cost

$\hat{c}i = ci + \Phi(Li) – \Phi(L{i–1}) \le 2r + 2(|A| – |B| + t_i)$
= 2r + 2(|A| – (r – 1 – |A|) + ti)
The amortized cost for the ith operation of MTF with respect to $\Phi$ is (since r = |A| + |B| + 1)

Amortized cost

$\hat{c}i = ci + \Phi(Li) – \Phi(L{i–1}) \le 2r + 2(|A| – |B| + t_i)$
= 2r + 2(|A| – (r – 1 – |A|) + ti)
= 2r + 4|A| – 2r + 2 + 2ti
The amortized cost for the ith operation of MTF with respect to $\Phi$ is

Amortized cost

$\hat{c}i = ci + \Phi(Li) – \Phi(L{i–1}) \le 2r + 2(|A| – |B| + t_i)$
= 2r + 2(|A| – (r – 1 – |A|) + ti)
= 2r + 4|A| – 2r + 2 + 2ti
= 4|A| + 2 + 2ti
The amortized cost for the ith operation of MTF with respect to $\Phi$ is

Amortized cost

$\hat{c}i = ci + \Phi(Li) – \Phi(L{i–1}) \le 2r + 2(|A| – |B| + t_i)$
= 2r + 2(|A| – (r – 1 – |A|) + ti)
= 2r + 4|A| – 2r + 2 + 2ti
= 4|A| + 2 + 2ti
$\le 4(r* + t_i)$
The amortized cost for the ith operation of MTF with respect to $\Phi$ is (since r* = |A| + |C| + 1 $\ge$ |A| + 1)

Amortized cost

$\hat{c}i = ci + \Phi(Li) – \Phi(L{i–1}) \le 2r + 2(|A| – |B| + t_i)$
= 2r + 2(|A| – (r – 1 – |A|) + ti)
= 2r + 4|A| – 2r + 2 + 2ti
= 4|A| + 2 + 2ti
\le 4(r* + ti)  4ci*.
The amortized cost for the ith operation of MTF with respect to $\Phi$ is

Bounding the actual cost

*Thus, we have
$\sum{i=1}^{|S|} C{MTF} (S) = \sum{i=1}^{|S|} Ci$

In terms of the amortized cost

Thus, we have

$\sum{i=1}^{|S|} C{MTF} (S) = \sum{i=1}^{|S|} Ci = \sum{i=1}^{|S|} (\hat{c}i + \Phi(L{i-1}) - \Phi(Li))$

The potentials cancel out

Thus, we have

$\sum{i=1}^{|S|} C{MTF} (S) = \sum{i=1}^{|S|} Ci = \sum{i=1}^{|S|} (\hat{c}i + \Phi(L{i-1}) - \Phi(Li))$
$\sum{i=1}^{|S|} C{MTF} (S) \le \sum{i=1}^{|S|} 4 (r^*i + ti) = 4 \cdot C{OPT}(S)$

The grand finale

Thus, we have since $\Phi(L0) = 0$ and $\Phi(L{|S|}) \ge 0$ .
So all potentials cancel out, Except for first and last, and the sum of true Costs is at most the sum of amortized costs.

Addendum

If we count transpositions that move x toward the front as “free” (models splicing x in and out of L in constant time), then MTF is 2-competitive.