Amortized Analysis Notes
Amortized Analysis
Material from Chapter 11 of DSAA.
Splay trees.
Data structure of storing disjoint sets.
Incomplete.
Linear time suffix tree construction.
Chapter 17 of CLRS 2nd Edition.
Accounting method.
Aggregate method.
List update problem based on Erik Demaine’s lectures as an example of the potential function method.
Analyzing Calls to a Data Structure
All algorithms involve repeated calls to one or more data structures.
When analyzing the running time of an algorithm, one needs to sum up the time spent in all the calls to the data structure.
Problem: If different calls take different times, how can we accurately calculate the total time over the request sequence of calls?
Example - Max-Heap Recall
A max-heap is a nearly complete binary tree (i.e., all levels except the deepest level are completely filled, and the last level is filled from the left) satisfying the heap property: if B is a child of a node A, then key[A] >= key[B].
Adding an Element to a Heap
An element can be added to the heap as follows:
Add the element on the bottom level of the heap.
Compare the added element with its parent; if they are in the correct order, stop.
If not, swap the element with its parent and return to the previous step.
Constructing a Heap
Let us form a heap of n elements from scratch.
First idea:
Use n times add to form the heap.
Each addition to the heap operates on a heap with at most n elements.
Adding to a heap with n elements takes time
Total time spent doing the n insertions is time
Constructing a Heap (2)
Two questions arise:
Does our analysis overestimate the time?
The different insertions take different amounts of time, and many are on smaller heaps. (=> leads to amortized analysis)
Is this the optimal way to create a heap?
Perhaps simply adding n times is not the best way to form a heap.
Constructing a Heap (3)
Second idea:
Place elements in an array, interpret as a binary tree.
Look at subtrees at height h (measured from the lowest level).
If these trees have been heapified, then subtrees at height h+1 can be heapified by sending their roots down.
Initially, the trees at height 0 are all heapified.
Constructing a Heap (4)
Array of length n.
Number of nodes at height h is at most .
Cost to heapify a tree at height h+1 if all subtrees have been heapified: swaps.
Total cost:
Amortized Analysis
Purpose is to accurately compute the total time spent in executing a sequence of operations on a data structure.
Three different approaches:
Aggregate method: Combinatorial like the heap analysis.
Accounting method: Assign costs to each operation so that it is easy to sum them up while still ensuring that the result is accurate.
Potential method: A sophisticated version of the accounting method.
Example: Augmented Stack S
Operations are:
Push(S,x)
Pop(S)
Multipop(S,k) - pop the top k elements
Implement with either array or linked list
time for Push is
time for Pop is
time for Multipop is
Example: k-Bit Counter A
Operation:
increment(A) - add 1 (initially 0)
Implementation:
k-element binary array
use school ripple-carry algorithm
Aggregate Method
Show that a sequence of n operations takes time
We can then say that the amortized cost per operation is
Makes no distinction between operation types
Augmented Stack: A Simple Worst Case Argument
In a sequence of n operations, the stack never holds more than n elements.
Thus, the cost of a multipop is
Therefore, the worst-case cost of any sequence of n operations is .
But this is an over-estimate!
Aggregate Method for Augmented Stack
Key idea: total number of pops (or multipops) in the entire sequence of operations is at most the total number of pushes
Suppose that the maximum number of Push operations in the sequence is n.
So time for the entire sequence is .
Amortized cost per operation: .
Aggregate Method for k-Bit Counter
Worst-case time for an increment is O(k). This occurs when all k bits are flipped
But in a sequence of n operations, not all of them will cause all k bits to flip:
bit 0 flips with every increment
bit 1 flips with every 2nd increment
bit 2 flips with every 4th increment
…
bit k flips with every th increment
Aggregate Method for k-Bit Counter
Total number of bit flips in n increment operations is
n + n/2 + n/4 + … + n/2^k < n(1/(1-1/2))= 2n
So the total cost of the sequence is O(n).
Amortized cost per operation is .
Accounting Method
Assign a cost, called the "amortized cost", to each operation.
Assignment must ensure that the sum of all the amortized costs in a sequence is at least the sum of all the actual costs.
Remember, we want an upper bound on the total cost of the sequence H thi t?
Accounting Method
For each operation in the sequence:
if amortized cost > actual cost then store extra as a credit with an object in the data structure.
if amortized cost < actual cost then use the stored credits to make up the difference.
Must have enough credit saved up to pay for any future underestimates.
Accounting Method vs. Aggregate Method
Aggregate method:
first analyze entire sequence
then calculate amortized cost per operation
Accounting method:
first assign amortized cost per operation
check that they are valid
then compute cost of entire sequence of operations
Accounting Method for Augmented Stack
Assign these amortized costs:
Push - 2
Pop - 0
Multipop - 0
For Push, the actual cost is 1. Store the extra 1 as a credit, associated with the pushed element.
Pay for each popped element (either from Pop or Multipop) using the associated credit.
Accounting Method for Augmented Stack
There is always enough credit to pay for each operation.
Each amortized cost is O(1)
So the cost of the entire sequence of n operations is O(n).
Accounting Method for k-Bit Counter
Assign the amortized cost for increment operation to be 2.
The actual cost is the number of bits flipped:
a series of 1's are reset to 0
then a 0 is set to 1
Idea: 1 is used to pay for flipping a 0 to 1. The extra 1 is stored with the bit to pay for the next change (when it
Accounting Method for k-Bit Counter
All changes from 1 to 0 are paid for with previously stored credit
Amortized time per operation is O(1)
total cost of sequence is O(n)
Conclusions
Amortized Analysis allows one to estimate the cost of a sequence of operations on data structures. The method is typically more accurate than worst-case analysis when the data structure is dynamically changing.
Self-organizing lists
List L of n elements.
The operation ACCESS(x) costs rankL(x) = distance of x from the head of L.
L can be reordered by transposing adjacent elements at a cost of 1.
Self-organizing lists
List L of n elements.
The operation ACCESS(x) costs rankL(x) = distance of x from the head of L.
L can be reordered by transposing adjacent elements at a cost of 1.
Example:
Accessing the element with key 14 costs 4.
Example:
Self-organizing lists- application of amortized analysis
List L of n elements
The operation ACCESS(x) costs rankL(x) = distance of x from the head of L.
L can be reordered by transposing adjacent elements at a cost of 1.
Example:Transposing 3 and 50 costs 1.
Worst-case analysis of self- organizing lists
Let us think of S being generated by an adversary. An adversary always accesses the tail (nth) element of L. Then, for any on-line algorithm A, we have
in the worst case. But what would our OPT do on S?
Average-case analysis of self- organizing lists
Suppose that element x is accessed with probability p(x). Then, we have, which is minimized when L is sorted in decreasing order with respect to p.
Heuristic: Keep a count of the number of times each element is accessed, and maintain L in order of decreasing count.
Application of amortized analysis
Can we use an algorithm to organize the lists that are as good as possible on all update sequences?
But we will say "We don’t know what opt will do."
We aim to design an algorithm for which the cost of the update on each sequence is guaranteed to be competitive with the best update for that sequence
We quantify the performance against an algorithm that is given the sequence first and told that it has to service it optimally.
On-line and off-line problems
Definition. A sequence S of operations is provided one at a time. For each operation, an on-line algorithm A must execute the operation immediately without any knowledge of future operations (e.g., Tetris).
An off-line algorithm may see the whole sequence S in advance.
Goal: Minimize the total cost CA(S).
The game of Tetris
On-line and off-line problems
An off-line algorithm may see the whole sequence S in advance. This algorithm is well-defined. On each input sequence, it explores all the possible reorganizations and their costs, and selects the best one to service the sequence. Cost of OPT: denoted it by COPT(S).
The game of Tetris
The move-to-front heuristic
Practice: Implementers discovered that the move-to-front (MTF) heuristic empirically yields good results.
IDEA: After accessing x, move x to the head of L using transposes: cost = .
The MTF heuristic responds well to locality in the access sequence S.
Competitive analysis
How competitive is MTF in comparison to the OPT algorithm?
Definition. An on-line algorithm A is α-competitive if there exists a constant k such that for any sequence S of operations, , where OPT is the optimal off-line algorithm (“God’s algorithm”).
MTF is O(1)-competitive
Theorem. MTF is 4-competitive for self- organizing lists.
MTF is O(1)-competitive
Theorem. MTF is 4-competitive for self- organizing lists.
How useful is this from a data structure point of view?
For each sequence S, the performance of this very simple idea is only 4 times worse than the list maintenance algorithm that aims to minimize the access cost into the list.
After this, we will take the same approach for BSTs Fix the request sequence S during the analysis
MTF is O(1)-competitive
Theorem. MTF is 4-competitive for self- organizing lists.
Proof. Let Li be MTF’s list after the ith access, and let Li* be OPT’s list after the ith access.
Let ci = MTF’s cost for the ith operation = if it accesses x; ci* = OPT’s cost for the ith operation = , where ti is the number of transposes that OPT performs.
What happens on an access?
On access of element x what do the lists look like
MTF has an access cost of |A| + |B| + 1
OPT has an access cost of |A| + |C| + 1
What is the difference between |B| and |C|?
How can we analyze this?
Note that B and C are “inverted” Let us remember that OPT is reorganizing as per a fixed strategy which is the best for this sequence.
Potential function
Define the potential function by \Phi(Li)= 2 \cdot |{(x, y) : x Li y \text{ and } y Li* x}| = 2 \cdot # \text{inversions}.
Observe that is a well-defined list for the sequence S that MTF is compared against
We define the amortized cost using the potential function
Potential function – how different are the two lists
Define the potential function by \Phi(Li)= 2 \cdot |{(x, y) : x Li y \text{ and } y Li* x}| = 2 \cdot # \text{inversions}.
Example.
* *
Potential function
Define the potential function by \Phi(Li)= 2 \cdot |{(x, y) : x Li y \text{ and } y Li* x}| = 2 \cdot # \text{inversions}.
Example.
* *
*
Potential function
Define the potential function by \Phi(Li)= 2 \cdot |{(x, y) : x Li y \text{ and } y Li* x}| = 2 \cdot # \text{inversions}.
Example.
* *
*
Potential function
Define the potential function by \Phi(Li)= 2 \cdot |{(x, y) : x Li y \text{ and } y Li* x}| = 2 \cdot # \text{inversions}.
Example.
* *
*
Potential function
Define the potential function by \Phi(Li)= 2 \cdot |{(x, y) : x Li y \text{ and } y Li* x}| = 2 \cdot # \text{inversions}.
Example.
* *
*
Potential function
Define the potential function by \Phi(Li)= 2 \cdot |{(x, y) : x Li y \text{ and } y Li* x}| = 2 \cdot # \text{inversions}.
Example.
* *
*
Potential function
Define the potential function by \Phi(Li)= 2 \cdot |{(x, y) : x Li y \text{ and } y Li* x}| = 2 \cdot # \text{inversions}.
Example.
* *
*
Potential function
Define the potential function by \Phi(Li) = 2 \cdot |{(x, y) : x Li y \text{ and } y Li* x}| = 2 \cdot # \text{inversions}. Example.
Potential function
Define the potential function by \Phi(Li) = 2 \cdot |{(x, y) : x Li y \text{ and } y Li* x}| = 2 \cdot # \text{inversions}.
Potential function
Define the potential function by \Phi(Li) = 2 \cdot |{(x, y) : x Li y \text{ and } y Li* x}| = 2 \cdot # \text{inversions}. Note that
for i = 0, 1, …,,
\Phi(L_0) 0 if MTF and OPT start with the same list.
Potential function
How much does change from 1 transpose?
A transpose creates/destroys 1 inversion.
\Delta\Phi \pm 2
What happens on an access?
Suppose that operation i accesses element x, and define
,
,
,
.
What happens on an access?
r =
r=
We have r = |A| + |B| + 1 and r* |A| + |C| + 1.
What happens on an access?
We have r = |A| + |B| + 1 and r* |A| + |C| + 1.
r =
r=
When MTF moves x to the front, it creates |A| inversions and destroys |B| inversions.
Each transpose by OPT creates 1 inversion.
Thus, we have .
Amortized cost
The amortized cost for the i-th operation of MTF with respect to is
Amortized cost
The amortized cost for the i-th operation of MTF with respect to is
Amortized cost
= 2r + 2(|A| – (r – 1 – |A|) + ti)
The amortized cost for the ith operation of MTF with respect to is (since r = |A| + |B| + 1)
Amortized cost
= 2r + 2(|A| – (r – 1 – |A|) + ti)
= 2r + 4|A| – 2r + 2 + 2ti
The amortized cost for the ith operation of MTF with respect to is
Amortized cost
= 2r + 2(|A| – (r – 1 – |A|) + ti)
= 2r + 4|A| – 2r + 2 + 2ti
= 4|A| + 2 + 2ti
The amortized cost for the ith operation of MTF with respect to is
Amortized cost
= 2r + 2(|A| – (r – 1 – |A|) + ti)
= 2r + 4|A| – 2r + 2 + 2ti
= 4|A| + 2 + 2ti
The amortized cost for the ith operation of MTF with respect to is (since r* = |A| + |C| + 1 |A| + 1)
Amortized cost
= 2r + 2(|A| – (r – 1 – |A|) + ti)
= 2r + 4|A| – 2r + 2 + 2ti
= 4|A| + 2 + 2ti
\le 4(r* + ti) 4ci*.
The amortized cost for the ith operation of MTF with respect to is
Bounding the actual cost
*Thus, we have
In terms of the amortized cost
Thus, we have
The potentials cancel out
Thus, we have
The grand finale
Thus, we have since and .
So all potentials cancel out, Except for first and last, and the sum of true Costs is at most the sum of amortized costs.
Addendum
If we count transpositions that move x toward the front as “free” (models splicing x in and out of L in constant time), then MTF is 2-competitive.