Define a communication channel
Accepts a string of symbols from a finite alphabet and outputs a string of symbols from another finite alphabet. Modelled by the probabilities P(y1…yn received | x1…xn sent).
Define a discrete memoryless channel
The probabilities p_ij = P(bj received | ai sent) are the same for each use of the channel and are independent of all past and future uses.
Define channel matrix of a discrete memoryless channel
The r x s stochastic matrix P = (p_ij), where p_ij = P(bj received | ai sent).
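As an illustration (not part of the original cards), a minimal sketch of a channel matrix for the binary symmetric channel with flip probability p:

```python
# Illustrative sketch (assumed example): the channel matrix of a binary
# symmetric channel with flip probability p. Rows are indexed by the input
# symbol a_i, columns by the output symbol b_j; each row sums to 1.
def bsc_channel_matrix(p: float) -> list[list[float]]:
    return [
        [1 - p, p],  # P(0 received | 0 sent), P(1 received | 0 sent)
        [p, 1 - p],  # P(0 received | 1 sent), P(1 received | 1 sent)
    ]

assert all(abs(sum(row) - 1.0) < 1e-12 for row in bsc_channel_matrix(0.1))
```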
What is the capacity of a channel
Supremum of all reliable transmission rates
Define a code
A function f: Σ1 → Σ2*, where Σ1, Σ2 are finite alphabets and Σ2* is the set of finite strings over Σ2.
Define the codewords of f
Elements in Im(f)
When is a code c decipherable
If its extension c* to strings (encode each letter and concatenate) is injective.
Give some codes which are always decipherable
Block codes, comma codes, prefix-free codes
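A minimal sketch (function name is my own) of checking the prefix-free condition, the easiest of these to test directly:

```python
# Minimal sketch: check whether a set of codewords is prefix-free, i.e. no
# codeword is a prefix of another. Prefix-free codes are always decipherable.
def is_prefix_free(codewords: list[str]) -> bool:
    for i, u in enumerate(codewords):
        for j, v in enumerate(codewords):
            if i != j and v.startswith(u):
                return False
    return True

assert is_prefix_free(["0", "10", "110", "111"])   # prefix-free
assert not is_prefix_free(["0", "01", "11"])       # "0" is a prefix of "01"
```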
State Kraft’s inequality
sum from i=1 to m of a^-si <= 1, where f: Σ1 → Σ2* is a code with codewords of lengths s1,…,sm, m = |Σ1| and a = |Σ2|.
Give an equivalent condition to a prefix-free code existing
Kraft’s inequality holds
Prove a prefix free code exists iff Kraft’s inequality holds.
(forward) Take the infinite rooted a-ary tree; prefix-freeness means no codeword node is an ancestor of another. Pump 1 unit of water into the root, splitting equally at each branching: the node of codeword i receives a^-si, and since no codeword node lies below another these amounts sum to at most 1.
(backward) Order the lengths s1 <= s2 <= … <= sm and construct codewords one by one, ensuring no earlier codeword is a prefix. If no word of length sr is available, then every word of length sr has one of the first r-1 codewords as a prefix, so sum i=1 to r-1 of a^-si >= 1, hence sum i=1 to m of a^-si > 1, contradicting Kraft's inequality.
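A sketch of the backward direction, assuming a binary code alphabet (a = 2); the greedy choice below mirrors the codeword-by-codeword construction:

```python
from itertools import product

# Sketch (binary alphabet a = 2; names are my own): given lengths satisfying
# Kraft's inequality, pick codewords one by one in order of increasing length,
# at each step taking the first word of the required length that has no
# previously chosen codeword as a prefix. Kraft guarantees one always exists.
def build_prefix_free(lengths: list[int]) -> list[str]:
    assert sum(2 ** -s for s in lengths) <= 1, "Kraft's inequality must hold"
    chosen: list[str] = []
    for s in sorted(lengths):
        for bits in product("01", repeat=s):
            w = "".join(bits)
            if not any(w.startswith(c) for c in chosen):
                chosen.append(w)
                break
    return chosen

print(build_prefix_free([1, 2, 3, 3]))  # e.g. ['0', '10', '110', '111']
```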
What does any decipherable code satisfy (McMillan/Karush)
Kraft’s inequality
Prove any decipherable code satisfies Kraft’s inequality.
Consider strings of R codewords: (sum i=1 to m of a^-si)^R = sum over l of b_l a^-l, where b_l is the number of such strings of total length l. Since the code is decipherable, b_l <= |Σ2^l| = a^l, so the sum is at most R s_max, where s_max is the longest codeword length. Take Rth roots: sum a^-si <= (R s_max)^(1/R), and the RHS tends to 1 as R → infinity.
Define the entropy of a random variable
For a random variable X taking values x1,…,xn with probabilities p1,…,pn:
H(X) = H(p1,…,pn) = -sum from i=1 to n of pi log(pi) (= -E[log p(X)]) (logs base 2).
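A minimal sketch of the entropy formula; the second example also illustrates the bound H <= log n for the uniform distribution:

```python
from math import log2

# Minimal sketch: entropy (in bits) of a distribution p = (p_1, ..., p_n),
# H(p) = -sum_i p_i log2(p_i), with the convention 0 * log 0 = 0.
def entropy(p: list[float]) -> float:
    return -sum(pi * log2(pi) for pi in p if pi > 0)

print(entropy([0.5, 0.5]))   # 1.0 bit
print(entropy([0.25] * 4))   # 2.0 bits = log2(4), the maximum for n = 4
```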
State Gibbs inequality
-sum(pi log(pi)) <= -sum(pi log(qi)), where (p1,…,pn) and (q1,…,qn) are discrete probability distributions.
When does equality hold in Gibbs' inequality
If pi = qi for all i
Prove Gibbs' inequality
log x = ln x/ln 2, so it suffices to prove the inequality with natural logs and divide through after. Use ln x <= x - 1 (equality iff x = 1).
Let I = {i : pi > 0} and apply the bound with x = qi/pi:
sum over i in I of pi ln(qi/pi) <= sum over i in I of pi (qi/pi - 1) = sum over i in I of qi - sum over i in I of pi <= 0,
which rearranges to the result. Equality iff sum over i in I of qi = 1 and qi/pi = 1 for all i in I, i.e. pi = qi for all i.
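A quick numerical sanity check of Gibbs' inequality on random distributions (illustrative only; helper names are my own):

```python
import random
from math import log2

# Sanity check: for random distributions p and q,
# -sum p_i log2(p_i) <= -sum p_i log2(q_i), with equality when q = p.
def cross_entropy(p: list[float], q: list[float]) -> float:
    return -sum(pi * log2(qi) for pi, qi in zip(p, q) if pi > 0)

def random_dist(n: int) -> list[float]:
    xs = [random.random() for _ in range(n)]
    total = sum(xs)
    return [x / total for x in xs]

for _ in range(1000):
    p, q = random_dist(5), random_dist(5)
    assert cross_entropy(p, p) <= cross_entropy(p, q) + 1e-12
```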
Give an inequality regarding H(p1,…,pn), when does equality hold?
H(p1,…,pn) <= log n, with equality iff p1=…=pn
Define a Bernoulli (or memoryless) source
A sequence X1, X2, … of independent, identically distributed random variables.
When is a code called optimal
When it has the smallest possible expected word length (sum pi si = E[S]) of all decipherable codes
State Shannon’s Noiseless Coding Theorem
Expected word length of an optimal decipherable code satisfies H(X)/log a <= E[S] < H(X)/log a +1
When does equality hold in Shannon’s Noiseless Coding Theorem?
If pi = qi, i.e. pi = a^(-si) for all i.
Prove Shannon’s Noiseless Coding Theorem
lower bound: let c = sum a^-si and qi = a^(-si)/c, so sum qi = 1.
By Gibbs' inequality, H(X) <= -sum pi log(qi) = (sum pi si) log a + log c.
By McMillan the code satisfies Kraft's inequality, so c <= 1 and log c <= 0, giving H(X) <= E[S] log a, the first result
(equality iff qi = pi, that is if pi = a^-si).
upper bound: construct a Shannon-Fano code with si = ceil(-log_a pi). Then si >= -log_a pi, so a^-si <= pi and Kraft's inequality is satisfied, so a prefix-free (hence decipherable) code with these lengths exists.
Also si < -log_a pi + 1, so E[S] = sum pi si < sum pi (-log_a pi) + 1 = H(X)/log a + 1.
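A sketch of the upper-bound construction (Shannon-Fano lengths si = ceil(-log_a pi)), here with a = 2 and an assumed example distribution:

```python
from math import ceil, log, log2

# Sketch: Shannon-Fano lengths s_i = ceil(-log_a p_i) for a code alphabet of
# size a. These satisfy Kraft's inequality, so a prefix-free code exists, and
# the expected length obeys the noiseless coding theorem bounds.
def shannon_fano_lengths(p: list[float], a: int = 2) -> list[int]:
    return [ceil(-log(pi, a)) for pi in p]

p = [0.5, 0.25, 0.125, 0.125]                 # assumed example distribution
s = shannon_fano_lengths(p)
H = -sum(pi * log2(pi) for pi in p)
E_S = sum(pi * si for pi, si in zip(p, s))
assert sum(2 ** -si for si in s) <= 1          # Kraft's inequality holds
assert H <= E_S < H + 1                        # H(X)/log a <= E[S] < H(X)/log a + 1 (a = 2)
print(s, E_S, H)
```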
If we have some messages in A with probabilities p1,…,pm and c is an optimal prefix-free code with word lengths l1,…,lm, then give two statements we can deduce and prove them.
If pi > pj then li <= lj; otherwise swapping the ith and jth codewords gives a prefix-free code with smaller expected word length, contradicting optimality.
Among all codewords of maximal length there must be two which differ only in the last digit; otherwise we could delete the last digit of such a codeword and, as the code is prefix-free, obtain another prefix-free code with smaller expected word length.
What can we say Huffman codes are?
Optimal
Prove that Huffman codes are optimal.
Induction on m; the base case m = 2 is clear. Suppose the result holds for m-1, and combine the two least likely of our m probabilities to get an (m-1)-symbol source X_(m-1); the Huffman code cm is obtained from c_(m-1), and E[Sm] = E[S_(m-1)] + p_(m-1) + p_m.
Let c'm be any optimal code for Xm; wlog it is prefix-free and, by the previous lemma, two codewords of maximal length differ only in the final symbol and (wlog) encode the two least likely symbols, so c'm(mu_(m-1)) = y0 and c'm(mu_m) = y1 for some y in {0,1}*. Collapsing these two codewords to y gives a prefix-free code c'(m-1) for X_(m-1) with E[S'm] = E[S'_(m-1)] + p_(m-1) + p_m. By induction c_(m-1) is optimal, so E[S_(m-1)] <= E[S'_(m-1)], hence E[Sm] <= E[S'm] and cm is optimal.
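A minimal Huffman construction sketch, assuming a binary code alphabet; the heap-based merging and the variable names are my own choices:

```python
import heapq
from math import log2

# Sketch: build a binary Huffman code by repeatedly merging the two least
# likely entries and prepending '0'/'1' to their codewords. The result is a
# prefix-free optimal code, so H(X) <= E[S] < H(X) + 1.
def huffman(p: list[float]) -> list[str]:
    heap = [(pi, i, {i: ""}) for i, pi in enumerate(p)]
    heapq.heapify(heap)
    counter = len(p)  # unique tie-breaker so the dicts are never compared
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)
        p1, _, c1 = heapq.heappop(heap)
        merged = {i: "0" + w for i, w in c0.items()} | {i: "1" + w for i, w in c1.items()}
        heapq.heappush(heap, (p0 + p1, counter, merged))
        counter += 1
    codes = heap[0][2]
    return [codes[i] for i in range(len(p))]

p = [0.4, 0.3, 0.2, 0.1]                       # assumed example distribution
code = huffman(p)
E_S = sum(pi * len(w) for pi, w in zip(p, code))
H = -sum(pi * log2(pi) for pi in p)
print(code, E_S, H)                            # H <= E[S] < H + 1
```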
State the Asymptotic Equipartition Property
A source X1, X2, … satisfies the AEP with constant H >= 0 if for all ε > 0 there is some N such that for all n >= N there is a "typical set" Tn ⊆ Σ^n with:
(1) P[(X1,…,Xn) in Tn] > 1-ε
(2) 2^{-n(H+ε)} <= p(x1,…,xn) <= 2^{-n(H-ε)} for all (x1,…,xn) in Tn
Define when a source is reliably encodable at rate r
If for each n there is A_n ⊆ Σ^n such that:
(1) (1/n) log|A_n| → r as n → infinity
(2) P[(X1,…,Xn) in A_n] → 1 as n → infinity
What is the information rate H of a source
infimum of all rates at which it is reliably encodable
State Shannon’s First Coding Theorem
If a source satisfies the AEP with constant H, then the source has information rate H.
Give a second definition of the asymptotic equipartition property
A source X1, X2, … satisfies the AEP if for some H >= 0 we have -(1/n) log p(X1,…,Xn) →p H as n → infinity (→p denotes convergence in probability).
State the weak law of large numbers
For iid RVs X1, X2, … with finite expected value E[Xi] = μ:
(1/n) sum from i=1 to n of Xi →p μ as n → infinity
What’s the information rate of a memoryless source
Its entropy H(X)
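An illustrative simulation, assuming a Bernoulli(0.2) source, of why the weak law of large numbers gives the AEP for a memoryless source, and hence information rate H(X):

```python
import random
from math import log2

# Illustrative simulation (assumed Bernoulli(0.2) source): by the WLLN,
# -(1/n) log2 p(X_1, ..., X_n) concentrates around H(X) as n grows, which is
# the AEP for a memoryless source.
def minus_log_prob_rate(p: float, n: int) -> float:
    xs = [1 if random.random() < p else 0 for _ in range(n)]
    return -sum(log2(p) if x else log2(1 - p) for x in xs) / n

p = 0.2
H = -(p * log2(p) + (1 - p) * log2(1 - p))   # about 0.7219 bits
for n in (10, 100, 10_000):
    print(n, minus_log_prob_rate(p, n), "H =", H)
```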