Lossy Compression Algorithms

Introduction

  • Lossy compression achieves higher compression ratios by approximating original data.

Distortion Measures

  • Mean Square Error (MSE): σ² = (1/N) Σ_{n=1}^{N} (x_n − y_n)², where x_n is the input data, y_n is the reconstructed data, and N is the data length.
  • Signal to Noise Ratio (SNR): SNR = 10 log₁₀ (σ_x² / σ_d²) in dB, where σ_x² is the average squared value of the original data sequence and σ_d² is the MSE.
  • Peak Signal to Noise Ratio (PSNR): PSNR = 10 log₁₀ (x_peak² / σ_d²) in dB, where x_peak is the peak value of the data.
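
The three measures above can be computed directly; a minimal sketch in Python (function names are illustrative):

```python
import numpy as np

def mse(x, y):
    """Mean square error between original x and reconstruction y."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    return np.mean((x - y) ** 2)

def snr_db(x, y):
    """SNR in dB: 10 log10 of the average squared signal over the MSE."""
    x = np.asarray(x, float)
    return 10 * np.log10(np.mean(x ** 2) / mse(x, y))

def psnr_db(x, y, peak=255.0):
    """PSNR in dB for a given peak value (255 for 8-bit images)."""
    return 10 * np.log10(peak ** 2 / mse(x, y))
```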

Rate-Distortion Theory

  • Framework for studying tradeoffs between rate and distortion.
  • R(D) represents the Rate Distortion Function.

Quantization

  • Reduces the number of distinct output values to a much smaller set; this is the source of information loss in lossy compression.
  • Forms:
    • Uniform: Midrise and midtread quantizers.
    • Nonuniform: Companded quantizer.
    • Vector Quantization.

Uniform Scalar Quantization

  • Divides input values into equally spaced intervals (step size ∆).
    • Midrise: Even number of output levels; zero is not an output value.
    • Midtread: Odd number of output levels, including zero.
  • Equations for ∆ = 1:
    • Midrise: Q_midrise(x) = ⌈x⌉ − 0.5
    • Midtread: Q_midtread(x) = ⌊x + 0.5⌋
  • Rate: R = ⌈log₂ M⌉ bits for M output levels.
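
For an arbitrary step size ∆, the two quantizers can be sketched as follows (a small illustration generalizing the ∆ = 1 formulas; the functions return reconstruction values, not index codes):

```python
import math

def midrise(x, delta=1.0):
    """Midrise quantizer: reconstruction levels at odd multiples of delta/2."""
    return delta * (math.ceil(x / delta) - 0.5)

def midtread(x, delta=1.0):
    """Midtread quantizer: reconstruction levels at multiples of delta, including 0."""
    return delta * math.floor(x / delta + 0.5)
```

Note that midtread(0.4) returns 0 (zero is a level), while midrise maps every input to a nonzero level such as ±0.5∆.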
Quantization Error of Uniformly Distributed Source
  • Granular distortion: Quantization error for bounded input.

Companded Quantization

  • Nonlinear quantization using a compressor (G), a uniform quantizer, and an expander (G⁻¹).
  • Common companders: µ-law and A-law.
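
A µ-law compander sketch (µ = 255 as in 8-bit telephony; input assumed normalized to [−1, 1]):

```python
import math

MU = 255.0  # standard value for 8-bit telephony

def mu_compress(x):
    """Compressor G: expands small amplitudes; x assumed in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_expand(y):
    """Expander G^{-1}: exact inverse of the compressor."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)
```

In a companded quantizer, mu_compress is applied before the uniform quantizer and mu_expand after reconstruction, giving finer effective steps for small amplitudes.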

Vector Quantization (VQ)

  • Operates on vectors of samples.
  • Uses code vectors; a collection of these code vectors forms the codebook.
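
A nearest-neighbor VQ encoder/decoder sketch (codebook design, e.g., by the LBG algorithm, is omitted; names are illustrative):

```python
import numpy as np

def vq_encode(vectors, codebook):
    """Map each input vector to the index of its nearest code vector (Euclidean)."""
    # distance matrix of shape (num_vectors, codebook_size)
    d = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=2)
    return np.argmin(d, axis=1)

def vq_decode(indices, codebook):
    """Reconstruct each vector by codebook lookup."""
    return codebook[indices]
```

Only the indices need to be transmitted, so the rate is log₂(codebook size) bits per vector.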

Transform Coding

  • Transforms input vector X to Y for efficient coding by decorrelating components.
  • Components carrying little energy can be coarsely quantized or set to zero with little signal distortion.
  • Focus on Discrete Cosine Transform (DCT) and Karhunen-Loève Transform (KLT).

Spatial Frequency and DCT

  • Spatial frequency: Pixel value changes across an image block.
  • DCT decomposes a signal into DC and AC components; IDCT reconstructs the signal.
Definition of DCT
  • 2D DCT: F(u, v) = (2/√(MN)) C(u) C(v) Σ_{i=0}^{M−1} Σ_{j=0}^{N−1} f(i, j) cos((2i+1)uπ / 2M) cos((2j+1)vπ / 2N)
    • where C(ξ) = √(1/2) if ξ = 0, and 1 otherwise.
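
The definition can be transcribed directly; a reference sketch (O(M²N²), not the fast separable form):

```python
import math
import numpy as np

def dct2(f):
    """2D DCT of an M x N block, computed directly from the definition."""
    M, N = f.shape
    c = lambda k: math.sqrt(0.5) if k == 0 else 1.0  # the C(.) factor
    i, j = np.arange(M), np.arange(N)
    F = np.empty((M, N))
    for u in range(M):
        for v in range(N):
            basis = np.outer(np.cos((2 * i + 1) * u * math.pi / (2 * M)),
                             np.cos((2 * j + 1) * v * math.pi / (2 * N)))
            F[u, v] = 2 / math.sqrt(M * N) * c(u) * c(v) * (f * basis).sum()
    return F
```

For a constant 8×8 block only the DC coefficient F(0, 0) is nonzero, as expected.
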
Properties of DCT
  • It is a linear transform: T(αp + βq) = αT(p) + βT(q).
  • Separable: Can be performed as a sequence of 1D DCT steps.
Comparison of DCT and DFT
  • The DCT is equivalent to a DFT applied to a symmetric extension of the input signal, which is why its coefficients are purely real.

Karhunen-Loève Transform (KLT)

  • Reversible linear transform that decorrelates the input signal optimally.
KLT Mechanics
  • The autocorrelation matrix of the input vector X is R_X = E[XXᵀ].
  • Goal: find a transform T such that E[Y_t Y_s] = 0 for t ≠ s, where Y = TX.
  • The KLT is defined as T = [u_1, u_2, …, u_k]ᵀ, where the u_i are the eigenvectors of R_X.
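
In practice R_X is estimated from samples; a minimal sketch (one sample vector per column of X):

```python
import numpy as np

def klt(X):
    """KLT estimated from sample vectors: rows of the returned T are
    eigenvectors of the sample autocorrelation matrix R_X,
    sorted by decreasing eigenvalue."""
    R = X @ X.T / X.shape[1]              # sample estimate of R_X = E[X X^T]
    eigvals, eigvecs = np.linalg.eigh(R)  # R is symmetric, so eigh applies
    order = np.argsort(eigvals)[::-1]     # largest-variance component first
    return eigvecs[:, order].T

# Y = klt(X) @ X then has (sample-)decorrelated components.
```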

Wavelet-Based Coding

  • Decomposes the input signal into components that are easier to handle and to compress, e.g., by thresholding small coefficients.
  • Basis functions are localized in time and frequency.
  • Types: Continuous Wavelet Transform (CWT) and Discrete Wavelet Transform (DWT).

Continuous Wavelet Transform

  • Wavelet admissibility condition: ∫_{−∞}^{+∞} ψ(t) dt = 0
  • CWT of f ∈ L²(ℝ): W(f, s, u) = ∫_{−∞}^{+∞} f(t) ψ_{s,u}(t) dt

Discrete Wavelet Transform

  • Uses discrete steps for scale and shift.
  • Connects wavelets to filter banks in multiresolution analysis.
  • {ψ_{j,n}(t) = 2^{j/2} ψ(2^j t − n), (j, n) ∈ ℤ²} forms an orthonormal basis of L²(ℝ).
Multiresolution Analysis in the Wavelet Domain
  • Adapts signal resolution to relevant details.
  • Wavelet functions ψ(t) characterize detail information; scaling function φ(t) provides approximation information.
  • Dilation equation: φ(t) = Σ_{n∈ℤ} √2 h_0[n] φ(2t − n)
Wavelet Transform Example
  • Discrete Haar wavelet transform replaces original sequences with pairwise averages and differences.
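
One level of this transform can be sketched as follows (using half-differences so the step is exactly invertible):

```python
def haar_step(seq):
    """One level of the discrete Haar transform: pairwise averages
    followed by pairwise half-differences."""
    avg = [(a + b) / 2 for a, b in zip(seq[0::2], seq[1::2])]
    diff = [(a - b) / 2 for a, b in zip(seq[0::2], seq[1::2])]
    return avg + diff

def haar_inverse_step(coeffs):
    """Invert one level: recover each pair from its average and half-difference."""
    n = len(coeffs) // 2
    out = []
    for a, d in zip(coeffs[:n], coeffs[n:]):
        out += [a + d, a - d]
    return out
```

Applying haar_step recursively to the averages yields the full multi-level decomposition.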

Biorthogonal Wavelets

  • Analysis and synthesis filters are not identical.

2D Wavelet Transform

  • Convolves the image rows and then the columns with filters h0[n] and h1[n], discarding odd-numbered samples (downsampling by two) after each convolution and concatenating the low- and high-pass outputs.
  • Produces subbands LL, HL, LH, and HH.
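
One decomposition level can be sketched with Haar averaging/differencing standing in for h0[n] and h1[n] (an assumed filter choice; subband labels here take the first letter as the horizontal filtering, a convention that varies between texts):

```python
import numpy as np

def haar2d_level(img):
    """One level of a 2D Haar-style transform: average/half-difference
    plus downsampling along rows, then along columns."""
    def step(a, axis):
        even = a.take(np.arange(0, a.shape[axis], 2), axis)
        odd = a.take(np.arange(1, a.shape[axis], 2), axis)
        lo, hi = (even + odd) / 2, (even - odd) / 2
        return np.concatenate([lo, hi], axis=axis)
    rows = step(img.astype(float), 1)   # filter + downsample along rows
    both = step(rows, 0)                # then along columns
    h, w = img.shape[0] // 2, img.shape[1] // 2
    return {"LL": both[:h, :w], "HL": both[:h, w:],
            "LH": both[h:, :w], "HH": both[h:, w:]}
```

The LL subband is a half-resolution approximation of the image; recursing on LL produces deeper decomposition levels.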

Wavelet Packets

  • Allows decomposition to be represented by any pruned subtree.
  • Best wavelet basis can be found within a large library of permissible bases.

Embedded Zerotree of Wavelet Coefficients

  • Efficient for image coding, addressing best image quality for a given bit-rate.

The Zerotree Data Structure

  • Codes significance map using zerotree.

Successive Approximation Quantization

  • Applies a sequence of thresholds to determine the significance of each coefficient.
Dominant Pass
  • Compares each coefficient against the current threshold to determine significance; the resulting significance map is coded using zerotrees.
Subordinate Pass
  • Refines magnitudes of coefficients on subordinate list.
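
A minimal sketch of the successive approximation (bit-plane) idea behind the two passes, with the zerotree coding of the significance map omitted; the initial threshold follows the usual power-of-two convention:

```python
import numpy as np

def saq(coeffs, num_passes=4):
    """Bit-plane successive approximation: each pass halves the threshold
    and adds one bit of magnitude information per coefficient.
    Assumes at least one nonzero coefficient."""
    c = np.asarray(coeffs, float)
    T = 2.0 ** np.floor(np.log2(np.abs(c).max()))  # initial power-of-two threshold
    mag = np.zeros_like(c)                         # lower edge of uncertainty interval
    for _ in range(num_passes):
        bit = (np.abs(c) - mag) >= T   # significance/refinement decision at this plane
        mag = mag + bit * T
        T /= 2
    # reconstruct at the centre of the final interval; 0 if never significant
    return np.where(mag > 0, np.sign(c) * (mag + T), 0.0)
```

After k passes the reconstruction error of every significant coefficient is bounded by the final (halved) threshold, so image quality improves steadily as more passes are decoded.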