Digital Image Compression and JPEG Images - Lecture 8

JPEG Images

JPEG is a prevalent image format.
Most digital cameras save images in JPEG format.
There are two main JPEG image formats: the original, and a more recent one based on wavelets (JPEG 2000), which won't be covered.

Importance of Understanding Image Compression

Forensic image analysts emphasize the importance of understanding image compression and JPEG for presenting images in court.
Saving an image as JPEG discards some information, so JPEG is a lossy compression format.
However, the loss of information is usually small and often not visible.

Why Compress Digital Images?

To reduce storage requirements.
To decrease bandwidth usage, enabling faster loading of images on websites.

Compression Ratio

Controlling the amount of compression allows a trade-off between image quality and file size.
High compression ratios give small files but lead to distortion and loss of detail.
Lower compression ratios result in less distortion but require more storage.
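The notes do not define the compression ratio explicitly. The sketch below assumes the usual definition, uncompressed size divided by compressed size; the function name and the example file sizes are illustrative only.

```python
def compression_ratio(uncompressed_bytes: int, compressed_bytes: int) -> float:
    """Return uncompressed size divided by compressed size (assumed definition)."""
    if compressed_bytes <= 0:
        raise ValueError("compressed size must be positive")
    return uncompressed_bytes / compressed_bytes


# Hypothetical example: a 6,220,800-byte bitmap saved as a 622,080-byte file.
ratio = compression_ratio(6_220_800, 622_080)
print(f"{ratio:.0f}:1")  # prints 10:1
```

Under this definition a larger ratio means a smaller file, and with lossy formats usually more visible distortion.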

Storage Requirements for Uncompressed Images

Image dimensions: $M$ (rows) by $N$ (columns)
Number of color channels: $n$ (typically 3 for red, green, and blue)
Bits per pixel per color channel: $m$ (typically 4 or 8 bits)
Total storage requirement for an uncompressed image: $M \times N \times m \times n$ bits (divide by 8 for bytes)
Bitmap (BMP) format is an example of an image format that is usually stored uncompressed.
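As a worked example of the formula above, the following sketch computes the storage needed for one uncompressed frame. The 1080 x 1920 image size and the defaults of 3 channels and 8 bits per channel are assumptions chosen for illustration.

```python
def uncompressed_size_bits(rows: int, cols: int,
                           channels: int = 3,
                           bits_per_channel: int = 8) -> int:
    """M x N x m x n: total bits for an uncompressed image."""
    return rows * cols * channels * bits_per_channel


# Assumed example: M = 1080 rows, N = 1920 columns, n = 3 (RGB), m = 8 bits.
bits = uncompressed_size_bits(1080, 1920)
size_bytes = bits // 8
print(bits, "bits =", size_bytes, "bytes")  # 49766400 bits = 6220800 bytes
```

This counts pixel data only; a real file also carries a small header.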

Lossless vs. Lossy Compression

Lossless (non-lossy) compression: The original image can be recovered exactly from the compressed file (e.g., PNG format).
Lossy compression: Information is lost during compression, so decompression gives only an approximation of the original image (e.g., JPEG format).
Lossy compression typically achieves a much higher degree of compression than lossless compression.
PNG is suitable for retaining all image details, while JPEG is better for minimizing file size and bandwidth requirements.
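The difference can be sketched with Python's standard `zlib` module, which implements DEFLATE, the lossless method PNG builds on. The synthetic pixel data and the step size of 16 are assumptions. The "lossy" path here simply coarsens the intensities before compressing; it is a stand-in to show the idea, not how JPEG works.

```python
import random
import zlib

rng = random.Random(0)

# Synthetic 64 x 64 grayscale image: a slow gradient plus small random noise.
pixels = bytes(min(255, i // 16 + rng.randrange(8)) for i in range(64 * 64))

# Lossless: decompressing gives back exactly the original bytes.
lossless = zlib.compress(pixels)
restored = zlib.decompress(lossless)

# Lossy stand-in: discard fine intensity detail first, then compress.
step = 16
coarse = bytes((p // step) * step for p in pixels)
lossy = zlib.compress(coarse)
approx = zlib.decompress(lossy)

print("original bytes:", len(pixels))
print("lossless bytes:", len(lossless), "exact:", restored == pixels)
print("lossy bytes:   ", len(lossy), "exact:", approx == pixels)
print("largest pixel error:", max(abs(a - b) for a, b in zip(pixels, approx)))
```

The lossy result cannot be turned back into the original, but every pixel stays within one step of its true value.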

Data Redundancy

All image compression schemes exploit data redundancy. There are three types:
1. Coding redundancy.
2. Interpixel redundancy.
3. Psychovisual redundancy.

Coding Redundancy

Caused by suboptimal code words for symbol encoding.
A symbol is typically a gray level within the image.
In natural images, intensity values do not occur with equal probability.
Assign shorter binary strings to frequently occurring intensity values to reduce storage requirements.
Huffman encoding is an optimal way to achieve this when each symbol gets its own code word made of a whole number of bits.
Example: A photograph of a polar bear in a snowstorm will contain mostly white pixels, so we assign a shorter string to encode them.
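A minimal sketch of Huffman coding applied to the polar bear idea. The 100-pixel "image" and its gray levels are made up for illustration, and the helper name `huffman_code` is hypothetical.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a Huffman code: a dict mapping each symbol to a bit string."""
    freq = Counter(symbols)
    if len(freq) == 1:  # degenerate case: only one distinct symbol
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent groups, prefixing their codes.
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


# Made-up image: 90 white pixels and a few darker ones.
pixels = [255] * 90 + [200] * 6 + [128] * 3 + [0] * 1
code = huffman_code(pixels)
total_bits = sum(len(code[p]) for p in pixels)

print(code)
print("Huffman bits:", total_bits)         # 114
print("fixed 8-bit:  ", 8 * len(pixels))   # 800
```

White, the most frequent gray level, receives a 1-bit code word, while the rare dark levels receive 3-bit code words.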

Interpixel Redundancy

Arises when the structure within the image is not encoded efficiently.
Images have structure (neighbouring pixels tend to have similar values) that can be exploited to reduce storage requirements.
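One simple way to exploit this kind of structure is run-length encoding, which stores each run of identical neighbouring pixels as a (value, count) pair. The notes do not name a specific method, so this is only an illustrative sketch; the function names and the sample row are assumptions.

```python
from itertools import groupby


def rle_encode(row):
    """Collapse runs of equal neighbouring values into (value, count) pairs."""
    return [(value, len(list(run))) for value, run in groupby(row)]


def rle_decode(pairs):
    """Expand (value, count) pairs back into the original row."""
    return [value for value, count in pairs for _ in range(count)]


row = [255, 255, 255, 255, 0, 0, 255, 255]
encoded = rle_encode(row)
print(encoded)                      # [(255, 4), (0, 2), (255, 2)]
print(rle_decode(encoded) == row)   # True
```

The scheme is lossless, and it only saves space when the image contains long runs.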

Psychovisual Redundancy

Information within an image that is superfluous to interpretation or aesthetics.
If the eye can't perceive the detail, there is no need to encode it.
Spatial frequency is the rate of change from light to dark.
As spatial frequency increases, the ability to perceive differences diminishes.
As the amplitude of differences decreases, the ability to perceive differences diminishes.
If fluctuations between light and dark are imperceptible, they don't need to be encoded, allowing for compression by removing that information.
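The idea of dropping imperceptible fluctuations can be sketched with coarse quantization of pixel intensities. The step size of 8 and the sample rows are assumptions. JPEG itself quantizes frequency (DCT) coefficients rather than raw intensities, so this shows the principle only.

```python
def quantize(row, step=8):
    """Map each intensity to the midpoint of its step-wide bin."""
    return [(v // step) * step + step // 2 for v in row]


# Low-amplitude flicker around 100: assumed too small to be seen.
flat = [100, 101, 100, 102, 101, 100, 101, 102]
# A large light/dark edge that must be kept.
edge = [100, 101, 200, 201]

print(quantize(flat))  # [100, 100, 100, 100, 100, 100, 100, 100]
print(quantize(edge))  # [100, 100, 204, 204]
```

After quantization the small fluctuations vanish and the row becomes a single run, which the coding and interpixel techniques can then store very cheaply, while the large edge survives.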
