🎯 Objective: Compress a color image using DCT + Huffman encoding, then reconstruct it. We apply the process to each RGB channel separately.
Q: Why convert the image to double after reading it?
A: Because DCT coefficients are fractional and can be negative, which uint8 (integers 0–255) cannot represent. Working in double keeps the transform accurate.
Q: Why not convert this image to grayscale?
A: Because in this version, we want to compress and reconstruct the full color image using its R, G, B channels separately.
Q: What does img(:,:,c) mean?
A: It selects the c-th color channel (R, G, or B) from the 3D RGB image.
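A minimal sketch of channel selection. The 4x4 random image is a hypothetical stand-in for what imread would return; it is not from the original cards.

```matlab
% Hypothetical 4x4 RGB image standing in for imread output.
img = uint8(randi([0 255], 4, 4, 3));
img = double(img);                 % convert before any transform arithmetic

for c = 1:3
    channel = img(:,:,c);          % 2D matrix: c = 1 (R), 2 (G), 3 (B)
    fprintf('Channel %d: %dx%d, class %s\n', ...
        c, size(channel, 1), size(channel, 2), class(channel));
end
```

Each pass through the loop yields one 2D matrix, which is the shape dct2 expects.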
Q: Why apply DCT to each channel?
A: To convert the image into the frequency domain where most energy is concentrated in fewer coefficients, making compression more effective.
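To see the energy concentration, here is a small sketch on a hypothetical smooth 8x8 gradient (an assumption chosen for illustration; real images compact less cleanly). It assumes the Image Processing Toolbox for dct2.

```matlab
% Hypothetical smooth 8x8 channel: a simple brightness gradient.
[x, y] = meshgrid(0:7, 0:7);
channel = 100 + 10*x + 5*y;

coeffs = dct2(channel);                    % requires Image Processing Toolbox
energy = coeffs.^2;                        % energy per coefficient
sorted = sort(energy(:), 'descend');

% Share of total energy held by the 4 largest of 64 coefficients.
fracTop4 = sum(sorted(1:4)) / sum(sorted);
fprintf('Top 4 coefficients hold %.4f of the energy\n', fracTop4);
```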
Q: Why do we round the DCT coefficients?
A: To reduce precision, increasing repeated values which improves Huffman compression.
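A rough illustration of how rounding shrinks the symbol alphabet. The magic(8) matrix is only a hypothetical stand-in for an image channel, and dct2 assumes the Image Processing Toolbox.

```matlab
% Hypothetical 8x8 channel (stand-in for real image data).
channel = double(magic(8));

coeffs  = dct2(channel);          % fractional coefficients
rounded = round(coeffs);          % lossy step: integers only

nBefore = numel(unique(coeffs));  % distinct values before rounding
nAfter  = numel(unique(rounded)); % distinct values after rounding
fprintf('Distinct values: %d before, %d after rounding\n', nBefore, nAfter);
```

Fewer distinct values with more repeats means a smaller Huffman dictionary and shorter codes for the common values.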
Q: Why do we flatten the DCT matrix before encoding?
A: Huffman encoding operates on 1D sequences, so we reshape the 2D matrix into a vector.
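A small sketch of the flattening step, using a hypothetical 2x3 block of rounded coefficients. MATLAB flattens column by column.

```matlab
% Hypothetical rounded DCT block.
D = [5 0 -1;
     2 0  0];

v = D(:);                      % column-major flatten -> [5; 2; 0; 0; -1; 0]
back = reshape(v, size(D));    % undo the flatten

disp(v.');
disp(isequal(back, D));        % 1 (true)
```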
Q: What is the purpose of huffmandict() and huffmanenco()?
A: They create a codebook and compress the data based on symbol frequencies.
Q: What is huffmandeco() used for?
A: To decompress the encoded bitstream using the same Huffman dictionary.
Q: Why do we reshape the decoded data?
A: To return it to its original 2D matrix form for inverse DCT.
Q: What does idct2() do?
A: It reconstructs the image from its DCT coefficients back to pixel values.
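Q: Why do we cast the final image to uint8 before displaying?
A: Because idct2 returns double values, and imshow treats double images as lying in [0, 1], so values in the 0–255 range would display almost entirely white. Casting to uint8 rounds and clips the values to 0–255 and gives the standard 8-bit display format.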
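A quick sketch of what the uint8 cast does to out-of-range and fractional values. The input matrix is a hypothetical example of idct2 output.

```matlab
% Hypothetical reconstructed values, including out-of-range ones.
recon = [-3.2   0.4;
         127.6 300.1];

out = uint8(recon);   % rounds to nearest integer and saturates at 0 and 255
disp(out);            % [0 0; 128 255]
```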
Q: How is compression ratio calculated?
A: By dividing the total original bits by the total compressed bits.
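A worked sketch of the ratio with made-up numbers: the image size and the per-channel compressed bit counts below are hypothetical, and the dictionary storage cost is ignored (an assumption; a stricter calculation would add it to the compressed size).

```matlab
% Hypothetical 256x256 RGB image, 8 bits per sample.
rows = 256; cols = 256; nChannels = 3;
originalBits = rows * cols * nChannels * 8;      % 1572864

% Hypothetical lengths of the Huffman bitstreams for R, G, B.
compressedBits = [150000 120000 130000];

ratio = originalBits / sum(compressedBits);
fprintf('Compression ratio: %.5f\n', ratio);     % 3.93216
```

In practice each entry of compressedBits would be numel(code) for that channel's huffmanenco output.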
Q: What are the two types of compression used here?
A:
Lossy: rounding the DCT coefficients (the DCT itself is invertible; the rounding discards information)
Lossless: Huffman encoding
Q: Why do we flatten the DCT matrix?
A: Huffman encoding requires a 1D sequence of symbols to compress.
Q: What are "symbols" in Huffman encoding?
A: The unique values (integers) in the data that will each get a unique binary code.
Q: Why do we calculate symbol probabilities?
A: So that more frequent values can be assigned shorter Huffman codes.
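One possible way to compute the symbols and their probabilities, sketched on a hypothetical flattened coefficient vector. The use of unique with accumarray is an assumption; the original cards do not say how the counts are obtained.

```matlab
% Hypothetical flattened, rounded coefficients.
v = [0 0 0 0 1 1 -3 0];

[symbols, ~, idx] = unique(v);       % sorted distinct values: [-3 0 1]
counts = accumarray(idx(:), 1);      % occurrences per symbol: [1; 5; 2]
prob   = counts / numel(v);          % relative frequencies, sum to 1

disp([symbols(:) prob]);
```

Here 0 has probability 0.625, so it is the symbol that should receive the shortest code.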
Q: What does huffmandict() do?
A: It creates a dictionary that maps each symbol to a Huffman binary code based on frequency.
Q: What does huffmanenco() do?
A: It encodes the input vector into a compressed bitstream using the Huffman dictionary.
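A minimal sketch of building the dictionary and encoding, assuming the Communications Toolbox is installed. The data vector and probabilities are hypothetical.

```matlab
% Hypothetical data and its symbol statistics.
v       = [0 0 0 0 1 1 -3 0];
symbols = [-3 0 1];
prob    = [0.125 0.625 0.25];          % must sum to 1

dict = huffmandict(symbols, prob);     % symbol -> binary code table
code = huffmanenco(v, dict);           % vector of 0s and 1s

% The most frequent symbol (0) gets a 1-bit code, the other two get
% 2 bits each: 5*1 + 2*2 + 1*2 = 11 bits in total.
fprintf('Encoded length: %d bits\n', numel(code));
```

Stored as 8-bit values, the same eight samples would take 64 bits.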
Q: Why do we store the dictionary and original size with the encoded data?
A: So we can properly decode and reshape the image during reconstruction.
Q: What does huffmandeco() do?
A: It decodes a compressed Huffman bitstream back into its original sequence of symbols (numbers).
Q: What inputs are needed for Huffman decoding?
A: The compressed bitstream and the Huffman dictionary used for encoding.
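A short sketch showing that decoding with the same dictionary recovers the symbols exactly. The data is hypothetical and the Communications Toolbox is assumed.

```matlab
% Hypothetical data; the same dictionary is used on both sides.
v    = [0 0 0 0 1 1 -3 0];
dict = huffmandict([-3 0 1], [0.125 0.625 0.25]);

code    = huffmanenco(v, dict);        % compress
decoded = huffmandeco(code, dict);     % decompress with the same dict

disp(isequal(decoded(:), v(:)));       % 1 (true): Huffman is lossless
```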
Q: Why do we use reshape() after decoding?
A: Because the DCT data was originally a 2D matrix and we need to restore its shape before applying inverse DCT.
Q: What does size(channel) do in reshape()?
A: It returns the channel's [rows, cols] dimensions, so reshape restores the decoded vector to exactly the original matrix shape.
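Putting the decode, reshape, and inverse DCT steps together, here is a rough single-channel round trip. It is a sketch under assumptions: magic(8) is a hypothetical stand-in for img(:,:,c), the probability calculation uses unique and accumarray, and both the Image Processing and Communications Toolboxes are required.

```matlab
% Hypothetical 8x8 channel standing in for img(:,:,c).
channel = double(magic(8));

% Compress: DCT, round (lossy), flatten, Huffman-encode (lossless).
coeffs = round(dct2(channel));
v = coeffs(:);
[symbols, ~, idx] = unique(v);
prob = accumarray(idx(:), 1) / numel(v);
dict = huffmandict(symbols, prob);
code = huffmanenco(v, dict);

% Reconstruct: decode, restore the 2D shape, invert the DCT.
decoded  = huffmandeco(code, dict);
restored = reshape(decoded, size(channel));   % back to rows x cols
recon    = idct2(restored);

% Error comes only from the rounding step.
rmsErr = sqrt(mean((recon(:) - channel(:)).^2));
fprintf('RMS reconstruction error: %.4f\n', rmsErr);
```

Because dct2 is orthonormal and each coefficient moves by at most 0.5 when rounded, the RMS pixel error stays at or below 0.5.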