Data Representation and Compression
Data and Binary Numbers
- Binary Numbers: Computers store all digital data as binary (base-2) numbers.
- Bit: The smallest unit of information, either 0 or 1.
Base Conversion
- Binary to Decimal: Convert binary to decimal by recognizing that binary digits represent powers of 2.
- Example: Binary number 1101 = 1×8 + 1×4 + 0×2 + 1×1 = 13 in decimal.
- Decimal to Binary: Find powers of 2 that sum up to the decimal number.
- Start with the largest power of 2 less than or equal to the number, subtract, and repeat until reaching 0.
- Example: Decimal number 200 = 128 + 64 + 8, so its binary form is 11001000.
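Both conversions above can be sketched in a few lines of Python. This is a minimal illustration, not the only way to do it; the function names `binary_to_decimal` and `decimal_to_binary` are made up for this example.

```python
def binary_to_decimal(bits: str) -> int:
    """Add up the power of 2 for every position that holds a 1."""
    total = 0
    # Read right to left so that position 0 is the 2**0 place.
    for position, digit in enumerate(reversed(bits)):
        if digit == "1":
            total += 2 ** position
    return total


def decimal_to_binary(number: int) -> str:
    """Repeatedly subtract the largest power of 2 that still fits."""
    if number == 0:
        return "0"
    # Find the largest power of 2 less than or equal to the number.
    power = 1
    while power * 2 <= number:
        power *= 2
    digits = []
    while power >= 1:
        if power <= number:
            digits.append("1")
            number -= power
        else:
            digits.append("0")
        power //= 2
    return "".join(digits)


print(binary_to_decimal("1101"))  # 8 + 4 + 0 + 1 = 13
print(decimal_to_binary(200))     # 128 + 64 + 8 gives 11001000
```

Python's built-ins `int("1101", 2)` and `format(200, "b")` give the same answers; the loops are spelled out here only to mirror the manual method.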
Digital Images as Bits
- Digital Images: Images are converted to binary, processed, and displayed.
- Pixels: Digital images consist of pixels, and each pixel's value is stored as a binary number.
- Black and White Images: Represented using 1 (black/on) and 0 (white/off).
- Grid Creation: Draw a grid and color squares based on binary values.
- Metadata: Data about the image, such as its size (e.g., a 10 x 10 grid), needed to rebuild the picture from the bits.
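The grid idea can be tried out in Python. The sketch below assumes the image arrives as one flat string of bits plus width and height metadata; the function name `render_bitmap`, the 5 x 5 size, and the pixel pattern are all invented for illustration.

```python
def render_bitmap(bits: str, width: int, height: int) -> str:
    """Turn a flat string of bits into rows of filled (#) and empty (.) squares."""
    if len(bits) != width * height:
        raise ValueError("bit count does not match width x height")
    rows = []
    for row in range(height):
        # Slice out the bits that belong to this row of the grid.
        row_bits = bits[row * width:(row + 1) * width]
        rows.append("".join("#" if bit == "1" else "." for bit in row_bits))
    return "\n".join(rows)


# Hypothetical image: the metadata says 5 x 5, followed by 25 pixel bits.
width, height = 5, 5
pixels = "00100" "01010" "10001" "11111" "10001"
print(render_bitmap(pixels, width, height))
```

Without the width and height, the same 25 bits could not be laid out correctly, which is why the metadata has to travel with the pixel data.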
Binary and Color Representation
- Color Representation: Computers use binary for colors.
- Color Basis: Colors are created by mixing red, green, and blue light.
- Maximum Color Value: 255 in decimal, represented as 11111111 in binary (8 bits).
- Minimum Color Value: 0, represented as 00000000 in binary.
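As a small illustration, the sketch below writes a color as three binary numbers. It assumes the common layout of one 8-bit value (0 to 255) each for red, green, and blue; the function names are hypothetical.

```python
def channel_to_binary(value: int) -> str:
    """Format one color channel (0-255) as an 8-digit binary string."""
    if not 0 <= value <= 255:
        raise ValueError("channel value must be between 0 and 255")
    return format(value, "08b")


def color_to_bits(red: int, green: int, blue: int) -> str:
    """Show a color as its red, green, and blue bit patterns."""
    return " ".join(channel_to_binary(c) for c in (red, green, blue))


print(color_to_bits(255, 0, 0))      # pure red: 11111111 00000000 00000000
print(color_to_bits(255, 255, 255))  # white: every channel at its maximum
print(color_to_bits(0, 0, 0))        # black: every channel at its minimum
```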
Music as Bits
- Analog Signal: Continuous in time and range of values.
- Digital Signal: Sequence of discrete symbols (bits).
- Sampling: Measuring an analog signal at discrete moments and converting each measurement to a digital value.
- Noise Resilience: Digital signals are more resilient against noise.
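Sampling can be sketched as follows. The example uses a sine wave as a stand-in for an analog sound, and the sample rate (8 samples per second) and bit depth (3 bits per sample) are arbitrary choices made only to keep the output short; real audio uses far higher values.

```python
import math


def sample_signal(frequency_hz: float, sample_rate_hz: int, duration_s: float) -> list[float]:
    """Measure a sine wave at evenly spaced moments in time."""
    count = int(sample_rate_hz * duration_s)
    return [
        math.sin(2 * math.pi * frequency_hz * n / sample_rate_hz)
        for n in range(count)
    ]


def quantize(samples: list[float], bits: int) -> list[str]:
    """Map each sample from the range -1.0..1.0 onto a fixed-width binary code."""
    levels = 2 ** bits
    codes = []
    for value in samples:
        code = round((value + 1.0) / 2.0 * (levels - 1))
        codes.append(format(code, f"0{bits}b"))
    return codes


samples = sample_signal(frequency_hz=1.0, sample_rate_hz=8, duration_s=1.0)
codes = quantize(samples, bits=3)
for value, code in zip(samples, codes):
    print(f"{value:+.3f} -> {code}")
```

Each printed line pairs one measured value with the bits that would be stored for it. More samples per second and more bits per sample follow the original wave more closely, at the cost of more data.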
Data Compression
- Data Compression Usage: Used in MP3, MP4, RAR, ZIP, JPG, PNG files, etc.
- Importance: Makes files smaller for backup, archiving, and uploading over the internet.
- Two-Way Process: Compression algorithms reduce data size; decompression restores the data, either exactly (lossless) or approximately (lossy).
- Usefulness: Saves disk space and reduces bandwidth during data transmission.
- Function: Compresses a string of bytes into a shorter string of bytes.
- Lossless Algorithms: Reconstruct the original message exactly.
- Lossy Algorithms: Reconstruct an approximation of the original message.
- Used for images and sound where slight loss is acceptable.
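To make the two-way process concrete, here is a minimal sketch of run-length encoding, one of the simplest lossless schemes. It is chosen only because it is easy to follow; it is not the method used by the file formats listed above. The function names are hypothetical.

```python
def rle_encode(data: bytes) -> bytes:
    """Store each run of repeated bytes as a (count, value) pair."""
    encoded = bytearray()
    index = 0
    while index < len(data):
        value = data[index]
        run = 1
        # Extend the run while the next byte matches; cap at 255 so the count fits in one byte.
        while index + run < len(data) and data[index + run] == value and run < 255:
            run += 1
        encoded += bytes([run, value])
        index += run
    return bytes(encoded)


def rle_decode(encoded: bytes) -> bytes:
    """Expand every (count, value) pair back into the original run."""
    decoded = bytearray()
    for i in range(0, len(encoded), 2):
        count, value = encoded[i], encoded[i + 1]
        decoded += bytes([value]) * count
    return bytes(decoded)


original = b"\x00" * 40 + b"\xff" * 20 + b"\x00" * 40
packed = rle_encode(original)
print(len(original), "->", len(packed))  # 100 -> 6
print(rle_decode(packed) == original)    # True
```

This scheme only helps when the data contains long runs. On data with few repeats, the output can end up larger than the input.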
Lossless Compression
- Function: Data is compressed and decompressed without any loss, so the original can be reconstructed exactly.
- Text Compression: Must be lossless, because even minor differences in the restored text can alter its meaning.
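A quick way to see exact reconstruction is a round trip through Python's standard `zlib` module, which implements DEFLATE, the same family of compression used inside ZIP and PNG files. The sample text is made up and deliberately repetitive so that it compresses well.

```python
import zlib

# Hypothetical sample text; repetition gives the compressor patterns to exploit.
text = ("the quick brown fox jumps over the lazy dog. " * 20).encode("utf-8")

compressed = zlib.compress(text)
restored = zlib.decompress(compressed)

print(len(text), "bytes ->", len(compressed), "bytes")
print(restored == text)  # True: every byte of the original comes back
```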
Lossy Compression
- Function: Decompression yields an approximation of the original data, not an exact copy.
- Characteristics: Provides high compression but with some loss of original data (pixels, sound waves, etc.).
- Meaning of Lossy: Some information is permanently discarded, such as a frequency component or noise.
- Examples:
- Images: The loss from heavy compression becomes noticeable when photos are enlarged.
- Music: An MP3 discards detail that high-resolution audio keeps.
- Video: Moving frames can tolerate more pixel loss than still images.
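The idea of trading accuracy for size can be sketched with simple quantization: nearby pixel values are merged into one code, so fewer distinct values need to be stored. This is a toy illustration, not how JPG or MP3 actually work; the pixel values and the step size of 16 are invented.

```python
def lossy_compress(pixels: list[int], step: int) -> list[int]:
    """Keep only which bucket of width `step` each value falls into."""
    return [value // step for value in pixels]


def lossy_decompress(codes: list[int], step: int) -> list[int]:
    """Rebuild each value as the middle of its bucket; the exact original is gone."""
    return [code * step + step // 2 for code in codes]


original = [12, 13, 15, 200, 203, 207, 90, 91]  # hypothetical 0-255 pixel values
step = 16

codes = lossy_compress(original, step)
restored = lossy_decompress(codes, step)

print(codes)     # [0, 0, 0, 12, 12, 12, 5, 5]
print(restored)  # [8, 8, 8, 200, 200, 200, 88, 88]
```

With a step of 16, each code needs only 4 bits instead of 8, and the repeated codes are also easier to compress further. The restored values are close to the originals but not identical, which is the defining trait of lossy compression.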
Using Programs with Data
- Data Increase: Digitization and multiple transactions have led to a surge in data.
- Data Analysis: Analyzing large data sets helps categorize connections and find patterns.
- Data Extraction: Obtaining data from databases or software for use in other software.
- Process: Data extraction → transformation (filters/programs) → analysis (graphs, visualization).
- Steps to Extract and Analyze Data:
- Analyze data sources (web pages, emails, videos, audio, text, etc.).
- Determine the purpose of the analysis (trend, effect, cause, quantity).
- Decide on tools for reading data and repositories for storing data.
- Clean the data (whitespace, symbols, duplicates).
- Understand data patterns and text flow using visualization tools.
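The extract, clean, and analyze steps can be sketched on a tiny data set. Everything here is hypothetical: the rows stand in for data already extracted from some source, each row is a (transaction id, product name) pair, and a text bar chart stands in for a real visualization tool.

```python
from collections import Counter

# Hypothetical extracted rows: (transaction id, product name as typed).
raw_rows = [
    (1, "  Laptop "),
    (2, "phone"),
    (2, "phone"),      # same transaction extracted twice
    (3, "Phone!!"),
    (4, "laptop"),
    (5, ""),           # empty entry
    (6, "Tablet "),
    (7, "phone"),
]


def clean(rows: list[tuple[int, str]]) -> list[str]:
    """Drop duplicate ids, strip whitespace and symbols, and normalize case."""
    seen_ids = set()
    cleaned = []
    for row_id, name in rows:
        if row_id in seen_ids:
            continue
        seen_ids.add(row_id)
        text = "".join(ch for ch in name if ch.isalnum() or ch.isspace())
        text = text.strip().lower()
        if text:
            cleaned.append(text)
    return cleaned


def summarize(names: list[str]) -> list[tuple[str, int]]:
    """Count how often each product appears, most frequent first."""
    return Counter(names).most_common()


for item, count in summarize(clean(raw_rows)):
    print(f"{item:<8}{'#' * count} ({count})")
```

Skipping the cleaning step would count "Phone!!" and "phone" as different products and count transaction 2 twice, which is why cleaning comes before analysis.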
How to Read and Analyze Graphs
- Graph Definition: Pictorial representation used to depict data relationships.
- Representation: Data is shown as points, lines, bars, or pie slices.
- Types of Graphs:
- Picture Graphs: Use pictures to represent values.
- Bar Graphs: Use vertical or horizontal bars to represent values.
- Line Graphs: Connect data points with lines, often to show change over time.
- Scatter Plots: Use points to show the relationship between two variables, often with a best-fit line.
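The best-fit line on a scatter plot is commonly found with the least-squares method. The sketch below assumes that method and uses invented data (hours studied against test scores); the function name `best_fit_line` is hypothetical.

```python
def best_fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, divided by how much x varies on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical scatter-plot data.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 61, 70, 74]

slope, intercept = best_fit_line(hours, scores)
print(f"score = {slope:.1f} * hours + {intercept:.1f}")  # score = 5.6 * hours + 46.2
```

Reading the result the way one reads the graph: the slope says each extra hour goes with about 5.6 more points in this made-up data, and the line can be used to estimate values between the plotted points.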