Notes on Information representation and multimedia (summary)

1.1 Data representation

  • Core idea: how computers represent data in binary form, different numeral systems, character encoding, and how these representations affect storage and processing.
  • Key terms
    • Binary: base-2 number system using values 0 and 1.
    • Bit: binary digit.
    • One’s complement: invert every bit to represent negatives; example: 01011010 (90) → 10100101 (−90).
    • Two’s complement: invert bits and add 1 to get negative; simplifies binary arithmetic for signed numbers.
    • Sign and magnitude: sign bit (0 = +, 1 = −) with remaining bits for magnitude.
    • Hexadecimal: base-16 system using digits 0–9 and letters A–F; weights 16^n.
    • Memory dump: contents of computer memory printed to screen or paper.
    • Binary-coded decimal (BCD): use 4 bits to represent each decimal digit (0–9).
    • ASCII: coding system for characters on a keyboard and control codes (7-bit standard; 0–127).
    • Character set: list of defined characters that hardware/software can represent.
    • Unicode: encoding system intended to represent all languages; supports many characters; first 128 common with ASCII; 16/32-bit encodings common; up to four bytes per character.
  • What you should already know (overview of concepts to warm up):
    • Column weightings for binary and hexadecimal numbers; binary addition/subtraction; converting between binary/denary/hexadecimal; signs using two’s complement.
    • Why memory sizes use different units (bytes, kilobytes, mebibytes, etc.) and the distinction between SI prefixes (kilo = 1000) and IEC prefixes (kibi = 1024).
    • How to interpret memory dumps and why hexadecimal is used for debugging.
  • Connections to foundational principles
    • The need for standardized encoding to convert human-readable data into machine-readable bits.
    • Trade-offs between compactness (compression) and exactness (lossless vs. lossy representations).
  • Practical implications
    • Monetary values often require exact representation; BCD is used to avoid rounding errors in fixed-point arithmetic.
    • ASCII vs. Unicode affects multilingual support, storage size, and compatibility.

1.1.1 Number systems

  • Decimal (denary) is base-10; digits 0–9, weights: 10^0, 10^1, 10^2, … (least significant to most significant).
    • Example: 31,421 is 3×10^4 + 1×10^3 + 4×10^2 + 2×10^1 + 1×10^0.
  • Binary (base-2) uses digits 0 and 1; digits are called bits; weighted columns.
  • Relationship to computer storage
    • Digital switches ON=1, OFF=0; any data ultimately stored as binary digits.

1.1.2 Binary number system

  • Weightings for 8-bit binary numbers (most significant to least significant): 2^7, 2^6, 2^5, 2^4, 2^3, 2^2, 2^1, 2^0
    • In denary: 128, 64, 32, 16, 8, 4, 2, 1
  • Converting binary to denary: wherever a 1 appears in a column, add that column’s weighting.
    • Example: binary 1110 1110 (8 bits) → 128 + 64 + 32 + 8 + 4 + 2 = 238 (denary).
  • Converting denary to binary has two common methods:
    • Method 1: place 1s in appropriate positions to sum to the denary value.
    • Method 2: successive division by 2; record the remainder at each step, then read the remainders from last to first (bottom to top) to obtain the binary value.
  • Binary arithmetic for signed numbers uses two’s complement (preferred in this text):
    • One’s complement: invert all bits (0↔1).
    • Two’s complement: invert all bits and add 1 to the least significant bit.
    • Benefits: simplifies addition/subtraction of signed numbers, avoids separate subtraction logic.
  • 8-bit two’s complement example and range
    • Example: +90 is 0101 1010; inverting the bits gives 1010 0101, and adding 1 gives 1010 0110, which represents −90 in 8-bit two’s complement.
    • Range for 8-bit two’s complement: −2^7 ≤ N ≤ 2^7 − 1, that is, −128 to +127.
  • Activity/Exercises (typical questions students practice)
    • Convert several 8-bit binary numbers to denary using two’s complement.
    • Convert several denary numbers to 8-bit binary two’s complement.
    • Perform binary addition and subtraction with overflow awareness.
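
The conversions and the overflow check described in this section can be sketched in Python. This is a minimal illustration, not a method taken from the source; the function names (denary_to_binary, to_twos_complement, from_twos_complement, add_signed) and the default width of 8 bits are assumptions made for the example.

```python
def denary_to_binary(n: int, width: int = 8) -> str:
    """Convert a non-negative denary integer to a fixed-width binary string
    using successive division by 2."""
    if not 0 <= n < 2 ** width:
        raise ValueError("value does not fit in the requested width")
    remainders = []
    while n > 0:
        n, remainder = divmod(n, 2)
        remainders.append(str(remainder))
    # The first remainder is the least significant bit, so read them in reverse.
    return "".join(reversed(remainders)).rjust(width, "0")


def to_twos_complement(n: int, width: int = 8) -> str:
    """Encode a signed integer as a two's complement bit string."""
    if not -(2 ** (width - 1)) <= n <= 2 ** (width - 1) - 1:
        raise ValueError("value is outside the two's complement range")
    if n >= 0:
        return denary_to_binary(n, width)
    magnitude = denary_to_binary(-n, width)
    inverted = "".join("1" if bit == "0" else "0" for bit in magnitude)
    # Add 1 to the inverted pattern; any carry out of the top bit is discarded.
    return denary_to_binary((int(inverted, 2) + 1) % 2 ** width, width)


def from_twos_complement(bits: str) -> int:
    """Decode a two's complement bit string to a signed integer."""
    value = int(bits, 2)
    # The leftmost column carries a negative weighting (-128 for 8 bits).
    if bits[0] == "1":
        value -= 2 ** len(bits)
    return value


def add_signed(a: str, b: str):
    """Add two equal-width two's complement strings; return (result, overflow)."""
    width = len(a)
    total = (int(a, 2) + int(b, 2)) % 2 ** width  # drop the extra carry bit
    result = f"{total:0{width}b}"
    # Overflow: both operands share a sign but the result has the other sign.
    overflow = a[0] == b[0] and result[0] != a[0]
    return result, overflow


print(denary_to_binary(238))               # 11101110
print(to_twos_complement(-90))             # 10100110
print(add_signed("01011010", "01011010"))  # ('10110100', True)
```

Adding +90 to itself gives 180, which lies outside the 8-bit range of −128 to +127, so the sketch reports overflow: the result pattern has its sign bit set even though both operands were positive.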

1.1.3 Hexadecimal number system

  • Hexadecimal is base-16 with digits 0-9, A-F; weights are 16^n: 16^3, 16^2, 16^1, 16^0.
    • Example digits: A=10, B=11, C=12, D=13, E=14, F=15.
  • Relationship to binary: one hex digit corresponds to four binary digits because 16 = 2^4.
  • Converting between binary and hex:
    • Binary to hex: group bits into 4-bit chunks from right to left; pad the leftmost group with leading zeros if it is shorter; translate each 4-bit group to a hex digit using the hex–binary lookup table.
    • Hex to binary: replace each hex digit with its 4-bit binary equivalent.
  • Practical use: memory dumps are often shown in hexadecimal for readability and ease of tracing memory contents.
  • Example conversion: 16-bit binary 1011 1110 0011 0001 → hex BE31 (1011 = B, 1110 = E, 0011 = 3, 0001 = 1).
  • Table reference: hex–binary–denary mapping for quick lookup (Table 1.3 in the source).
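
The grouping rule for binary and hexadecimal can be sketched as follows. This is an illustrative example only; the names HEX_DIGITS, binary_to_hex and hex_to_binary are assumptions, and the string HEX_DIGITS stands in for the lookup table.

```python
HEX_DIGITS = "0123456789ABCDEF"  # position in the string = value of the digit


def binary_to_hex(bits: str) -> str:
    """Convert a binary string (spaces allowed) to hexadecimal."""
    bits = bits.replace(" ", "")
    # Pad on the left so the length is a multiple of 4.
    padded_length = -(-len(bits) // 4) * 4
    bits = bits.rjust(padded_length, "0")
    # Translate each 4-bit group into one hex digit.
    return "".join(
        HEX_DIGITS[int(bits[i:i + 4], 2)] for i in range(0, len(bits), 4)
    )


def hex_to_binary(hex_string: str) -> str:
    """Replace each hex digit with its 4-bit binary equivalent."""
    return " ".join(
        format(HEX_DIGITS.index(digit), "04b") for digit in hex_string.upper()
    )


print(binary_to_hex("1011 1110 0011 0001"))  # BE31
print(hex_to_binary("A7"))                   # 1010 0111
```

Because each hex digit maps to exactly four bits, neither direction needs any arithmetic beyond the 16-entry lookup.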

1.1.4 Binary-coded decimal (BCD)

  • BCD represents each decimal digit with 4 bits, using codes 0000 to 1001 for digits 0–9.
    • Example: the denary number 3165 becomes BCD as 0011 0001 0110 0101 (per digit).
  • Two methods of storing BCD digits:
    • One BCD digit per byte: each 4-bit code occupies its own byte (upper nibble set to 0000), so a four-digit number takes four bytes.
    • Two BCD digits per byte: two 4-bit codes share each byte, so a four-digit number takes two bytes.
  • Uses and significance:
    • Useful for monetary values and fixed-point representations to avoid decimal rounding issues when displaying to users.
    • Allows precise decimal digits display (e.g., fixed-point currency like $1.31).
  • Extension: discuss the issue that arises when adding BCD digits: if the binary sum of two digits exceeds 9, the result is not a valid BCD code, so 6 (0110) is added to correct the digit and generate a carry into the next decimal digit.
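
A short sketch of BCD encoding and single-digit BCD addition follows. It is a hedged illustration under stated assumptions: the function names are invented for this example, and the add-6 correction is the standard technique for BCD addition rather than something spelled out in the source.

```python
from typing import Tuple


def bcd_nibbles(number: int) -> str:
    """Show each decimal digit of a non-negative integer as a 4-bit code."""
    if number < 0:
        raise ValueError("only non-negative integers are handled here")
    return " ".join(format(int(digit), "04b") for digit in str(number))


def to_packed_bcd(number: int) -> bytes:
    """Store two BCD digits per byte (a leading 0 digit is added if needed)."""
    if number < 0:
        raise ValueError("only non-negative integers are handled here")
    digits = str(number)
    if len(digits) % 2:
        digits = "0" + digits
    return bytes(
        (int(digits[i]) << 4) | int(digits[i + 1])
        for i in range(0, len(digits), 2)
    )


def add_bcd_digits(a: int, b: int, carry_in: int = 0) -> Tuple[int, int]:
    """Add two BCD digits; return (result digit, carry out)."""
    total = a + b + carry_in
    if total > 9:
        # Codes 1010 to 1111 are unused in BCD, so skip over those six values.
        total += 6
    return total & 0x0F, total >> 4


print(bcd_nibbles(3165))          # 0011 0001 0110 0101
print(to_packed_bcd(3165).hex())  # 3165
print(add_bcd_digits(7, 5))       # (2, 1)
```

In the last call, 7 + 5 = 12 is 1100 in binary, which is not a valid BCD code; adding 6 gives 1 0010, read as the digit 2 with a carry of 1 into the next decimal digit.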

1.1.5 ASCII codes and Unicode

  • ASCII (7-bit) codes:
    • Range: 0-127 (0x00–0x7F in hex).
    • Includes letters, digits, punctuation, and control codes (0–31, plus 127 for delete).
    • Extended ASCII uses 8 bits (0–255) to support additional symbols and characters.
  • Examples and relationships:
    • Uppercase letters (A–Z) and lowercase (a–z) are assigned distinct codes; the sixth bit from the right (weighting 32, 0x20) differentiates case: 0x41 (0100 0001) for 'A', 0x61 (0110 0001) for 'a'.
    • ASCII tables group characters in sequence to ease use.
  • Unicode:
    • Aims to represent all languages and scripts; first 128 characters overlap with ASCII.
    • Encoding sizes commonly used: 16-bit or 32-bit per character; up to 4 bytes per character in modern encodings (UTF-8/UTF-16/UTF-32 families).
    • Unicode goals include universal standard, more efficient encoding than ASCII, unambiguous encoding for each character, and private-use areas for user-specific characters.
  • Practical notes
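    • Standard ASCII needs 7 bits per character and is normally stored in 1 byte, as is extended ASCII; Unicode can require up to 4 bytes per character depending on the encoding (1–4 bytes in UTF-8, 2 or 4 in UTF-16, always 4 in UTF-32).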
    • Unicode enables global software compatibility across languages and platforms.
  • Additional data: sample Unicode character block shows extensive character sets beyond ASCII (Russian, Greek, Romanian, Croatian, etc.).
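
The case bit and the variable size of Unicode characters can be checked with a few lines of Python. This is only a sketch; toggle_case and utf8_length are hypothetical helper names, and the sample characters are chosen for illustration rather than taken from the source.

```python
def toggle_case(ch: str) -> str:
    """Flip the case of a single ASCII letter by inverting the 0x20 bit."""
    if len(ch) != 1 or not (ch.isascii() and ch.isalpha()):
        return ch  # leave anything that is not an ASCII letter unchanged
    return chr(ord(ch) ^ 0x20)


def utf8_length(ch: str) -> int:
    """Number of bytes the character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


print(hex(ord("A")), hex(ord("a")))  # 0x41 0x61
print(toggle_case("A"))              # a

# One ASCII letter, one accented Latin letter, the euro sign, one emoji.
for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
    print(f"U+{ord(ch):04X} takes {utf8_length(ch)} byte(s) in UTF-8")
```

The loop reports 1, 2, 3 and 4 bytes respectively, showing that the first 128 code points keep their single-byte ASCII values in UTF-8 while other characters need more.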

1.1.6 Memory sizes and IEC standard

  • Memory size terminology (Table 1.1 in the source):
    • 1 kilobyte (1 KB) = 1000 bytes (decimal SI unit).
    • 1 megabyte (1 MB) = 1,000,000 bytes.
    • 1 gigabyte (1 GB) = 1,000,000,000 bytes.
    • 1 terabyte (1 TB) = 1,000,000,000,000 bytes.
    • 1 petabyte (1 PB) = 1,000,000,000,000,000 bytes.
  • IEC (binary) prefixes offer more accurate representations for memory:
    • 1 kibibyte (1 KiB) = 2^{10} = 1024 bytes.
    • 1 mebibyte (1 MiB) = 2^{20} = 1,048,576 bytes.
    • 1 gibibyte (1 GiB) = 2^{30} = 1,073,741,824 bytes.
    • 1 tebibyte (1 TiB) = 2^{40} = 1,099,511,627,776 bytes.
    • 1 pebibyte (1 PiB) = 2^{50} = 1,125,899,906,842,624 bytes.
  • Rationale: IEC prefixes are more accurate for binary computer memory usage; RAM and internal memories are better described by the IEC system.
  • Practical example: 64 GiB of RAM can store 64 × 2^{30} = 68,719,476,736 bytes.
  • Relevance to file sizing: helps avoid confusion when calculating file sizes or RAM requirements.
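
The difference between the two prefix systems is easy to see numerically. The sketch below assumes two lookup dictionaries (SI_UNITS and IEC_UNITS) and a helper to_bytes, all invented for this example; the unit symbols follow the ones used in these notes.

```python
# Decimal (SI) prefixes: powers of 1000.
SI_UNITS = {"KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12, "PB": 10**15}
# Binary (IEC) prefixes: powers of 1024.
IEC_UNITS = {"KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40, "PiB": 2**50}


def to_bytes(value: float, unit: str) -> float:
    """Convert a quantity in the given SI or IEC unit to bytes."""
    factor = SI_UNITS.get(unit) or IEC_UNITS.get(unit)
    if factor is None:
        raise ValueError(f"unknown unit: {unit}")
    return value * factor


print(to_bytes(64, "GiB"))  # 68719476736
# A drive sold as 1 TB holds fewer than 1000 GiB:
print(round(to_bytes(1, "TB") / IEC_UNITS["GiB"], 2))  # 931.32
```

The gap between the two systems grows with each prefix: about 2.4% at the kilo/kibi level and about 12.6% at the peta/pebi level.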

1.1.3 Hexadecimal number system (recap)

  • Hexadecimal as a bridge between binary and denary:
    • Weights: 16^3, 16^2, 16^1, 16^0 = 4096, 256, 16, 1.
    • Each hex digit corresponds to exactly four binary bits: 1 hex digit = 4 bits.
  • Software tooling uses hex for memory dumps and low-level data inspection due to compact readability of binary data.
  • Conversions:
    • To convert binary to hex: group into 4-bit chunks from right; pad leftmost chunk with zeros if needed.
    • To convert hex to binary: replace each hex digit with its 4-bit binary equivalent using a lookup table (Table 1.3 in the source).

1.1.7 Use of memory-related formats (memory dumps, tables)

  • Binary–to–denary/denary–to–binary conversion practice is commonly tested via activities (convert, overflow handling, etc.).
  • Memory dumps: hexadecimal representation of memory contents is easier to read and trace; essential for debugging and memory analysis.

1.2 Multimedia

  • Key terms
    • Bit-map image: image composed of pixels; each pixel has colour information stored as bits.
    • Pixel: smallest picture element; colour depth defines how many bits per pixel.
    • Colour depth: number of bits used to represent the colour of a pixel; e.g., 8-bit colour depth allows 2^8 = 256 colours.
    • Bit depth vs. colour depth: bit depth is the number of bits used for a single sample (e.g., a sample of sound or a single pixel); colour depth for images can be higher (e.g., 24-bit true colour).
    • Image resolution: total number of pixels in an image (e.g., 4096 × 3192 = 13,074,432 pixels).
    • Screen resolution: number of horizontal by vertical pixels on a display screen.
    • Pixel density: number of pixels per square centimeter; relates to perceived sharpness.
    • Vector graphics: images defined by 2D points, lines, curves, and properties; scalable without loss of quality.
    • Sampling resolution (bit depth) and sampling rate (samples per second): determine the fidelity of digitised sound.
    • Frame rate: number of video frames per second.
  • Bit-map images (section 1.2.1)
    • Stored as a 2D matrix of pixels; each pixel can be represented by 1, 8, 16, 24, or 32 bits, etc.
    • True colour: typically 24 bits per pixel (8 bits per colour channel: R, G, B).
    • Higher bit depth = more possible colours, larger file size.
    • Display considerations: if screen resolution < image resolution, scaling or cropping may be required.
  • Vector graphics (section 1.2.2)
    • Differences from bitmaps:
      • Vector graphics describe shapes via geometric primitives and attributes, not pixels.
      • Scaling can be done without loss of quality; smaller file sizes for simple graphics; not always realistic for photos.
    • When to use:
      • Resizeable graphics (logos, CAD drawings, exploded diagrams) are better as vectors.
      • Photographs are better as bitmaps (raster images).
    • Typical formats: vector: .svg, .cgm, .odg; bitmap: .jpeg, .bmp, .png.
  • Sound files (section 1.2.3)
    • Sound is analogue and must be digitised via an analogue-to-digital converter (ADC).
    • After conversion, sampling rate (samples per second) and sampling resolution (bit depth) determine fidelity and file size.
    • Higher sampling rate and resolution yield better sound but larger file sizes.
    • CDs use a 16-bit sampling resolution at a 44.1 kHz sampling rate. Filtering removes frequencies outside the range of human hearing to save space (perceptual shaping).
    • Amplitude and frequency determine the waveform; higher bit depth yields a larger dynamic range.
  • Video (section 1.2.4)
    • Digital video is stored as a sequence of still images (frames) played back at a given frame rate; motion JPEG, which compresses each frame as a JPEG image, is one encoding approach.
    • Video compression is essential for streaming and storage.
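
The effect of sampling rate, sampling resolution and frame rate on file size can be estimated with simple multiplication. The sketch below is an assumption-laden illustration: the function names and the channels parameter are not from the source, and the figures are for uncompressed data with no header.

```python
def sound_file_bytes(sample_rate_hz: int, bit_depth: int,
                     seconds: float, channels: int = 1) -> float:
    """Uncompressed sound size: samples per second x bits per sample
    x duration x number of channels, converted from bits to bytes."""
    return sample_rate_hz * bit_depth * seconds * channels / 8


def video_bytes(width: int, height: int, colour_depth: int,
                frame_rate: int, seconds: float) -> float:
    """Uncompressed video size: one bit-map image per frame."""
    bytes_per_frame = width * height * colour_depth / 8
    return bytes_per_frame * frame_rate * seconds


# One minute of CD-quality stereo sound (16-bit, 44.1 kHz, 2 channels).
print(sound_file_bytes(44_100, 16, 60, channels=2))  # 10584000.0
# One second of 1920 x 1080, 24-bit video at 25 frames per second.
print(video_bytes(1920, 1080, 24, 25, 1))            # 155520000.0
```

One second of uncompressed video at these settings already exceeds 150 MB, which illustrates why video compression is treated as essential.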

1.3 File compression

  • Objective: reduce file size while maintaining acceptable fidelity; two main categories: lossless and lossy.
  • Key terms
    • Lossless compression: original file can be perfectly reconstructed after decompression (e.g., Run-Length Encoding – RLE).
    • Lossy compression: some data is discarded; decompressed file is not identical to the original (e.g., MP3, JPEG).
    • JPEG: lossy image compression based on perceptual limitations of human vision.
    • MP3/MP4: lossy compression for audio and multimedia; MP4 can store audio, video, images, and animation.
    • Perceptual shaping: discards data outside the range of human perception to reduce file size while maintaining perceived quality.
    • Bit rate: number of bits per second in a stream; higher bit rates yield better quality but larger files.
    • Run-length encoding (RLE): a lossless technique that encodes runs of identical data as a count followed by the data value.
  • Lossless vs. lossy: key differences and when each is appropriate (e.g., documents vs. media).
  • File compression applications (MP3/MP4, JPEG, SVG)
    • MP3 uses perceptual encoding to reduce audio data by removing inaudible components; typical bit rates range from ~80 to ~320 kbps; 200 kbps is common for high-quality audio.
    • MP4 stores multimedia data (audio, video, images, and more) in a single container; supports streaming with reduced file sizes.
    • JPEG compresses bitmap images with lossy compression; commonly reduces file size by factor 5–15 depending on quality settings.
    • SVG is a text-based vector format; compression can be applied to the XML text (e.g., gzip).
  • Run-length encoding (RLE) (Section 1.3.1 & 1.3.3)
    • Concept: replace runs of identical data with a pair (count, value).
    • Example use: 8x8 grid of pixels (or a string of identical characters) -> reduced storage when runs are long.
    • Effectiveness depends on the data having long runs of identical values; less effective for highly varied data.
  • General methods of compression (Section 1.3.2)
    • Practical, non-algorithmic approaches for reducing file size:
      • Reduce sampling rate or sampling resolution (for audio) and frame rate (for video).
      • Crop or resize images.
      • Decrease colour depth/bit depth.
      • Reduce image resolution where acceptable.
  • Practical calculations and examples (from the chapters)
    • Bit-map file size estimate: for a full-screen image with resolution W × H and bit depth b:
      \text{bits} = W \times H \times b,
      \quad \text{bytes} = \frac{W \times H \times b}{8}.
      Example: 1920 × 1080 with 24-bit colour yields
      1920 \times 1080 \times 24 = 49{,}766{,}400\ \text{bits}, which is
      6{,}220{,}800\ \text{bytes} \approx 6.22\,\text{MB (SI units)}.
  • Header information for image files
    • Important fields in a file header include: file type/format (e.g., .bmp or .jpeg), file size, image resolution, bit depth, and any compression method used.
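
Run-length encoding as described in this section, with each run stored as a (count, value) pair, can be sketched as follows. The function names are assumptions, and the sample strings stand in for a row of pixels; this is one simple way to express the idea rather than a specific file format.

```python
from itertools import groupby


def rle_encode(data: str) -> list:
    """Replace each run of identical symbols with a (count, value) pair."""
    return [(len(list(run)), value) for value, run in groupby(data)]


def rle_decode(pairs: list) -> str:
    """Rebuild the original data exactly (lossless)."""
    return "".join(value * count for count, value in pairs)


row = "WWWWWWBBBBWWWWWW"  # 16 pixels: W = white, B = black
encoded = rle_encode(row)
print(encoded)                     # [(6, 'W'), (4, 'B'), (6, 'W')]
print(rle_decode(encoded) == row)  # True

# Highly varied data gains nothing: four symbols become four pairs.
print(rle_encode("ABCD"))  # [(1, 'A'), (1, 'B'), (1, 'C'), (1, 'D')]
```

The first row shrinks from 16 symbols to 3 pairs, while the varied string doubles in size once the counts are stored, which is why RLE suits data with long runs.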

Section-specific activities and notes (summary of typical exam-style prompts)

  • Convert various 8-bit binary numbers to denary using two’s-complement representation and determine the maximum range for 8-bit numbers.
  • Convert 8-bit binary numbers to BCD and vice versa; interpret BCD digits as decimal digits and understand its use in monetary values.
  • Explain why overflow can occur when adding two positive 8-bit numbers and how the 9th bit is treated in two’s-complement arithmetic.
  • Explain the difference between 8-bit sign and magnitude, one’s complement, and two’s complement representations, with practical guidance on why two’s complement is preferred.
  • Calculate the file size needed for a bit-map image given resolution and bit depth; discuss how headers affect total file size and why compression is used.
  • Compare bit-map and vector graphics in terms of scalability, realism, and typical use cases (logos vs. photos).
  • Explain sampling rate, sampling resolution (bit depth), frame rate, and their impact on sound/video quality and file size; give practical examples like CD-quality audio (16-bit, 44.1 kHz) vs. other formats.
  • Describe lossless vs. lossy compression, give examples (RLE, JPEG, MP3), and explain perceptual shaping.
  • Understand memory units and the difference between decimal (SI) prefixes and IEC binary prefixes (KiB, MiB, etc.).
  • Explain the ASCII and Unicode systems, including why Unicode is needed for multilingual text and how character size can vary (1 byte vs. 2–4 bytes per character).

Additional notes for quick reference (formulas and mappings)

  • Binary weights (8-bit): 2^7=128, 2^6=64, 2^5=32, 2^4=16, 2^3=8, 2^2=4, 2^1=2, 2^0=1
  • Two’s complement range for 8 bits: [-2^7, 2^7-1] = [-128, 127]
  • Hexadecimal to binary: one hex digit = four bits; example: hex "A7" = 1010 0111
  • Binary to hex: group into 4-bit chunks from the least significant end; pad on the left if needed.
  • Memory sizes (decimal vs. IEC):
    • SI: 1 KB = 1000 bytes, 1 MB = 10^6 bytes, 1 GB = 10^9 bytes, 1 TB = 10^12 bytes.
    • IEC: 1 KiB = 2^{10} = 1024 bytes, 1 MiB = 2^{20}, 1 GiB = 2^{30}, 1 TiB = 2^{40}.
  • Image data size example (bit-map): for 1920 × 1080 at 24-bit colour:
    1920 \times 1080 \times 24 = 49{,}766{,}400\text{ bits}
    \frac{49{,}766{,}400}{8} = 6{,}220{,}800\text{ bytes} \approx 6.22\,\text{MB (SI)}.
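
The bit-map size formula can be checked with a small helper. This sketch assumes a hypothetical function name, ignores the file header, and rounds up to a whole byte when the bit count is not a multiple of 8.

```python
def bitmap_bytes(width: int, height: int, bit_depth: int) -> int:
    """Raw pixel data size in bytes: width x height x bit depth / 8."""
    bits = width * height * bit_depth
    return (bits + 7) // 8  # round up to a whole number of bytes


size = bitmap_bytes(1920, 1080, 24)
print(size)                      # 6220800
print(round(size / 10**6, 2))    # 6.22 (MB, SI)
print(round(size / 2**20, 2))    # 5.93 (MiB, IEC)
```

The same pixel data is about 6.22 MB in SI units but about 5.93 MiB in IEC units, so the unit system should always be stated alongside the figure.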