Computer Science: Number Systems and Data Representation

Introduction to the Number System

Conceptual Overview: Modern computing relies on different number systems to represent data, primarily Binary, Denary, and Hexadecimal.
Denary System (Base 10):
* Uses 10 digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9.
* Place values are based on powers of 10:
* $10^0 = 1$ (Ones)
* $10^1 = 10$ (Tens)
* $10^2 = 100$ (Hundreds)
* Explanation: To find a value, multiply the digit by the place value and sum the results.
* Example (365): $(3 \times 100) + (6 \times 10) + (5 \times 1) = 365$ .

The Binary System

Motivation:
* Computers consist of millions of tiny switches that can exist in only two states: On or Off.
* 1 represents On; 0 represents Off.
* Because computers only process electronic signals, all information must be transformed into binary format to be processed.

Structure (Base 2):
* Uses 2 digits: 0 and 1.
* Place values are based on powers of 2:
* $2^0 = 1$
* $2^1 = 2$
* $2^2 = 4$
* $2^3 = 8$
* $2^4 = 16$
* $2^5 = 32$
* $2^6 = 64$
* $2^7 = 128$

Conversion Procedures: Binary and Denary

Binary to Denary Conversion:
* Method: Multiply the digit value (0 or 1) by its corresponding place value ( $2^n$ ) and sum the results.
* Example 1 (111): $(1 \times 4) + (1 \times 2) + (1 \times 1) = 7$ .
* Example 2 (1011): $(1 \times 8) + (0 \times 4) + (1 \times 2) + (1 \times 1) = 11$ .
* DIY Exercise: The binary form of 15 is $1111$ ( $8 + 4 + 2 + 1$ ).
* DIY Exercise: The denary form of "1010" is 10 ( $1 \times 8 + 1 \times 2$ ).
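The multiply-and-sum method above can be sketched in Python. This is a minimal illustration, not a definitive implementation; the function name `binary_to_denary` is an assumption introduced here.

```python
def binary_to_denary(bits: str) -> int:
    """Multiply each binary digit by its place value (2**n) and sum the results."""
    total = 0
    # The rightmost digit has place value 2**0, so walk the string in reverse.
    for position, digit in enumerate(reversed(bits)):
        total += int(digit) * 2 ** position
    return total


print(binary_to_denary("111"))   # 7
print(binary_to_denary("1011"))  # 11
print(binary_to_denary("1010"))  # 10
```

Python's built-in `int("1011", 2)` gives the same result; the loop is written out only to mirror the manual method.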
Denary to Binary Conversion (Method 1 - Place Value Subtraction):
* Identify the largest power of 2 that fits into the denary number.
* Place a '1' in that column and subtract the value from the total. Repeat with the remainder until the total is 0, placing a '0' in every column whose value does not fit.
* Example (5 to Binary): $5 = 4 + 1$ . Place a 1 in column 4, 0 in column 2, and 1 in column 1. Result: $101$ .
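Method 1 can be sketched as a loop over the place-value columns, from the largest to the smallest. The function name and the `width` parameter (the number of columns, defaulting to 8) are assumptions made for this illustration.

```python
def denary_to_binary_subtraction(value: int, width: int = 8) -> str:
    """Place-value subtraction: put a 1 in every column that fits, subtract, repeat."""
    if value < 0 or value >= 2 ** width:
        raise ValueError("value does not fit in the given number of columns")
    bits = ""
    for power in range(width - 1, -1, -1):
        place = 2 ** power
        if place <= value:
            bits += "1"       # this column fits: record a 1 and subtract it
            value -= place
        else:
            bits += "0"       # this column does not fit: record a 0
    return bits


print(denary_to_binary_subtraction(5, width=3))  # 101
print(denary_to_binary_subtraction(39))          # 00100111
```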
Denary to Binary Conversion (Method 2 - Successive Division):
* Procedure: Divide the denary number by 2 and record the remainder (0 or 1). Continue dividing the quotient by 2 until the result is 0. Read the remainders from bottom to top.
* Example (39 to Binary):
* $39 \div 2 = 19$ remainder 1
* $19 \div 2 = 9$ remainder 1
* $9 \div 2 = 4$ remainder 1
* $4 \div 2 = 2$ remainder 0
* $2 \div 2 = 1$ remainder 0
* $1 \div 2 = 0$ remainder 1
* Result: $100111$
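The successive-division procedure translates almost directly into code. The sketch below is one possible version (the function name is an assumption); it collects the remainders in the order they are produced and then reverses them, which corresponds to reading from bottom to top.

```python
def denary_to_binary_division(value: int) -> str:
    """Successive division by 2, reading the remainders from last to first."""
    if value == 0:
        return "0"
    remainders = []
    while value > 0:
        value, remainder = divmod(value, 2)  # quotient carries on, remainder is recorded
        remainders.append(str(remainder))
    # The first remainder is the least significant bit, so reverse the list.
    return "".join(reversed(remainders))


print(denary_to_binary_division(39))  # 100111
```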

The Hexadecimal System

Motivation:
* Hexadecimal is used because it is easier for humans to read and write than long strings of binary digits.
* It uses fewer characters and is less error-prone when copying data.
Structure (Base 16):
* Uses 16 digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F.
* Mapping to Denary:
* $A = 10$
* $B = 11$
* $C = 12$
* $D = 13$
* $E = 14$
* $F = 15$
* Place Values: $16^0 = 1$ , $16^1 = 16$ , $16^2 = 256$ , $16^3 = 4096$ .

Conversion Procedures: Hexadecimal, Binary, and Denary

Binary to Hexadecimal Conversion:
* Principle: Since $16 = 2^4$ , four binary digits (a nibble) are equivalent to one hexadecimal digit.
* Procedure: Group the binary string into sets of 4 bits (starting from the right). Convert each group individually.
* Example (101111100001):
* 1011 = B
* 1110 = E
* 0001 = 1
* Result: BE1.
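The grouping procedure can be sketched as follows. The names `HEX_DIGITS` and `binary_to_hex` are assumptions for this illustration; the sketch pads the string with leading zeros when its length is not a multiple of 4, so that grouping effectively starts from the right.

```python
HEX_DIGITS = "0123456789ABCDEF"


def binary_to_hex(bits: str) -> str:
    """Group bits into nibbles (4 bits) and convert each nibble to one hex digit."""
    # Round the length up to a multiple of 4 and pad on the left with zeros.
    padded_length = -(-len(bits) // 4) * 4
    bits = bits.zfill(padded_length)
    nibbles = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return "".join(HEX_DIGITS[int(nibble, 2)] for nibble in nibbles)


print(binary_to_hex("101111100001"))  # BE1
print(binary_to_hex("11111"))         # 1F
```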
Hexadecimal to Binary Conversion:
* Procedure: Convert each hex digit into its 4-bit binary equivalent.
* Example (F935):
* F = 1111
* 9 = 1001
* 3 = 0011
* 5 = 0101
* Result: 1111 1001 0011 0101
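Going the other way, each hex digit expands to exactly four bits. A minimal sketch, with an assumed function name, that keeps the groups separated by spaces as in the example above:

```python
def hex_to_binary(hex_string: str) -> str:
    """Convert each hex digit to its 4-bit binary equivalent."""
    # format(..., "04b") writes a number in binary, padded to 4 digits.
    groups = [format(int(digit, 16), "04b") for digit in hex_string]
    return " ".join(groups)


print(hex_to_binary("F935"))  # 1111 1001 0011 0101
```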
Hexadecimal to Denary Conversion:
* Method: Multiply the digit value by its place value ( $16^n$ ) and sum.
* Example (45A): $(4 \times 256) + (5 \times 16) + (10 \times 1) = 1024 + 80 + 10 = 1114$ .
* DIY Exercise (BF08): $(11 \times 4096) + (15 \times 256) + (0 \times 16) + (8 \times 1) = 45056 + 3840 + 0 + 8 = 48904$ .
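The same multiply-and-sum idea works for base 16. In this sketch, `HEX_VALUES` and `hex_to_denary` are names assumed for illustration; the dictionary plays the role of the mapping table (A = 10 up to F = 15).

```python
# Map each hex digit to its denary value: "0" -> 0, ..., "A" -> 10, ..., "F" -> 15.
HEX_VALUES = {digit: value for value, digit in enumerate("0123456789ABCDEF")}


def hex_to_denary(hex_string: str) -> int:
    """Multiply each hex digit value by its place value (16**n) and sum."""
    total = 0
    for position, digit in enumerate(reversed(hex_string.upper())):
        total += HEX_VALUES[digit] * 16 ** position
    return total


print(hex_to_denary("45A"))   # 1114
print(hex_to_denary("BF08"))  # 48904
```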
Denary to Hexadecimal Conversion:
* Procedure: Successive division by 16. Record remainders.
* Example (2004 to Hex):
* $2004 \div 16 = 125$ remainder 4
* $125 \div 16 = 7$ remainder 13 (D)
* $7 \div 16 = 0$ remainder 7
* Result: 7D4
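Successive division by 16 can be sketched in the same way as division by 2. The function name is an assumption; as before, the remainders are read from last to first.

```python
def denary_to_hex(value: int) -> str:
    """Successive division by 16, reading the remainders from last to first."""
    digits = "0123456789ABCDEF"
    if value == 0:
        return "0"
    remainders = []
    while value > 0:
        value, remainder = divmod(value, 16)
        remainders.append(digits[remainder])  # e.g. remainder 13 is written as D
    return "".join(reversed(remainders))


print(denary_to_hex(2004))  # 7D4
```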

Binary Arithmetic: Addition and Shifting

Addition Rules:
* $0 + 0 = 0$
* $0 + 1 = 1$
* $1 + 0 = 1$
* $1 + 1 = 10$ (0 carry 1)
* $1 + 1 + 1 = 11$ (1 carry 1)
Overflow Condition:
* An 8-bit binary register can hold a maximum value of 255 ( $2^8 - 1$ ).
* If an addition results in a 9th bit, this is an overflow error.
* The sum is too large to be stored in the assigned 8 bits.
* Example: $110_{10} + 222_{10} = 332_{10}$ . Since 332 > 255, overflow occurs.
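Python integers do not overflow, so an 8-bit register has to be simulated. The sketch below (the function name `add_8bit` is an assumption) keeps only the lowest 8 bits of the sum and reports whether a 9th bit was needed.

```python
from typing import Tuple


def add_8bit(a: int, b: int) -> Tuple[int, bool]:
    """Add two values as an 8-bit register would: keep 8 bits, flag any overflow."""
    total = a + b
    overflow = total > 0xFF   # a 9th bit would be needed
    stored = total & 0xFF     # only the lowest 8 bits fit in the register
    return stored, overflow


stored, overflow = add_8bit(110, 222)       # the true sum is 332
print(format(stored, "08b"), overflow)      # 01001100 True
```

The stored value here is 76, which is 332 with the 9th bit (256) missing.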
Binary Shifting:
* Used by the CPU for rapid multiplication and division.
* Multiplication (Left Shift): Shift bits to the left and fill empty spaces with 0.
* 1 place shift = $\times 2$
* 2 place shift = $\times 4$
* $n$ place shift = $\times 2^n$
* Data Loss: Shifting a significant bit (1) beyond the left-hand register boundary results in overflow; the bit is lost and the stored result is no longer the correct product.
* Division (Right Shift): Shift bits to the right.
* 1 place shift = $\div 2$
* $n$ place shift = $\div 2^n$
* Precision Loss: Shifting a 1 off the right side (past the "Least Significant Bit") discards the fractional part of the result (e.g., $11 \div 2$ gives 5 instead of 5.5).
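Both kinds of shift, and both kinds of data loss, can be demonstrated with Python's `<<` and `>>` operators. The helper names below are assumptions, and the `& 0xFF` mask is used to imitate an 8-bit register, since Python integers are not limited in size.

```python
def shift_left_8bit(value: int, places: int) -> int:
    """Left shift in an 8-bit register: bits pushed past the 8th column are lost."""
    return (value << places) & 0xFF


def shift_right_8bit(value: int, places: int) -> int:
    """Right shift in an 8-bit register: bits pushed off the right end are lost."""
    return (value & 0xFF) >> places


print(shift_left_8bit(0b00010101, 2))   # 84  (21 x 4)
print(shift_left_8bit(0b11000000, 1))   # 128 (192 x 2 = 384 does not fit in 8 bits)
print(shift_right_8bit(0b00001011, 1))  # 5   (11 / 2 = 5.5, the fraction is discarded)
```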

Two's Complement (Negative Binary Representation)

Definition: A method for processors to represent negative numbers.
The Sign Bit: In an 8-bit two's complement system, the leftmost bit (8th column) is the sign bit with a value of $-128$ .
* 0 = Positive
* 1 = Negative
Conversion (Positive to Two's Complement):
* Step 1: Convert the positive number to 8-bit binary.
* Step 2: Ensure the leftmost bit is 0.
* Example (13): Result is $00001101$ .
Conversion (Negative Denary to Two's Complement):
* Step 1: Convert the absolute value (positive form) to binary.
* Step 2: Invert all bits (0 becomes 1, 1 becomes 0).
* Step 3: Add 1 to the result.
* Example (-67):
* Step 1: $67 = 01000011$
* Step 2: Invert = $10111100$
* Step 3: Add 1 = $10111101$
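The three steps for a negative number can be followed literally in code. This is a sketch under stated assumptions: the function name and the `bits` parameter (register width, defaulting to 8) are introduced here for illustration.

```python
def to_twos_complement(value: int, bits: int = 8) -> str:
    """Encode a denary integer as a two's complement bit string."""
    if not -(2 ** (bits - 1)) <= value < 2 ** (bits - 1):
        raise ValueError("value does not fit in the given number of bits")
    if value >= 0:
        return format(value, f"0{bits}b")      # positive: plain binary, leading 0
    magnitude = format(-value, f"0{bits}b")    # Step 1: absolute value in binary
    inverted = "".join("1" if b == "0" else "0" for b in magnitude)  # Step 2: invert
    result = (int(inverted, 2) + 1) % (2 ** bits)                    # Step 3: add 1
    return format(result, f"0{bits}b")


print(to_twos_complement(13))   # 00001101
print(to_twos_complement(-67))  # 10111101
```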
Conversion (Two's Complement to Denary):
* Sum the values, treating the leftmost bit as $-128$ .
* Example binary (10110011): $-128 + 32 + 16 + 2 + 1 = -77$ .
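Decoding uses the usual place values, except that the leftmost column counts as negative. A minimal sketch with an assumed function name:

```python
def from_twos_complement(bits: str) -> int:
    """Decode a two's complement bit string, treating the leftmost bit as negative."""
    width = len(bits)
    # For 8 bits the leftmost column is worth -128.
    total = -int(bits[0]) * 2 ** (width - 1)
    # The remaining columns have their normal positive place values.
    for position, digit in enumerate(reversed(bits[1:])):
        total += int(digit) * 2 ** position
    return total


print(from_twos_complement("10110011"))  # -77
print(from_twos_complement("00001101"))  # 13
```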

Data Representation: Text, Sound, and Image

Text Representation:
* Every character (letters, spaces, punctuation) is assigned a binary code via a Character Set.
* ASCII: Uses 7 bits (128 characters). Includes 26 uppercase letters, 26 lowercase letters, digits, punctuation, and control codes.
* Extended ASCII: Uses 8 bits (256 characters) to include non-English alphabets and graphics.
* Unicode: Uses variable-length encodings (UTF-8 uses 1 to 4 bytes per character; UTF-16 uses 2 or 4). A single 16-bit code allows 65,536 characters, and the full Unicode code space has room for over 1.1 million. Supports worldwide languages, emojis, and symbols.
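Character codes can be inspected with Python's built-in `ord` and `str.encode`. The two helper names below are assumptions for illustration; the second one uses UTF-8, one common Unicode encoding, to show that characters outside ASCII need more than one byte.

```python
def seven_bit_code(character: str) -> str:
    """Return the 7-bit binary code of an ASCII character."""
    code = ord(character)
    if code > 127:
        raise ValueError("not an ASCII character")
    return format(code, "07b")


def utf8_byte_count(character: str) -> int:
    """Return how many bytes UTF-8 needs to store one character."""
    return len(character.encode("utf-8"))


print(ord("A"), seven_bit_code("A"))  # 65 1000001
print(utf8_byte_count("A"))           # 1 (plain ASCII letter)
print(utf8_byte_count("\u00e9"))      # 2 (e with acute accent)
print(utf8_byte_count("\u20ac"))      # 3 (euro sign)
print(utf8_byte_count("\U0001F600"))  # 4 (an emoji)
```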
Sound Representation:
* Analog signals must be sampled to be converted to digital format.
* Sample: A digital recording of the sound wave amplitude at a specific time.
* Sampling Resolution (Bit Depth): The number of bits used to represent the amplitude. Higher resolution increases accuracy and dynamic range.
* Sample Rate: The number of samples taken per second, measured in Hertz (Hz).
* File Size Impact: Higher rates and bit depths lead to better quality and less distortion but produce larger files.
* CD Quality: 16-bit sampling resolution and 44.1 kHz sample rate (44,100 samples/sec).
Image Representation:
* Bitmap: A collection of bits that form a grid of pixels (picture elements).
* Colour Depth: The number of bits used per pixel to represent color.
* 1-bit: 2 colors (0, 1)
* 2-bit: 4 colors
* 8-bit: 256 colors
* 24-bit: Over 16 million colors
* Image Resolution: Total number of pixels in the grid (e.g., 1024 x 1080). Higher resolution improves detail but increases file size.

Data Storage and Calculations

Basic Units:
* 1 bit: Smallest unit (0 or 1).
* 1 nibble: 4 bits.
* 1 byte: 8 bits.

Binary Prefixes (Base 2):
* 1 Kibibyte (KiB): $2^{10} = 1024$ bytes.
* 1 Mebibyte (MiB): $2^{20} = 1,048,576$ bytes.
* 1 Gibibyte (GiB): $2^{30} = 1,073,741,824$ bytes.
* 1 Tebibyte (TiB): $2^{40}$ bytes.
* 1 Pebibyte (PiB): $2^{50}$ bytes.
File Size Formulas:
* Image Size (bits): $\text{Resolution (Pixels)} \times \text{Colour Depth (bits)}$
* Sound Size (bits): $\text{Sample Rate (Hz)} \times \text{Sample Resolution (bits)} \times \text{Duration (seconds)} \times \text{Channels}$
* Note: Channels is 1 for mono and 2 for stereo, so a stereo recording is twice the size of the same recording in mono.
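The two formulas can be checked with a short calculation. The function names are assumptions, and the worked figures (a 1024 x 1080 image at 24-bit colour depth, and one minute of stereo audio at CD quality) are illustrative inputs chosen here, not results taken from the notes.

```python
def image_size_bits(width_px: int, height_px: int, colour_depth_bits: int) -> int:
    """Image size = resolution (total pixels) x colour depth."""
    return width_px * height_px * colour_depth_bits


def sound_size_bits(sample_rate_hz: int, sample_resolution_bits: int,
                    duration_s: int, channels: int = 1) -> int:
    """Sound size = sample rate x sample resolution x duration x channels."""
    return sample_rate_hz * sample_resolution_bits * duration_s * channels


def bits_to_mib(bits: int) -> float:
    """Convert bits to mebibytes: 8 bits per byte, 2**20 bytes per MiB."""
    return bits / 8 / 2 ** 20


image_bits = image_size_bits(1024, 1080, 24)
print(image_bits, bits_to_mib(image_bits))            # 26542080 3.1640625

sound_bits = sound_size_bits(44_100, 16, 60, channels=2)
print(sound_bits, round(bits_to_mib(sound_bits), 2))  # 84672000 10.09
```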

Data Compression

Benefits: Saves storage space, reduces streaming time, decreases upload/download times, and reduces costs.
Lossy Compression:
* Eliminates unnecessary data permanently. The original file cannot be reconstructed.
* JPEG: Removes color shades humans cannot discern.
* MP3 (MPEG-1/2 Audio Layer III, often loosely called "MPEG-3"): Removes sounds outside human hearing range and uses perceptual music shaping.
* MP4 (MPEG-4): Used for multimedia/video.
Lossless Compression:
* None of the original detail is lost. Necessary for files like spreadsheets or programs where data loss would make the file unusable.
* Run-Length Encoding (RLE): Identifies adjacent, identical data items and encodes one value representing the item and another representing the count.
* Example: "ssssrrrrkkkjjjjj" becomes "4s4r3k5j".
* Limitation: Does not work well without repeated adjacent data.
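Run-length encoding in the count-then-character form used above can be sketched with `itertools.groupby`. The function names are assumptions, and the decoder assumes that the original data contains no digit characters, since digits are used for the counts.

```python
import re
from itertools import groupby


def rle_encode(text: str) -> str:
    """Replace each run of identical characters with its count and the character."""
    return "".join(f"{len(list(group))}{char}" for char, group in groupby(text))


def rle_decode(encoded: str) -> str:
    """Rebuild the original text (assumes the original contained no digits)."""
    pairs = re.findall(r"(\d+)(\D)", encoded)
    return "".join(char * int(count) for count, char in pairs)


print(rle_encode("ssssrrrrkkkjjjjj"))  # 4s4r3k5j
print(rle_decode("4s4r3k5j"))          # ssssrrrrkkkjjjjj
print(rle_encode("abcd"))              # 1a1b1c1d
```

The last call shows the limitation: with no repeated adjacent characters, the encoded form is twice as long as the input.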

Practical Applications of Hexadecimal

MAC Address (Media Access Control):
* A unique identifier assigned to a Network Interface Card (NIC).
* Format: Six pairs of hex digits (e.g., $97\text{-}5C\text{-}E1\text{-}39\text{-}4B\text{-}97$ ).
* First half: Identity of the manufacturer.
* Second half: Identity number of the device.
IP Addresses:
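* IPv4: 32-bit, usually written as four denary numbers separated by dots (e.g., 128.65.152.11), although it can also be written in hex.
* IPv6: 128-bit, divided into eight 16-bit segments, each written as up to four hexadecimal digits and separated by colons.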
HTML Colour Codes: Used in web design to define colors using hex triplets (e.g., #FF5733).
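A hex triplet is three two-digit hex numbers, one each for red, green, and blue. The sketch below (with an assumed function name) splits the code into pairs and converts each pair to denary.

```python
from typing import Tuple


def hex_colour_to_rgb(code: str) -> Tuple[int, ...]:
    """Split a hex triplet such as #FF5733 into its red, green, and blue values."""
    code = code.lstrip("#")
    # Each pair of hex digits is one byte: 00 to FF, i.e. 0 to 255.
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))


print(hex_colour_to_rgb("#FF5733"))  # (255, 87, 51)
```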
Other Uses: Assembly language, Error codes (referencing memory locations), Memory dumps/locations, and URLs.

Questions & Discussion

Q: Why use hex for MAC addresses?
* A: Hex is shorter than the equivalent binary (fewer characters), easier to read and understand, and less likely to contain mistakes during manual entry.
Q: What happens if you add two 8-bit binaries and get a 9-bit result?
* A: This is an overflow error; the cumulative value exceeds the capacity of the byte-sized register.
Q: Is HTML a programming language?
* A: No, it is a markup language used for the processing, definition, and presentation of text on web pages.
Q: Comparison of Lossy vs Lossless for sound?
* Advantage (Lossy): The file size is significantly smaller and requires less storage.
* Disadvantage (Lossy): Sound quality is reduced and the original state cannot be restored.
Q: Why use lossless for programs/executable files?
* A: Lossy compression would remove actual data/code. Programs need every bit intact to run correctly; otherwise, they will not work or will crash.