Chapter 2 - Data Representation
Chapter 2 Objectives
- Understand the fundamentals of numerical data representation and manipulation in computer systems.
- Master the skill of converting between different numeric-radix systems.
- Understand how errors can occur in computations because of overflow and truncation.
- Understand the fundamental concepts of floating-point representation.
- Gain familiarity with the most popular character codes.
Outline
- Converting between different numeric-radix systems
- Binary addition and subtraction
- Two’s complement representation
- Floating-point representation
- Characters in computer systems
2.1 Introduction
- Bit: The most basic unit of information in a computer, representing a state of either "on" or "off", "high" or "low" voltage in a digital circuit. In binary, it can be either "1" or "0".
- Byte: A group of 8 bits.
- The smallest possible unit of storage in computer systems.
- Nibble: A group of 4 bits; a byte consists of 2 nibbles: the "high-order" nibble and the "low-order" nibble.
- Word: A contiguous group of bytes (size can vary between computer systems, e.g., 2 bytes (16 bits), 4 bytes (32 bits), 8 bytes (64 bits)).
- Most Significant Bit (MSB) and Least Significant Bit (LSB)
- For the binary representation 1011100110101011, the leftmost bit is the MSB and the rightmost bit is the LSB.
Byte or Word Addressable
- A computer may make either bytes or words addressable. "Addressable" means a unit of storage that the CPU can retrieve by its memory location.
- In a byte-addressable system, the smallest addressable unit is a byte.
- In a word-addressable system, it is a word.
2.2 Positional Numbering Systems
- Bytes store numbers using the position of each bit to represent a power of 2 (radix of 2). The binary system is a base-2 system, while the decimal system is the base-10 system (powers of 10).
- The radix is written as a subscript when it could be ambiguous (e.g., 11001_{2}); base-10 numbers may also be subscripted for clarity (e.g., 947_{10}).
Example of Base-10 Representation
- Decimal Number: 947
- Detailed Representation: 947 = 9 \times 10^2 + 4 \times 10^1 + 7 \times 10^0
- Decimal Number: 5836.47_{10}
- Detailed Representation: 5836.47 = 5 \times 10^3 + 8 \times 10^2 + 3 \times 10^1 + 6 \times 10^0 + 4 \times 10^{-1} + 7 \times 10^{-2}
Example of Base-2 Representation
- Binary Number: 11001 (base-2)
- Conversion to Decimal: 11001 = 1 \times 2^4 + 1 \times 2^3 + 0 \times 2^2 + 0 \times 2^1 + 1 \times 2^0 = 25_{10}
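The positional expansion above can be checked with a short Python sketch (the helper name `to_decimal` is illustrative, not part of any library):

```python
def to_decimal(digits: str, base: int) -> int:
    """Evaluate a digit string in the given base via positional weights."""
    value = 0
    for d in digits:
        value = value * base + int(d)  # shift left one position, then add the digit
    return value

print(to_decimal("11001", 2))   # 25
print(to_decimal("947", 10))    # 947
```

Accumulating with `value * base + digit` is equivalent to summing each digit times its power of the radix, but avoids computing the powers explicitly.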
Practice Questions
- Convert (01111101)_2 to decimal.
- Convert (123)_{10} to binary.
- Convert (120)_3 to decimal.
Importance of Binary System
- Binary numbers underpin all data representations in computer systems.
- Proficiency with binary is essential for understanding computer operations and instruction sets.
2.3 Converting Between Bases
- Two primary methods convert base-10 numbers to other radices:
- Subtraction Method
- Division Method
Subtraction Method (Decimal to Base-3 Example)
- Convert 190_{10} to base-3:
- Powers of 3: 3^0=1, 3^1=3, 3^2=9, 3^3=27, 3^4=81, 3^5=243
- Step 1: For 3^4 = 81
- 190 / 81 = 2 (coefficient for 3^4)
- Remainder: 190 - (2 \times 81) = 190 - 162 = 28
- Step 2: For 3^3 = 27
- 28 / 27 = 1 (coefficient for 3^3)
- Remainder: 28 - (1 \times 27) = 28 - 27 = 1
- Step 3: For 3^2 = 9
- 1 / 9 = 0 (coefficient for 3^2)
- Remainder: 1 - (0 \times 9) = 1
- Step 4: For 3^1 = 3
- 1 / 3 = 0 (coefficient for 3^1)
- Remainder: 1 - (0 \times 3) = 1
- Step 5: For 3^0 = 1
- 1 / 1 = 1 (coefficient for 3^0)
- Remainder: 1 - (1 \times 1) = 0
- Result: Reading coefficients from 3^4 down to 3^0 gives 21001_{3}.
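The steps above can be sketched in Python (the function name `subtraction_method` is illustrative):

```python
def subtraction_method(n: int, base: int) -> str:
    """Convert n to the given base by subtracting multiples of descending powers."""
    if n == 0:
        return "0"
    power = 1
    while power * base <= n:        # largest power of the base not exceeding n
        power *= base
    digits = []
    while power >= 1:
        coeff = n // power          # coefficient for this power
        digits.append(str(coeff))
        n -= coeff * power          # subtract, keep the remainder
        power //= base
    return "".join(digits)

print(subtraction_method(190, 3))   # 21001
```

Each loop iteration mirrors one numbered step: divide to find the coefficient, subtract to get the remainder, move to the next lower power.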
Division Method (Decimal to Base-3 Example)
- Convert 190_{10} to base-3:
- Divide 190 by 3 repeatedly until the quotient is 0, tracking remainders.
- Example calculation:
- 190 \div 3 = 63 remainder 1
- 63 \div 3 = 21 remainder 0
- 21 \div 3 = 7 remainder 0
- 7 \div 3 = 2 remainder 1
- 2 \div 3 = 0 remainder 2
- Result: Reading the remainders from bottom to top gives 190_{10} = 21001_{3}
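The repeated division can be sketched as follows (the helper name `division_method` is illustrative):

```python
def division_method(n: int, base: int) -> str:
    """Repeatedly divide by the base; remainders read bottom-up give the digits."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, r = divmod(n, base)      # quotient carries forward, remainder is a digit
        remainders.append(str(r))
    return "".join(reversed(remainders))

print(division_method(190, 3))   # 21001
```

Reversing the remainder list implements the "read from bottom to top" step of the manual procedure.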
Exercises
- Convert 458_{10} to binary.
- Convert 652_{10} to binary.
- Use manual conversion, not a calculator.
Conversion of Fractional Numbers
- Fractional numbers can also be approximated in different bases. For example, 0.5 is exactly representable in binary and decimal, but not in base-3.
- Decimal Example: 0.47_{10} = 4 \times 10^{-1} + 7 \times 10^{-2}
- Binary Example: 0.11_{2} = 1 \times 2^{-1} + 1 \times 2^{-2} = 0.5 + 0.25 = 0.75_{10}
Methods for Fractional Conversion
- Subtraction Method: Similar to the integer case; start with the largest negative power of the radix and subtract.
- Multiplication Method: Multiply by the radix. Keep track of the integer part and fractional part.
Example of Fraction Conversion Using Subtraction Method
- Convert 0.8125_{10} to binary:
- Result is: 0.1101_{2}. Conversion stops when the remainder equals 0.
Example of Fraction Conversion Using Multiplication Method
- Convert 0.8125_{10} to binary:
- Multiply the fraction by 2; the integer part of each product is the next binary digit. Continue multiplying the remaining fractional part until it reaches 0 or the desired precision.
- Result: 0.1101_{2}
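The multiplication method can be sketched in Python (the helper name `fraction_to_base` is illustrative; `max_digits` caps the output for fractions that never terminate):

```python
def fraction_to_base(frac: float, base: int, max_digits: int = 8) -> str:
    """Multiplication method: multiply by the base, peel off the integer part."""
    digits = []
    while frac > 0 and len(digits) < max_digits:
        frac *= base
        digit = int(frac)           # integer part is the next digit
        digits.append(str(digit))
        frac -= digit               # continue with the fractional part
    return "0." + "".join(digits)

print(fraction_to_base(0.8125, 2))   # 0.1101
```

0.8125 terminates in four binary digits because it is a sum of powers of 1/2 (1/2 + 1/4 + 1/16); most decimal fractions would instead run to the `max_digits` cutoff.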
2.4 Binary and Hexadecimal Number Representation
- Binary to Hexadecimal Conversion: Binary strings (e.g., 11010100_{2}) are represented as hexadecimal numbers for compactness.
- Each hexadecimal digit corresponds to 4 binary digits (nibble). Both conversions can be done by grouping bits into nibbles.
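The nibble-grouping idea can be demonstrated directly (variable names here are illustrative):

```python
binary = "11010100"
# Group the bit string into nibbles of 4 and map each to one hex digit.
nibbles = [binary[i:i + 4] for i in range(0, len(binary), 4)]
hex_digits = "".join(format(int(nib, 2), "X") for nib in nibbles)
print(hex_digits)                            # D4

# The reverse direction expands each hex digit back into 4 bits.
back = format(int(hex_digits, 16), "08b")
print(back)                                  # 11010100
```

Because 16 = 2^4, the grouping is exact: no carries cross nibble boundaries in either direction.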
Exercises
- Convert binary 11010100_{2} to hexadecimal.
- Using hexadecimal to decimal conversion: Calculate 1234_{16}.
Overflow and Underflow in Binary Systems
- Overflow occurs when a result exceeds the maximum representable value; underflow occurs when a result falls below the minimum representable value.
Two’s Complement Representation
- The two’s complement system is used for signed integers, where the MSB indicates the sign (0 for positive, 1 for negative).
Conversion into Two's Complement
- Positive number: Same as its binary representation.
- Negative number: Negate each bit, then add 1 to the result.
Example of Two's Complement Representation
- To express decimal -14:
- Binary of 14 is 00001110.
- Negate (invert bits) gives 11110001.
- Add 1 results in 11110010.
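The invert-and-add-1 procedure can be checked in Python. Masking with `2**bits - 1` produces the same bit pattern as inverting and adding 1, since Python integers are unbounded (the helper name `twos_complement` is illustrative):

```python
def twos_complement(value: int, bits: int = 8) -> str:
    """Return the bits-wide two's complement bit pattern of value."""
    return format(value & ((1 << bits) - 1), f"0{bits}b")

print(twos_complement(14))     # 00001110
print(twos_complement(-14))    # 11110010
```

The MSB of the result acts as the sign bit: 0 for 14, 1 for -14, matching the worked example above.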
Addition and Subtraction in Two's Complement
- Add the two bit patterns and discard any carry out of the MSB. For subtraction, take the two's complement of the subtrahend and add. Overflow occurs when two operands with the same sign produce a result with the opposite sign.
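A minimal sketch of two's complement addition with carry discard and overflow detection (the function name `add_twos_complement` is illustrative):

```python
def add_twos_complement(a: int, b: int, bits: int = 8):
    """Add two bit patterns modulo 2**bits; flag signed overflow."""
    mask = (1 << bits) - 1
    result = (a + b) & mask                  # carry out of the MSB is discarded
    sign = 1 << (bits - 1)
    # Signed overflow: operands share a sign but the result's sign differs.
    overflow = (a & sign) == (b & sign) and (a & sign) != (result & sign)
    return result, overflow

# 14 + (-14): the carry out is discarded, giving 0 with no overflow.
total, ov = add_twos_complement(0b00001110, 0b11110010)
print(format(total, "08b"), ov)   # 00000000 False

# 127 + 1 exceeds the 8-bit signed range, so overflow is flagged.
total, ov = add_twos_complement(0b01111111, 0b00000001)
print(format(total, "08b"), ov)   # 10000000 True
```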
2.5 Floating-Point Representation
- Used for real numbers with fractional components. Floating-point consists of three parts: sign, exponent, and significand (also called mantissa).
IEEE-754 Standard
- Floating-point arithmetic uses formats defined in IEEE-754, which standardizes representation:
- Single precision: 1 bit sign, 8 bits exponent, 23 bits significand.
- Double precision: 1 bit sign, 11 bits exponent, 52 bits significand.
- Normalize the number; e.g., decimal -3.75 is normalized to a binary format.
- Example: Converting -3.75_{10} to IEEE-754 Single-Precision
- Convert absolute value to binary: 3.75_{10} = 11.11_{2}.
- Normalize: Move binary point: 11.11_{2} = 1.111 \times 2^1. (Exponent = 1)
- Determine Sign bit: Since the number is negative (-3.75), the sign bit is 1.
- Calculate Biased Exponent: For single precision, bias = 127. Biased exponent = 1 + 127 = 128_{10} = 10000000_{2}.
- Determine Significand (Mantissa): From 1.111 \times 2^1, the fractional part is 111. Pad with zeros to 23 bits: 11100000000000000000000_{2}.
- Combine components: Sign (1) | Exponent (10000000) | Mantissa (11100000000000000000000).
- Result (32-bit): 11000000011100000000000000000000_{2}.
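The hand-worked bit pattern can be verified with Python's standard `struct` module, which packs a float into its IEEE-754 single-precision bytes:

```python
import struct

# Pack -3.75 as an IEEE-754 single and view the raw 32 bits.
bits = struct.unpack(">I", struct.pack(">f", -3.75))[0]
pattern = format(bits, "032b")
sign, exponent, mantissa = pattern[0], pattern[1:9], pattern[9:]
print(sign, exponent, mantissa)   # 1 10000000 11100000000000000000000
```

The three fields match the manual derivation: sign 1, biased exponent 128, and the fraction 111 padded to 23 bits.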
Floating-Point Errors
- Understand that floating-point representations introduce approximations, which can compound errors in calculations. For example, small discrepancies can produce larger errors when operations are repeated.
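A classic illustration: 0.1 has no exact binary representation, so adding it repeatedly drifts away from the exact answer.

```python
# Ten additions of 0.1 do not sum to exactly 1.0 in binary floating point.
total = sum(0.1 for _ in range(10))
print(total == 1.0)               # False
print(abs(total - 1.0) < 1e-9)    # True: compare with a tolerance instead
```

This is why floating-point comparisons should generally use a tolerance rather than exact equality.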
Overflow and Underflow in Floating Point
- Requires careful handling: overflow can produce infinities or program faults, and underflow can cause results to round toward zero, losing precision.
Character Representation in Computers
- ASCII is the most common character encoding scheme.
- Maps 128 characters to 7-bit binary.
- Supports numbers, letters, and punctuation.
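A quick check of the ASCII mapping using Python's built-in `ord` and `chr`:

```python
# Each ASCII character maps to a 7-bit code point; ord and chr convert both ways.
code = ord("A")
print(code, format(code, "07b"))   # 65 1000001
print(chr(code + 1))               # B
```

Consecutive letters have consecutive code points, which is why arithmetic on code points (as above) steps through the alphabet.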
Introduction to Unicode
- Unicode extends ASCII to cover the characters of global writing systems. Common encodings of Unicode include UTF-8 and UTF-16.
Conclusion
- Computers store all data in binary; hexadecimal provides a compact notation for it, two's complement represents signed integers, and floating-point represents real numbers. For character representation, ASCII and Unicode play critical roles.