Chapter 2 - Data Representation
Chapter 2 Objectives
- Understand the fundamentals of numerical data representation and manipulation in computer systems.
- Master the skill of converting between different numeric-radix systems.
- Understand how errors can occur in computations because of overflow and truncation.
- Understand the fundamental concepts of floating-point representation.
- Gain familiarity with the most popular character codes.
Outline
- Converting between different numeric-radix systems
- Binary addition and subtraction
- Two’s complement representation
- Floating-point representation
- Characters in computer systems
2.1 Introduction
- Bit: The most basic unit of information in a computer, representing a state of either "on" or "off", "high" or "low" voltage in a digital circuit. In binary, it can be either "1" or "0".
- Byte: A group of 8 bits.
- The smallest possible unit of storage in computer systems.
- Nibble: A group of 4 bits; a byte consists of 2 nibbles: the "high-order" nibble and the "low-order" nibble.
- Word: A contiguous group of bytes (size can vary between computer systems, e.g., 2 bytes (16 bits), 4 bytes (32 bits), 8 bytes (64 bits)).
- Most Significant Bit (MSB) and Least Significant Bit (LSB)
- For the binary representation 1011100110101011, the leftmost bit is the MSB and the rightmost bit is the LSB.
Byte or Word Addressable
- A computer may make either bytes or words addressable. "Addressable" means a unit of storage that the CPU can retrieve by its memory location.
- In a byte-addressable system, the smallest addressable unit is a byte.
- In a word-addressable system, it is a word.
2.2 Positional Numbering Systems
- Bytes store numbers using the position of each bit to represent a power of 2 (radix of 2). The binary system is a base-2 system, while the decimal system is the base-10 system (powers of 10).
- The radix is written as a subscript when it could be ambiguous (e.g., 11001_{2}); base-10 numbers may also be subscripted for clarity (e.g., 947_{10}).
Example of Base-10 Representation
- Decimal Number: 947
- Detailed Representation: 947 = 9 \times 10^2 + 4 \times 10^1 + 7 \times 10^0
- Decimal Number: 5836.47_{10}
- Detailed Representation: 5836.47 = 5 \times 10^3 + 8 \times 10^2 + 3 \times 10^1 + 6 \times 10^0 + 4 \times 10^{-1} + 7 \times 10^{-2}
Example of Base-2 Representation
- Binary Number: 11001 (base-2)
- Conversion to Decimal: 11001 = 1 \times 2^4 + 1 \times 2^3 + 0 \times 2^2 + 0 \times 2^1 + 1 \times 2^0 = 25_{10}
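The positional expansion above can be checked with a short Python sketch (the helper name `to_decimal` is illustrative, not part of any library):

```python
def to_decimal(digits: str, base: int) -> int:
    """Evaluate a digit string in the given base via positional weights."""
    value = 0
    for d in digits:
        value = value * base + int(d)  # shift left one position, then add the digit
    return value

print(to_decimal("11001", 2))   # 25
print(to_decimal("947", 10))    # 947
```

Accumulating with `value * base + digit` is equivalent to summing each digit times its power of the radix, but avoids computing the powers explicitly.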
Practice Questions
- Convert (01111101)_2 to decimal.
- Convert (123)_{10} to binary.
- Convert (120)_3 to decimal.
Importance of Binary System
- Binary numbers underpin all data representations in computer systems.
- Proficiency with binary is essential for understanding computer operations and instruction sets.
2.3 Converting Between Bases
- Two primary methods convert base-10 numbers to other radices:
- Subtraction Method
- Division Method
Subtraction Method (Decimal to Base-3 Example)
- Convert 190_{10} to base-3:
- Powers of 3: 3^0=1, 3^1=3, 3^2=9, 3^3=27, 3^4=81, 3^5=243
- Step 1: For 3^4 = 81
- 190 / 81 = 2 (coefficient for 3^4)
- Remainder: 190 - (2 \times 81) = 190 - 162 = 28
- Step 2: For 3^3 = 27
- 28 / 27 = 1 (coefficient for 3^3)
- Remainder: 28 - (1 \times 27) = 28 - 27 = 1
- Step 3: For 3^2 = 9
- 1 / 9 = 0 (coefficient for 3^2)
- Remainder: 1 - (0 \times 9) = 1
- Step 4: For 3^1 = 3
- 1 / 3 = 0 (coefficient for 3^1)
- Remainder: 1 - (0 \times 3) = 1
- Step 5: For 3^0 = 1
- 1 / 1 = 1 (coefficient for 3^0)
- Remainder: 1 - (1 \times 1) = 0
- Result: Reading coefficients from 3^4 down to 3^0 gives 21001_{3}.
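The steps above can be sketched in Python (the function name `subtraction_method` is illustrative):

```python
def subtraction_method(n: int, base: int) -> str:
    """Convert n to the given base by subtracting multiples of descending powers."""
    if n == 0:
        return "0"
    power = 1
    while power * base <= n:        # largest power of the base not exceeding n
        power *= base
    digits = []
    while power >= 1:
        coeff = n // power          # coefficient for this power
        digits.append(str(coeff))
        n -= coeff * power          # subtract, keep the remainder
        power //= base
    return "".join(digits)

print(subtraction_method(190, 3))   # 21001
```

Each loop iteration mirrors one numbered step: divide to find the coefficient, subtract to get the remainder, move to the next lower power.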
Division Method (Decimal to Base-3 Example)
- Convert 190_{10} to base-3:
- Divide 190 by 3 repeatedly until the quotient is 0, tracking remainders.
- Example calculation:
- 190 \div 3 = 63 remainder 1
- 63 \div 3 = 21 remainder 0
- 21 \div 3 = 7 remainder 0
- 7 \div 3 = 2 remainder 1
- 2 \div 3 = 0 remainder 2
- Result: Reading the remainders from bottom to top gives 190_{10} = 21001_{3}
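The repeated division can be sketched as follows (the helper name `division_method` is illustrative):

```python
def division_method(n: int, base: int) -> str:
    """Repeatedly divide by the base; remainders read bottom-up give the digits."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, r = divmod(n, base)      # quotient carries forward, remainder is a digit
        remainders.append(str(r))
    return "".join(reversed(remainders))

print(division_method(190, 3))   # 21001
```

Reversing the remainder list implements the "read from bottom to top" step of the manual procedure.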
Exercises
- Convert 458_{10} to binary.
- Convert 652_{10} to binary.
- Use manual conversion, not a calculator.
Conversion of Fractional Numbers
- Fractional numbers can also be approximated in different bases. For example, 0.5 is exactly representable in binary and decimal, but not in base-3.
- Decimal Example: 0.47_{10} = 4 \times 10^{-1} + 7 \times 10^{-2}
- Binary Example: 0.11_{2} = 1 \times 2^{-1} + 1 \times 2^{-2} = 0.5 + 0.25 = 0.75_{10}
Methods for Fractional Conversion
- Subtraction Method: Similar to the integer case; start with the largest negative power of the radix and subtract.
- Multiplication Method: Multiply by the radix. Keep track of the integer part and fractional part.
Example of Fraction Conversion Using Subtraction Method
- Convert 0.8125_{10} to binary:
- Result is: 0.1101_{2}. Conversion stops when the remainder equals 0.
Example of Fraction Conversion Using Multiplication Method
- Convert 0.8125_{10} to binary:
- Multiply the fraction by 2; the integer part of each product is the next binary digit. Continue multiplying the remaining fractional part until it reaches 0 or the desired precision.
- Result: 0.1101_{2}
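The multiplication method can be sketched in Python (the helper name `fraction_to_base` is illustrative; `max_digits` caps the output for fractions that never terminate):

```python
def fraction_to_base(frac: float, base: int, max_digits: int = 8) -> str:
    """Multiplication method: multiply by the base, peel off the integer part."""
    digits = []
    while frac > 0 and len(digits) < max_digits:
        frac *= base
        digit = int(frac)           # integer part is the next digit
        digits.append(str(digit))
        frac -= digit               # continue with the fractional part
    return "0." + "".join(digits)

print(fraction_to_base(0.8125, 2))   # 0.1101
```

0.8125 terminates in four binary digits because it is a sum of powers of 1/2 (1/2 + 1/4 + 1/16); most decimal fractions would instead run to the `max_digits` cutoff.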
2.4 Binary and Hexadecimal Number Representation
- Binary to Hexadecimal Conversion: Binary strings (e.g., 11010100_{2}) are represented as hexadecimal numbers for compactness.
- Each hexadecimal digit corresponds to 4 binary digits (nibble). Both conversions can be done by grouping bits into nibbles.
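The nibble-grouping idea can be demonstrated directly (variable names here are illustrative):

```python
binary = "11010100"
# Group the bit string into nibbles of 4 and map each to one hex digit.
nibbles = [binary[i:i + 4] for i in range(0, len(binary), 4)]
hex_digits = "".join(format(int(nib, 2), "X") for nib in nibbles)
print(hex_digits)                            # D4

# The reverse direction expands each hex digit back into 4 bits.
back = format(int(hex_digits, 16), "08b")
print(back)                                  # 11010100
```

Because 16 = 2^4, the grouping is exact: no carries cross nibble boundaries in either direction.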
Exercises
- Convert binary 11010100_{2} to hexadecimal.
- Using hexadecimal to decimal conversion: Calculate 1234_{16}.
Overflow and Underflow in Binary Systems
- Overflow occurs when a result exceeds the maximum representable value; underflow occurs when a result falls below the minimum representable value.
Two’s Complement Representation
- The two’s complement system is used for signed integers, where the MSB indicates the sign (0 for positive, 1 for negative).
Conversion into Two's Complement
- Positive number: Same as its binary representation.
- Negative number: Negate each bit, then add 1 to the result.
Example of Two's Complement Representation
- To express decimal -14:
- Binary of 14 is 00001110.
- Negate (invert bits) gives 11110001.
- Add 1 results in 11110010.
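The invert-and-add-1 procedure can be checked in Python. Masking with `2**bits - 1` produces the same bit pattern as inverting and adding 1, since Python integers are unbounded (the helper name `twos_complement` is illustrative):

```python
def twos_complement(value: int, bits: int = 8) -> str:
    """Return the bits-wide two's complement bit pattern of value."""
    return format(value & ((1 << bits) - 1), f"0{bits}b")

print(twos_complement(14))     # 00001110
print(twos_complement(-14))    # 11110010
```

The MSB of the result acts as the sign bit: 0 for 14, 1 for -14, matching the worked example above.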
Addition and Subtraction in Two's Complement
- Add the two bit patterns and discard any carry out of the MSB. For subtraction, take the two's complement of the subtrahend and add. Overflow occurs when two operands with the same sign produce a result with the opposite sign.
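A minimal sketch of two's complement addition with carry discard and overflow detection (the function name `add_twos_complement` is illustrative):

```python
def add_twos_complement(a: int, b: int, bits: int = 8):
    """Add two bit patterns modulo 2**bits; flag signed overflow."""
    mask = (1 << bits) - 1
    result = (a + b) & mask                  # carry out of the MSB is discarded
    sign = 1 << (bits - 1)
    # Signed overflow: operands share a sign but the result's sign differs.
    overflow = (a & sign) == (b & sign) and (a & sign) != (result & sign)
    return result, overflow

# 14 + (-14): the carry out is discarded, giving 0 with no overflow.
total, ov = add_twos_complement(0b00001110, 0b11110010)
print(format(total, "08b"), ov)   # 00000000 False

# 127 + 1 exceeds the 8-bit signed range, so overflow is flagged.
total, ov = add_twos_complement(0b01111111, 0b00000001)
print(format(total, "08b"), ov)   # 10000000 True
```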
2.5 Floating-Point Representation
- Used for real numbers with fractional components. Floating-point consists of three parts: sign, exponent, and significand (also called mantissa).
IEEE-754 Standard
- Floating-point arithmetic uses formats defined in IEEE-754, which standardizes representation:
- Single precision: 1 bit sign, 8 bits exponent, 23 bits significand.
- Double precision: 1 bit sign, 11 bits exponent, 52 bits significand.
- Normalize the number; e.g., decimal -3.75 is normalized to a binary format.
- Example: Converting -3.75_{10} to IEEE-754 Single-Precision
- Convert absolute value to binary: 3.75_{10} = 11.11_{2}.
- Normalize: Move binary point: 11.11_{2} = 1.111 \times 2^1. (Exponent = 1)
- Determine Sign bit: Since the number is negative (-3.75), the sign bit is 1.
- Calculate Biased Exponent: For single precision, bias = 127. Biased exponent = 1 + 127 = 128_{10} = 10000000_{2}.
- Determine Significand (Mantissa): From 1.111 \times 2^1, the fractional part is 111. Pad with zeros to 23 bits: 11100000000000000000000_{2}.
- Combine components: Sign (1) | Exponent (10000000) | Mantissa (11100000000000000000000).
- Result (32-bit): 11000000011100000000000000000000_{2}.
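The hand-worked bit pattern can be verified with Python's standard `struct` module, which packs a float into its IEEE-754 single-precision bytes:

```python
import struct

# Pack -3.75 as an IEEE-754 single and view the raw 32 bits.
bits = struct.unpack(">I", struct.pack(">f", -3.75))[0]
pattern = format(bits, "032b")
sign, exponent, mantissa = pattern[0], pattern[1:9], pattern[9:]
print(sign, exponent, mantissa)   # 1 10000000 11100000000000000000000
```

The three fields match the manual derivation: sign 1, biased exponent 128, and the fraction 111 padded to 23 bits.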
Floating-Point Errors
- Understand that floating-point representations introduce approximations, which can compound errors in calculations. For example, small discrepancies can produce larger errors when operations are repeated.
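A classic illustration: 0.1 has no exact binary representation, so adding it repeatedly drifts away from the exact answer.

```python
# Ten additions of 0.1 do not sum to exactly 1.0 in binary floating point.
total = sum(0.1 for _ in range(10))
print(total == 1.0)               # False
print(abs(total - 1.0) < 1e-9)    # True: compare with a tolerance instead
```

This is why floating-point comparisons should generally use a tolerance rather than exact equality.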
Overflow and Underflow in Floating Point
- Requires careful handling: overflow can produce infinities or program faults, and underflow can cause results to round toward zero, losing precision.
Character Representation in Computers
- ASCII is the most common character encoding scheme.
- Maps 128 characters to 7-bit binary.
- Supports numbers, letters, and punctuation.
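A quick check of the ASCII mapping using Python's built-in `ord` and `chr`:

```python
# Each ASCII character maps to a 7-bit code point; ord and chr convert both ways.
code = ord("A")
print(code, format(code, "07b"))   # 65 1000001
print(chr(code + 1))               # B
```

Consecutive letters have consecutive code points, which is why arithmetic on code points (as above) steps through the alphabet.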
Introduction to Unicode
- Unicode extends ASCII to cover the characters of global writing systems. Common encodings of Unicode include UTF-8 and UTF-16.
Conclusion
- Computers store all data in binary; hexadecimal provides a compact notation for it, two's complement represents signed integers, and floating-point represents real numbers. For character representation, ASCII and Unicode play critical roles.