Chapter 2 - Data Representation

Chapter 2 Objectives
  • Understand the fundamentals of numerical data representation and manipulation in computer systems.
  • Master the skill of converting between different numeric-radix systems.
  • Understand how errors can occur in computations because of overflow and truncation.
  • Understand the fundamental concepts of floating-point representation.
  • Gain familiarity with the most popular character codes.
Outline
  • Converting between different numeric-radix systems
  • Binary addition and subtraction
  • Two’s complement representation
  • Floating-point representation
  • Characters in computer systems

2.1 Introduction

  • Bit: The most basic unit of information in a computer, representing a state of either "on" or "off", "high" or "low" voltage in a digital circuit. In binary, it can be either "1" or "0".
  • Byte: A group of 8 bits.
    • The smallest addressable unit of storage in most computer systems.
  • Nibble: A group of 4 bits; a byte consists of 2 nibbles: the "high-order" nibble and the "low-order" nibble.
  • Word: A contiguous group of bytes (size can vary between computer systems, e.g., 2 bytes (16 bits), 4 bytes (32 bits), 8 bytes (64 bits)).
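  • Most Significant Bit (MSB) and Least Significant Bit (LSB): the MSB is the leftmost bit, which carries the highest power of 2; the LSB is the rightmost bit, which carries $2^0$.
    • For the binary representation 1011100110101011, the MSB is the leftmost 1 and the LSB is the rightmost 1.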
Byte or Word Addressable
  • A computer may be byte-addressable or word-addressable. A unit of storage is addressable if the CPU can retrieve it by its memory location (its address).
    • In a byte-addressable system, the smallest addressable unit is a byte.
    • In a word-addressable system, it is a word.

2.2 Positional Numbering Systems

  • Bytes store numbers using the position of each bit to represent a power of 2 (radix of 2). The binary system is a base-2 system, while the decimal system is the base-10 system (powers of 10).
  • The radix is written as a subscript whenever the base could be ambiguous (e.g., $947_{10}$ or $11001_2$).
Example of Base-10 Representation
  • Decimal Number: 947
    • Detailed Representation: $947 = 9 \times 10^2 + 4 \times 10^1 + 7 \times 10^0$
  • Decimal Number: $5836.47_{10}$
    • Detailed Representation: $5836.47_{10} = 5 \times 10^3 + 8 \times 10^2 + 3 \times 10^1 + 6 \times 10^0 + 4 \times 10^{-1} + 7 \times 10^{-2}$
Example of Base-2 Representation
  • Binary Number: 11001 (base-2)
    • Conversion to Decimal: $11001_2 = 1 \times 2^4 + 1 \times 2^3 + 0 \times 2^2 + 0 \times 2^1 + 1 \times 2^0 = 25_{10}$
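The positional expansion above can be sketched in a few lines of Python. The function name `digits_to_int` is an illustrative choice, not something defined in the chapter; it evaluates a digit sequence in any radix using Horner's rule, which is equivalent to summing digit times power.

```python
def digits_to_int(digits, radix):
    """Evaluate a sequence of digits (most significant first) in the given radix."""
    value = 0
    for d in digits:
        if not 0 <= d < radix:
            raise ValueError(f"digit {d} is not valid in base {radix}")
        # Horner's rule: shift everything one position left, then add the new digit.
        value = value * radix + d
    return value


print(digits_to_int([1, 1, 0, 0, 1], 2))  # 25
print(digits_to_int([9, 4, 7], 10))       # 947
```

Python's built-in `int("11001", 2)` performs the same conversion for string input.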
Practice Questions
  • Convert $(01111101)_2 = ?$
  • Convert $(123)_{10} = ?$
  • Convert $(123)_3 = ?$
Importance of Binary System
  • Binary numbers underpin all data representations in computer systems.
  • Proficiency with binary is essential for understanding computer operations and instruction sets.

2.3 Converting Between Bases

  • Conversion of base-10 to other radices: Two primary methods for conversion are:
    1. Subtraction Method
    2. Division Method
Subtraction Method (Decimal to Base-3 Example)
  • Convert $190_{10}$ to base-3:
    • Powers of 3: $3^0 = 1$, $3^1 = 3$, $3^2 = 9$, $3^3 = 27$, $3^4 = 81$, $3^5 = 243$
    • Step 1: For $3^4 = 81$
    • $190 / 81 = 2$ (coefficient for $3^4$)
    • Remainder: $190 - (2 \times 81) = 190 - 162 = 28$
    • Step 2: For $3^3 = 27$
    • $28 / 27 = 1$ (coefficient for $3^3$)
    • Remainder: $28 - (1 \times 27) = 28 - 27 = 1$
    • Step 3: For $3^2 = 9$
    • $1 / 9 = 0$ (coefficient for $3^2$)
    • Remainder: $1 - (0 \times 9) = 1$
    • Step 4: For $3^1 = 3$
    • $1 / 3 = 0$ (coefficient for $3^1$)
    • Remainder: $1 - (0 \times 3) = 1$
    • Step 5: For $3^0 = 1$
    • $1 / 1 = 1$ (coefficient for $3^0$)
    • Remainder: $1 - (1 \times 1) = 0$
    • Result: Reading coefficients from $3^4$ down to $3^0$ gives $21001_3$.
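The subtraction method can be expressed as a short routine. This is a minimal sketch; the name `to_base_subtraction` is hypothetical, and the function assumes a non-negative integer input and a radix of at least 2.

```python
def to_base_subtraction(n, radix):
    """Convert a non-negative integer to a list of digits (most significant first)."""
    if n == 0:
        return [0]
    # Find the largest power of the radix that does not exceed n.
    power = 1
    while power * radix <= n:
        power *= radix
    digits = []
    while power >= 1:
        coeff = n // power   # how many times this power fits into what is left
        digits.append(coeff)
        n -= coeff * power   # subtract it out and carry the remainder forward
        power //= radix
    return digits


print(to_base_subtraction(190, 3))  # [2, 1, 0, 0, 1]
```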
Division Method (Decimal to Base-3 Example)
  • Convert $190_{10}$ to base-3:
    • Divide 190 by 3 repeatedly until the quotient is 0, tracking remainders.
    • Example calculation:
    • $190 \div 3 = 63$ remainder 1
    • $63 \div 3 = 21$ remainder 0
    • $21 \div 3 = 7$ remainder 0
    • $7 \div 3 = 2$ remainder 1
    • $2 \div 3 = 0$ remainder 2
    • Result: Reading the remainders from bottom to top gives $190_{10} = 21001_3$
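The division method maps directly onto a loop around `divmod`. The sketch below uses an illustrative function name, `to_base_division`, and assumes a non-negative integer and a radix of at least 2.

```python
def to_base_division(n, radix):
    """Convert a non-negative integer to a list of digits (most significant first)."""
    if n == 0:
        return [0]
    remainders = []
    while n > 0:
        n, r = divmod(n, radix)  # quotient feeds the next step, remainder is a digit
        remainders.append(r)
    # Remainders come out least significant first, so read them "bottom to top".
    return remainders[::-1]


print(to_base_division(190, 3))  # [2, 1, 0, 0, 1]
```

Compared with the subtraction method, this version needs no table of powers, which is why it is usually the easier one to carry out by hand.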
Exercises
  • Convert $458_{10}$ to binary.
  • Convert $652_{10}$ to binary.
  • Use manual conversion, not a calculator.
Conversion of Fractional Numbers
  • Fractional values can also be represented in other bases, but a fraction that terminates in one base may repeat forever in another and must then be approximated. For example, 0.5 is exactly representable in binary and decimal, but not in base-3 (where it is the repeating fraction $0.111\ldots_3$).
    • Decimal Example: $0.47_{10} = 4 \times 10^{-1} + 7 \times 10^{-2}$
    • Binary Example: $0.11_2 = 1 \times 2^{-1} + 1 \times 2^{-2} = 0.5 + 0.25 = 0.75_{10}$
Methods for Fractional Conversion
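  1. Subtraction Method: Similar to the integer method, but using negative powers of the radix; start with $r^{-1}$ (the largest of them) and work toward smaller powers.
  2. Multiplication Method: Multiply the fraction by the radix. The integer part of the product is the next digit; the fractional part is carried into the next multiplication.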
Example of Fraction Conversion Using Subtraction Method
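  • Convert $0.8125_{10}$ to binary:
    • $2^{-1} = 0.5$ fits: digit 1, remainder $0.8125 - 0.5 = 0.3125$
    • $2^{-2} = 0.25$ fits: digit 1, remainder $0.3125 - 0.25 = 0.0625$
    • $2^{-3} = 0.125$ does not fit: digit 0, remainder $0.0625$
    • $2^{-4} = 0.0625$ fits: digit 1, remainder $0$
    • Result is: $0.1101_2$. Conversion stops when the remainder equals 0.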
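As a rough sketch of the subtraction method for fractions, the routine below subtracts successive negative powers of 2. The name `fraction_to_binary_subtraction` and the `max_bits` cutoff are illustrative assumptions; `fractions.Fraction` is used so that the arithmetic is exact rather than itself subject to floating-point rounding.

```python
from fractions import Fraction


def fraction_to_binary_subtraction(x, max_bits=16):
    """Return the binary fraction bits of 0 <= x < 1, most significant first."""
    remainder = Fraction(x)
    weight = Fraction(1, 2)  # start with 2^-1
    bits = []
    while remainder != 0 and len(bits) < max_bits:
        if weight <= remainder:
            bits.append(1)
            remainder -= weight  # this power fits, so subtract it
        else:
            bits.append(0)
        weight /= 2              # move on to the next smaller power
    return bits


print(fraction_to_binary_subtraction("0.8125"))  # [1, 1, 0, 1]
```

The bit list `[1, 1, 0, 1]` corresponds to $0.1101_2 = 0.5 + 0.25 + 0.0625 = 0.8125_{10}$.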
Example of Fraction Conversion Using Multiplication Method
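  • Convert $0.8125_{10}$ to binary:
    • Multiply by 2, record the integer part as the next digit, and continue multiplying the remaining fractional part until it reaches 0 or the desired precision is reached.
    • $0.8125 \times 2 = 1.625$: digit 1, carry $0.625$
    • $0.625 \times 2 = 1.25$: digit 1, carry $0.25$
    • $0.25 \times 2 = 0.5$: digit 0, carry $0.5$
    • $0.5 \times 2 = 1.0$: digit 1, carry $0$
    • Result: $0.1101_2$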
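The multiplication method generalizes to any radix. The following is a minimal sketch with hypothetical names (`fraction_to_base_multiplication`, `max_digits`); it uses exact rational arithmetic and stops either when the fraction is exhausted or when the digit limit is reached.

```python
from fractions import Fraction


def fraction_to_base_multiplication(x, radix=2, max_digits=16):
    """Return the fractional digits of 0 <= x < 1 in the given radix."""
    frac = Fraction(x)
    digits = []
    while frac != 0 and len(digits) < max_digits:
        frac *= radix
        digit = int(frac)  # the integer part of the product is the next digit
        digits.append(digit)
        frac -= digit      # keep only the fractional part for the next round
    return digits


print(fraction_to_base_multiplication("0.8125", 2))             # [1, 1, 0, 1]
print(fraction_to_base_multiplication("0.5", 3, max_digits=5))  # [1, 1, 1, 1, 1]
```

The second call illustrates a fraction that terminates in base 2 and base 10 but repeats indefinitely in base 3, so the digit limit is what ends the loop.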

2.4 Binary and Hexadecimal Number Representation

  • Binary to Hexadecimal Conversion: Binary strings (e.g., $11010100_2$) are often written as hexadecimal numbers for compactness.
    • Each hexadecimal digit corresponds to 4 binary digits (a nibble), so conversion in either direction can be done by grouping bits into nibbles.
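Nibble grouping can be sketched as follows. The helper names `binary_to_hex` and `hex_to_binary` are illustrative, and the input is assumed to be a string containing only valid digits. The example reuses the 16-bit pattern from the introduction.

```python
HEX_DIGITS = "0123456789ABCDEF"


def binary_to_hex(bits):
    """Convert a string of binary digits to hexadecimal, one nibble at a time."""
    # Pad on the left with zeros so the length is a multiple of 4.
    padded = bits.zfill((len(bits) + 3) // 4 * 4)
    nibbles = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    return "".join(HEX_DIGITS[int(nibble, 2)] for nibble in nibbles)


def hex_to_binary(hex_digits):
    """Expand each hexadecimal digit into its 4-bit pattern."""
    return "".join(format(int(h, 16), "04b") for h in hex_digits)


print(binary_to_hex("1011100110101011"))  # B9AB
print(hex_to_binary("B9AB"))              # 1011100110101011
```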
Exercises
  • Convert binary $11010100_2$ to hexadecimal.
  • Convert hexadecimal $1234_{16}$ to decimal.
Overflow and Underflow in Binary Systems
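  • Overflow occurs when a result exceeds the largest value representable in the available number of bits. Underflow occurs when a result is too small to be represented: below the most negative representable value for integers, or too close to zero for floating-point numbers.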
Two’s Complement Representation
  • The two’s complement system is used for signed integers, where the MSB indicates the sign (0 for positive, 1 for negative).
Conversion into Two's Complement
  • Positive number: Same as its binary representation.
  • Negative number: Write the binary representation of the magnitude, invert each bit, then add 1 to the result.
Example of Two's Complement Representation
  • To express decimal $-14$ in 8 bits:
    1. Binary of 14 is 00001110.
    2. Negate (invert bits) gives 11110001.
    3. Add 1 results in 11110010.
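The invert-and-add-one procedure can be sketched in Python. The function name `twos_complement` and the default width of 8 bits are assumptions made for illustration.

```python
def twos_complement(value, bits=8):
    """Return the two's complement bit pattern of a signed integer as a string."""
    lowest, highest = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lowest <= value <= highest:
        raise ValueError(f"{value} does not fit in {bits} bits")
    if value >= 0:
        return format(value, f"0{bits}b")  # positive: plain binary
    magnitude = format(-value, f"0{bits}b")
    inverted = "".join("1" if b == "0" else "0" for b in magnitude)  # flip every bit
    return format(int(inverted, 2) + 1, f"0{bits}b")                 # then add 1


print(twos_complement(14))   # 00001110
print(twos_complement(-14))  # 11110010
```

In practice the same pattern can be obtained with a mask, `format(value & 0xFF, "08b")`; the longer version is shown because it follows the hand method step by step.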
Addition and Subtraction in Two's Complement
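  • Add the two's complement numbers bit by bit and discard any carry out of the MSB. For subtraction, negate the subtrahend (take its two's complement) and add. Overflow has occurred if both operands have the same sign and the result has the opposite sign.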
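A fixed-width addition can be simulated with a bit mask. This is a sketch under the assumption that both operands already fit in the chosen width; `add_twos_complement` is a hypothetical helper that returns the wrapped result together with an overflow flag based on the sign rule (same-sign operands, opposite-sign result).

```python
def add_twos_complement(a, b, bits=8):
    """Add two signed integers in a fixed width; return (result, overflow)."""
    mask = (1 << bits) - 1
    raw = ((a & mask) + (b & mask)) & mask  # the mask discards the carry out of the MSB
    sign_bit = 1 << (bits - 1)
    # Reinterpret the bit pattern as a signed value.
    result = raw - (1 << bits) if raw & sign_bit else raw
    overflow = (a < 0) == (b < 0) and (result < 0) != (a < 0)
    return result, overflow


print(add_twos_complement(20, -14))   # (6, False)   20 - 14 as 20 + (-14)
print(add_twos_complement(100, 100))  # (-56, True)  200 does not fit in 8 bits
```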

2.5 Floating-Point Representation

  • Used for real numbers with fractional components. Floating-point consists of three parts: sign, exponent, and significand (also called mantissa).
IEEE-754 Standard
  • Floating-point arithmetic uses formats defined in IEEE-754, which standardizes representation:
    • Single precision: 1 bit sign, 8 bits exponent, 23 bits significand.
    • Double precision: 1 bit sign, 11 bits exponent, 52 bits significand.
Conversion to IEEE-754 format
  1. Normalize the number; e.g., decimal $-3.75$ is normalized to a binary format.
    • Example: Converting $-3.75_{10}$ to IEEE-754 Single-Precision
    1. Convert absolute value to binary: $3.75_{10} = 11.11_2$.
    2. Normalize: Move binary point: $11.11_2 = 1.111 \times 2^1$. (Exponent = 1)
    3. Determine Sign bit: Since the number is negative ($-3.75$), the sign bit is 1.
    4. Calculate Biased Exponent: For single precision, bias = 127. Biased exponent = $1 + 127 = 128_{10} = 10000000_2$.
    5. Determine Significand (Mantissa): From $1.111 \times 2^1$, the fractional part is 111; the leading 1 is implicit and is not stored. Pad with zeros to 23 bits: $11100000000000000000000_2$.
    6. Combine components: Sign (1) | Exponent (10000000) | Mantissa (11100000000000000000000).
      Result (32-bit): $11000000011100000000000000000000_2$.
  2. Convert the exponent and significand, considering bias in exponent field. (Bias = 127 for single).
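The worked example can be checked with Python's standard `struct` module, which packs a value into IEEE-754 single precision. The helper name `float32_fields` is illustrative; it only splits the 32-bit pattern into the three fields described above.

```python
import struct


def float32_fields(x):
    """Return (sign, exponent, fraction) bit strings of x in single precision."""
    # Pack as a big-endian 32-bit float, then reread the same bytes as an unsigned int.
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    bits = format(raw, "032b")
    return bits[0], bits[1:9], bits[9:]


sign, exponent, fraction = float32_fields(-3.75)
print(sign, exponent, fraction)  # 1 10000000 11100000000000000000000
print(int(exponent, 2) - 127)    # 1 (the unbiased exponent)
```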
Floating-Point Errors
  • Floating-point representations are approximations: most real numbers must be rounded to fit the available significand bits. These rounding errors can compound, so small discrepancies can grow into larger errors when operations are repeated.
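A small experiment shows the effect. The sketch assumes Python floats are IEEE-754 double precision, which is the case on common platforms. Because 0.1 has no exact binary representation, adding it ten times does not give exactly 1.0.

```python
import math

total = 0.0
for _ in range(10):
    total += 0.1  # each addition rounds, and the rounding errors accumulate

print(total)                     # 0.9999999999999999
print(total == 1.0)              # False
print(math.isclose(total, 1.0))  # True: compare with a tolerance instead
print(math.fsum([0.1] * 10))     # 1.0: fsum tracks the lost low-order bits
```

Comparing with a tolerance, or using a compensated summation such as `math.fsum`, are two common ways to limit the impact of accumulated rounding.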
Overflow and Underflow in Floating Point
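  • Overflow occurs when the magnitude of a result is too large for the exponent field; under IEEE-754 the result becomes positive or negative infinity, and in some environments it raises an error that can crash a program. Underflow occurs when a nonzero result is too close to zero to be represented as a normalized number; it is rounded to a subnormal value or to zero, losing precision. Both cases require careful handling to preserve precision and correctness in calculations.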
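Both conditions can be observed directly. The sketch below assumes CPython with IEEE-754 double-precision floats; the variable names are illustrative.

```python
import math
import sys

largest = sys.float_info.max          # largest finite double, about 1.8e308
print(largest * 2)                    # inf: arithmetic overflow produces infinity

try:
    math.exp(1000)                    # some library functions raise instead
except OverflowError as err:
    print("OverflowError:", err)

smallest_normal = sys.float_info.min  # smallest positive normalized double
subnormal = smallest_normal / 2       # still nonzero, but with fewer significant bits
print(subnormal > 0)                  # True
print(5e-324 / 2)                     # 0.0: below the smallest subnormal, so it becomes zero
```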
Character Representation in Computers
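  • ASCII is one of the most widely used character encoding schemes and the basis for many later ones.
    • Maps 128 characters to 7-bit codes (0 through 127).
    • Supports digits, upper- and lowercase letters, punctuation, and control characters.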
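Python's built-in `ord` and `chr` expose character codes directly, which makes the 7-bit mapping easy to inspect. The helper `ascii_bits` is a hypothetical name used only for this illustration.

```python
def ascii_bits(text):
    """Return the 7-bit ASCII pattern of each character in text."""
    codes = text.encode("ascii")  # raises UnicodeEncodeError for non-ASCII input
    return [format(code, "07b") for code in codes]


print(ord("A"), chr(97))  # 65 a
print(ascii_bits("A0"))   # ['1000001', '0110000']
```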
Introduction to Unicode
  • Unicode extends ASCII to cover the characters of writing systems worldwide; its first 128 code points are identical to ASCII. The most common encodings of Unicode are UTF-8 and UTF-16.
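The difference between the two encodings shows up in how many bytes each character takes. The sample string below is an arbitrary choice for illustration: the letter A, an accented e (U+00E9), and the euro sign (U+20AC).

```python
text = "A\u00e9\u20ac"

utf8 = text.encode("utf-8")       # variable width: 1, 2, and 3 bytes here
utf16 = text.encode("utf-16-be")  # 2 bytes for each of these characters

print(utf8.hex(" "))   # 41 c3 a9 e2 82 ac
print(utf16.hex(" "))  # 00 41 00 e9 20 ac
print(utf8[:1] == "A".encode("ascii"))  # True: ASCII text is already valid UTF-8
```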
Conclusion
  • Data in computers is stored in binary; hexadecimal is a compact notation for writing binary values. Signed integers are stored in two's complement and real numbers in floating-point formats. For character representation, ASCII and Unicode play critical roles.