Chapter 2 Data Representation Study Notes
Chapter 2: Data Representation
Introduction
This chapter explains how data is represented in memory, including various bit models, bit operators, and compound data types.
Topics Covered:
Bit models (magnitude-only, sign-magnitude, two’s complement, fixed-point, floating-point)
ASCII and Unicode representations
Bit operators for manipulating memory data
Compound data types such as Strings and Arrays (single and multi-dimensional)
Custom type definitions such as structs and unions
2.1 Number Representation and Bit Models
All data in a computer is represented numerically, regardless of its original form (numbers, characters, images, etc.).
Fundamental Concept: Data, in the digital world, is composed of ones and zeros (binary).
Examples of Binary Interpretation:
01000011 01001111 01010111 can represent:
Three separate integers: 67, 79, 87
A single 24-bit number: 4,411,223
A word: COW
A color (RGB): red 67, green 79, blue 87
Context is essential for interpretation.
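To make the point concrete, here is a minimal sketch that reads the same three bytes in the ways listed above. The names `SAMPLE`, `bytes_as_number`, and `bytes_as_text` are made up for this illustration, and the 24-bit reading assumes the first byte is the most significant one.

```c
#include <stdint.h>
#include <string.h>

/* The bit pattern 01000011 01001111 01010111, written in hexadecimal. */
const uint8_t SAMPLE[3] = {0x43, 0x4F, 0x57};

/* Read three bytes as one 24-bit magnitude. Treating the first byte as the
   most significant is an assumption made for this example. */
uint32_t bytes_as_number(const uint8_t b[3]) {
    return ((uint32_t)b[0] << 16) | ((uint32_t)b[1] << 8) | (uint32_t)b[2];
}

/* Read three bytes as ASCII text: copy them and append the terminating '\0'. */
void bytes_as_text(const uint8_t b[3], char out[4]) {
    memcpy(out, b, 3);
    out[3] = '\0';
}
```

With `SAMPLE` as input, `bytes_as_number` gives 4411223 and `bytes_as_text` fills the buffer with "COW", while the individual bytes are 67, 79, and 87. Nothing in the bits themselves says which reading is the intended one.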
Common Number Systems:
Binary: Base 2, digits 0 and 1 (e.g., 1001011)
Octal: Base 8, digits 0-7 (e.g., 113)
Decimal: Base 10, digits 0-9 (e.g., 75)
Hexadecimal: Base 16, digits 0-9 and A-F (e.g., 4B)
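Conversions Between Number Bases: All of these systems are positional, just like decimal. Each digit is multiplied by the base raised to the power of its position (counting from 0 at the rightmost digit), and the products are added together.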
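Going the other way, from a value to its digits in some base, can be done by repeated division. The sketch below is one possible way to do it in C; `to_base` is a hypothetical helper written for these notes, and it only handles bases 2 through 16.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Write `value` in the given base (2..16) into `out` as a NUL-terminated
   string. Returns the number of digits written, or 0 if the base is out of
   range or the buffer is too small. */
size_t to_base(unsigned value, int base, char *out, size_t out_size) {
    static const char DIGITS[] = "0123456789ABCDEF";
    char tmp[sizeof(unsigned) * CHAR_BIT + 1];  /* worst case: base 2 */
    size_t pos = sizeof tmp - 1;

    if (base < 2 || base > 16) return 0;

    /* Each remainder is the next digit, least significant first, so the
       temporary buffer is filled from the end toward the front. */
    tmp[pos] = '\0';
    do {
        tmp[--pos] = DIGITS[value % (unsigned)base];
        value /= (unsigned)base;
    } while (value != 0);

    size_t len = sizeof tmp - pos;              /* digits plus terminator */
    if (len > out_size) return 0;
    memcpy(out, tmp + pos, len);
    return len - 1;
}
```

Called with the value 75, this produces "1001011" for base 2, "113" for base 8, "75" for base 10, and "4B" for base 16, matching the examples in the list of number systems.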
2.2 Base Calculation Example
The digit string 1001101 denotes a different value depending on the base in which it is read:

| Base | Calculation | Decimal Value |
|------|-------------|---------------|
| 2 | $(1 \times 2^0) + (0 \times 2^1) + (1 \times 2^2) + (1 \times 2^3) + (0 \times 2^4) + (0 \times 2^5) + (1 \times 2^6)$ | 77 |
| 8 | $(1 \times 8^0) + (0 \times 8^1) + (1 \times 8^2) + (1 \times 8^3) + (0 \times 8^4) + (0 \times 8^5) + (1 \times 8^6)$ | 262,721 |
| 10 | $(1 \times 10^0) + (0 \times 10^1) + (1 \times 10^2) + (1 \times 10^3) + (0 \times 10^4) + (0 \times 10^5) + (1 \times 10^6)$ | 1,001,101 |
| 16 | $(1 \times 16^0) + (0 \times 16^1) + (1 \times 16^2) + (1 \times 16^3) + (0 \times 16^4) + (0 \times 16^5) + (1 \times 16^6)$ | 16,781,569 |
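The calculations above all follow one pattern, which can be sketched as a small C function. `value_in_base` and `digit_value` are hypothetical names, not library functions. The sketch uses Horner's rule (multiply the running total by the base, then add the next digit), which gives the same result as summing each digit times the base raised to its position. It assumes bases 2 through 16 and does not guard against results too large for a `long`.

```c
#include <ctype.h>
#include <stddef.h>

/* Value of one digit character, or -1 if it is not a hexadecimal digit. */
static int digit_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    c = (char)toupper((unsigned char)c);
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Interpret `digits` as a number written in `base` (2..16).
   Returns the value, or -1 if a digit is not valid in that base. */
long value_in_base(const char *digits, int base) {
    long value = 0;
    for (size_t i = 0; digits[i] != '\0'; i++) {
        int d = digit_value(digits[i]);
        if (d < 0 || d >= base) return -1;
        value = value * base + d;  /* shift left one position, add the digit */
    }
    return value;
}
```

Reading "1001101" in bases 2, 8, 10, and 16 gives 77, 262721, 1001101, and 16781569, the four values computed by hand above.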
2.3 Bit Models
Bit Definition: A bit is the smallest unit of data in a computer, representing either a 1 or a 0.
Nibble & Byte:
Nibble: Group of four bits
Byte: Group of eight bits (two nibbles)
Word Sizes: Bytes are grouped into words; common word sizes are 16, 32, or 64 bits (2, 4, or 8 bytes).
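As a small illustration of how these units nest, the sketch below splits a byte into its two nibbles and joins two bytes into a 16-bit word. The function names are invented for this example, and placing the first byte in the upper half of the word is an arbitrary choice.

```c
#include <stdint.h>

/* Upper four bits of a byte, moved down into the low positions. */
uint8_t high_nibble(uint8_t byte) {
    return (uint8_t)(byte >> 4);
}

/* Lower four bits of a byte. */
uint8_t low_nibble(uint8_t byte) {
    return (uint8_t)(byte & 0x0F);
}

/* Join two bytes into one 16-bit word, with `high` in the upper eight bits. */
uint16_t make_word16(uint8_t high, uint8_t low) {
    return (uint16_t)(((uint16_t)high << 8) | low);
}
```

For the byte 0x4B the nibbles are 0x4 and 0xB, and joining the bytes 0x43 and 0x4F gives the word 0x434F.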
Bit Models: There are several methods to interpret sequences of bits (bit models). Here are six key models:
Magnitude-only Bit Model: Non-negative integers.
Sign-Magnitude Bit Model: Allows for negative numbers, using the most significant bit as a sign bit.
Two’s Complement Bit Model: Representation for both positive and negative whole numbers.
Fixed-Point Bit Model: Useful in specific applications but not commonly used in C.
Floating-Point Bit Model: Advanced representation for real numbers.
ASCII and Unicode Bit Model: Character representation.
2.4 Magnitude-only Bit Model
Overview: Designed for non-negative integers.
Range: Given an 8-bit value, the range is 0 to 255:
00000000 = 0
11111111 = 255
Binary addition is performed from least significant bit (LSB) to most significant bit (MSB).
Example: Adding two binary numbers (10 and 7):
  00001010
+ 00000111
----------
  00010001 (17 in decimal)
2.5 Sign-Magnitude Bit Model
Definition: Represents positive and negative numbers.
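Range: Uses the most significant bit for the sign and the remaining bits for the magnitude, so an 8-bit value ranges from -127 to +127.
Issues: There are two representations of zero (+0 and -0), and ordinary binary addition gives wrong results when the operands have different signs, so arithmetic needs special handling.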
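Both issues can be seen in a short sketch of the 8-bit case. The helpers `sm_decode`, `sm_encode`, and `raw_add` are hypothetical names used only here. `raw_add` adds the bit patterns the way the magnitude-only model would, without regard to the sign bit.

```c
#include <stdint.h>

/* Decode an 8-bit sign-magnitude pattern: bit 7 is the sign,
   bits 6..0 hold the magnitude. */
int sm_decode(uint8_t bits) {
    int magnitude = bits & 0x7F;
    return (bits & 0x80) ? -magnitude : magnitude;
}

/* Encode a value in the range -127..+127; callers must stay in range. */
uint8_t sm_encode(int value) {
    if (value < 0) return (uint8_t)(0x80 | (unsigned)(-value));
    return (uint8_t)value;
}

/* Plain binary addition of two patterns, ignoring what they mean. */
uint8_t raw_add(uint8_t a, uint8_t b) {
    return (uint8_t)(a + b);
}
```

The patterns 00000000 and 10000000 both decode to 0, and adding the patterns for +10 and -10 with `raw_add` decodes to -20 rather than 0.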
2.6 Two’s Complement Bit Model
Definition: A popular representation for signed integers in C.
Conversion Process: To obtain the representation of a negative number, start from the bits of its positive magnitude, invert every bit, and add 1.
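The invert-and-add-1 rule can be sketched for 8-bit values as follows. The function names are illustrative and not part of any library. `twos_decode` reads a pattern back as a signed value so that results can be checked.

```c
#include <stdint.h>

/* Two's complement negation of an 8-bit pattern: invert every bit, add 1.
   The cast back to uint8_t keeps only the low eight bits. */
uint8_t twos_negate(uint8_t bits) {
    return (uint8_t)((uint8_t)~bits + 1u);
}

/* Interpret an 8-bit pattern as a two's complement value (-128..+127):
   patterns with the top bit set stand for the value minus 256. */
int twos_decode(uint8_t bits) {
    return (bits & 0x80) ? (int)bits - 256 : (int)bits;
}
```

`twos_negate(10)` yields 11110110 (0xF6) and `twos_negate(7)` yields 11111001 (0xF9), which decode to -10 and -7. Adding those two patterns and keeping the low eight bits gives 11101111, which decodes to -17, so ordinary binary addition works unchanged in this model.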
Addition Example:
Example of (-10 + -7):
```plaintext
   11110110 (negative 10)
+  11111001 (negative 7)
-----------
 1 11101111 (negative 17; the ninth bit carried out of the MSB is discarded)
```

Overflow occurs when the true result does not fit in the available bits. It shows up as an unexpected sign bit, for example when adding two positive numbers produces a negative result.

- **Range**: Signed integers range from -128 to +127 for 8 bits.

## 2.7 Recognizing Bit Models in C

### Unsigned and Signed Types

- **Unsigned Types**: Use the magnitude-only model.
- **Signed Types**: Use two's complement on essentially all current platforms, which is what the ranges below assume. Older C standards also permitted sign-magnitude, but it is rare in practice.

| Data Type | Bits Used | Bytes Used | Number Range |
|-----------|-----------|------------|--------------|
| unsigned char | 8 | 1 | 0 to 255 |
| signed char | 8 | 1 | -128 to +127 |
| unsigned int | 32 | 4 | 0 to 4,294,967,295 |
| signed int | 32 | 4 | -2,147,483,648 to +2,147,483,647 |

These sizes are typical of current desktop platforms; the C standard only guarantees minimum ranges.

## 2.8 Fixed-Point Bit Model

- **Use**: Represents real numbers by placing the binary point at a fixed, agreed position within the bits. It covers a smaller range than floating point and is generally less precise.
- **Example**:
  - The bits 11010101 read as 11010.101 represent 26.625. Moving the binary point changes the value of the same bits: 110.10101 represents 6.65625.

## 2.9 Floating-Point Bit Model

- **Purpose**: Flexible representation of real numbers, often used in computations that need a large range of values.
- **Structure of Floating Point**: Divided into three components: sign, exponent, and mantissa.
- **C data types** (with the usual IEEE 754 layout):
  - `float`: 32 bits (1 sign, 8 exponent, 23 mantissa)
  - `double`: 64 bits (1 sign, 11 exponent, 52 mantissa)

## 2.10 ASCII and Unicode Bit Models

- **ASCII**: A 7-bit code (values 0-127) for printable and control characters, normally stored one byte per character. Values 128-255 belong to extended character sets, not to ASCII itself.
- **Unicode**: Covers the characters and symbols of most writing systems. A character occupies up to 4 bytes, depending on the encoding (UTF-8, UTF-16, or UTF-32). In C, 16-bit code units are often stored in `unsigned short`, although `wchar_t`, `char16_t`, and `char32_t` are the types intended for this purpose.

## 2.11 Bitwise Operations

- **Bitwise Operators**:
  - `~`: Bitwise NOT (inverts every bit)
  - `&`: Bitwise AND (a result bit is 1 only where both operands have a 1)
  - `|`: Bitwise OR (a result bit is 1 where either operand has a 1)
  - `^`: Bitwise XOR (a result bit is 1 where the operand bits differ)
  - `<<`: Left shift (shifts bits left, filling with zeros)
  - `>>`: Right shift (shifts bits right)

## 2.12 Compound Data Types

- **Definition**: A data type that contains multiple values under a single name.
- **Examples**:
  - Strings as character arrays or pointers
  - Arrays of elements (single and multi-dimensional)

## 2.13 Strings and Character Arrays

- **Definition**: Strings in C are sequences of characters ending with a null character `\0`.
- **Example**:

```c
char myString[] = "Hello";
```

- **Caution**: No implicit bounds-checking; array overflows can occur.

## 2.14 Arrays

- **Understanding Arrays**: Fixed-size storage for multiple values of the same type; indexing starts at zero.
- **Multi-dimensional arrays**: Defined by multiple sets of indices, e.g., `int array[3][4];`.
- **Memory Size Calculations**: Use `sizeof` to determine the total bytes used.
  - Example: `printf("Size of int: %zu\n", sizeof(int));`

## 2.15 Custom Type Definitions: Structures and Unions

- **Structures**: Group multiple variables under one type, akin to classes in Java (without methods).
  - Example struct:

```c
struct Address {
    char *street;
    char *city;
};
```

- **Unions**: Allow multiple data types to share a single memory location. This saves memory, but writing one member overwrites the others.
  - Example:

```c
union Data {
    int i;
    float f;
};
```

Conclusion
Understanding data representation is essential for choosing and using types correctly in C. This chapter has covered the fundamental bit models, the bitwise operators that manipulate them, and the compound and custom types built on top of them.