Computer Information Processing and Data Representation

Introduction to Computer Information Processing

A computer is a device for processing information: it takes data in one form and converts it into another through a sequence of well-defined operations.

Data Representation

  • Modern computers use binary (0s and 1s) to represent data because it simplifies logic processing with basic operations such as AND, OR, and NOT. Unlike analogue signals, a binary signal only needs to be distinguished as high or low, so small amounts of noise do not corrupt the data.

Number Systems

  • Numbers can be represented in various formats: tally marks, Roman numerals, Arabic numerals, and binary. All represent the same fundamental concept.
  • Decimal counting is based on powers of 10, which aligns with human counting systems (e.g., ten fingers). In contrast, computers count using binary, based on powers of 2.
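The place-value idea above is the same in any base; only the powers change. A minimal sketch (the helper name `place_values` is an illustrative assumption, not from the source) that expands a number into its digits in a chosen base:

```python
def place_values(n: int, base: int) -> list[int]:
    """Return the digits of n in the given base, least-significant first."""
    digits = []
    while n:
        digits.append(n % base)   # remainder is the digit at this place
        n //= base                # move to the next power of the base
    return digits or [0]

# 347 in decimal: 7 units, 4 tens, 3 hundreds (powers of 10)
print(place_values(347, 10))   # [7, 4, 3]
# The same value in binary: 347 = 256 + 64 + 16 + 8 + 2 + 1
print(place_values(347, 2))    # [1, 1, 0, 1, 1, 0, 1, 0, 1]
```

The same quantity is written differently in each base, but both expansions sum back to 347.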

Binary Counting

  • Binary counting operates similarly to decimal but relies on base 2 (units, 2s, 4s, 8s, etc.). Conversion between binary and decimal involves summing the powers of 2 represented by the binary digits.
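The conversion described above can be sketched in a few lines; this is one straightforward way to do it by hand, not the only method:

```python
def binary_to_decimal(bits: str) -> int:
    """Sum the powers of 2 indicated by the binary digits (MSB first)."""
    total = 0
    for bit in bits:
        total = total * 2 + int(bit)   # shift left, then add the new bit
    return total

def decimal_to_binary(n: int) -> str:
    """Repeatedly divide by 2, collecting remainders as binary digits."""
    if n == 0:
        return "0"
    bits = ""
    while n:
        bits = str(n % 2) + bits       # prepend the remainder
        n //= 2
    return bits

print(binary_to_decimal("1011"))   # 8 + 0 + 2 + 1 = 11
print(decimal_to_binary(11))       # "1011"
```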

Encoding Text

  • Computers need to translate text into bits for processing. While the method of encoding can vary, consistency is essential for software and hardware compatibility. Historical character encoding systems include Baudot code, EBCDIC, and ASCII, leading to the development of Unicode.

Character Encoding Evolution

  • Baudot code (5-bit) provided a base for early teleprinters but lacked flexibility. ASCII emerged in the 1960s as a standardized 7-bit encoding defining 128 characters (codes 0–127).
  • Unicode was introduced to accommodate a far wider array of characters. It began as a 16-bit design but was extended: its code space now runs from 0 to U+10FFFF (requiring 21 bits), and code points are commonly stored in units of up to 32 bits (as in UTF-32) to encompass character sets globally.
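The growth beyond 16 bits is easy to see in practice. A small sketch using Python's built-in `ord`, which returns a character's Unicode code point (the specific sample characters are illustrative choices):

```python
# Each character maps to a single Unicode code point in 0..0x10FFFF.
for ch in ["A", "é", "€", "𝄞"]:
    cp = ord(ch)
    # Characters above U+FFFF would not fit in the original 16-bit design.
    print(f"{ch!r} -> U+{cp:04X}, beyond 16 bits: {cp > 0xFFFF}")
```

The musical symbol 𝄞 (U+1D11E) lies outside the 16-bit range, which is why the code space had to be extended.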

UTF-8 Encoding

  • UTF-8 is a variable-length encoding that uses 1 to 4 bytes per code point and is backward compatible with ASCII. It is self-synchronizing: the bit patterns of lead bytes and continuation bytes differ, so a decoder can find character boundaries starting from any byte. This makes it more storage-efficient than fixed-length encodings for common text while still representing the full range of Unicode characters.
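The variable-length behavior can be observed directly with Python's standard `str.encode`; the sample characters are illustrative:

```python
# UTF-8 spends 1 byte on ASCII and up to 4 bytes on higher code points.
for ch in ["A", "é", "€", "𝄞"]:
    encoded = ch.encode("utf-8")
    print(f"{ch!r}: {len(encoded)} byte(s), hex {encoded.hex()}")
```

An ASCII letter like "A" stays a single byte (the same byte ASCII would use), while characters with higher code points grow to 2, 3, or 4 bytes; this is the storage trade-off mentioned above.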