Character Sets: ASCII and Unicode
Character Sets Overview
- A character set is the complete list of binary codes a computer can recognise & process.
- Each character (letter, digit, symbol, control code) is mapped to a unique numeric value that is stored as binary.
ASCII
- Uses 7 bits per character ⇒ 128 possible codes (0–127).
- Key groupings (sequential):
• Capital letters A–Z: 65–90
• Lowercase a–z: 97–122
• Digits 0–9: 48–57
- Example mappings:
• A = 65 = 1000001₂
• G = 71 = 01000111₂
• * = 42 = 00101010₂
- Codes are consecutive, so knowing one lets you calculate others (e.g. E = 65 + 4 = 69).
- Limitations: only 128 characters; no support for accented letters, non-Latin scripts, emoji.
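The ASCII mappings above can be checked with a short sketch using only Python built-ins (`ord` and `chr` convert between characters and their codes):

```python
# Each character maps to a 7-bit code; ord() and chr() convert both ways.
print(ord("A"))                 # 65
print(format(ord("A"), "07b"))  # 1000001 (7-bit binary)

# Codes run in sequence, so one known code lets you derive others:
# E is 4 letters after A, so E = 65 + 4 = 69.
print(chr(ord("A") + 4))        # E

# Lowercase letters sit 32 places above their capitals (97 - 65 = 32).
print(ord("a") - ord("A"))      # 32
```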
Unicode
- A universal character set covering virtually every written language & symbol.
- Original size: 16 bits (the Basic Multilingual Plane, BMP) ⇒ 65,536 codes. Modern UTF encodings (UTF-8, UTF-16, UTF-32) cover the full Unicode range of 1,114,112 code points — over a million characters.
- First 128 codes mirror ASCII for compatibility.
- Supports scripts such as Greek, Mandarin, Japanese, plus emoji & technical symbols.
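These Unicode properties can be demonstrated directly, since Python strings are Unicode natively (a minimal sketch; the specific sample characters are arbitrary choices):

```python
# The first 128 code points mirror ASCII exactly.
print(ord("A"))      # 65 — same value as in 7-bit ASCII

# Characters beyond ASCII's 128-code limit:
print(ord("Ω"))      # 937   (Greek capital omega)
print(ord("中"))     # 20013 (a Chinese character)
print(ord("😀"))     # 128512 — above 65535, so outside the 16-bit BMP

# chr() maps a code point back to its character.
print(chr(128512))   # 😀
```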
Advantages of Unicode over ASCII
- Far larger code space; accommodates multilingual text and pictographs.
- Consistent cross-platform standard.
- Essential for devices/apps that handle diverse languages or emoji (e.g., smartphones).
Exam Quick-Recall Points
- Character set = set of binary codes understood by hardware & software.
- ASCII: 7 bits, 128 characters; understand control vs printable ranges.
- Unicode: 16+ bits per character, over a million code points; required for global language support.
- Character codes are grouped & run in sequence—use this to convert or infer values.
- Practise conversions: denary ↔ binary for ASCII codes.
- Be ready to justify why Unicode is chosen over ASCII when additional symbols/languages are needed.