CS-150 – Data Representation: Vocabulary Flashcards
Analog vs Digital
Analog = continuous, Digital = discrete
Digitise: convert analog signal to binary
Discretise: map continuous space to finite set
Binary Basics
Bit: 0/1; Byte = 8 bits; Word = machine’s native unit of data (a multiple of bytes, e.g. 32 or 64 bits)
Capacity: n bits represent 2^n values
Required bits: ⌈log2 N⌉ for N symbols
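A minimal sketch of the two bit-counting rules: n bits give 2^n distinct values, and N symbols need ⌈log2 N⌉ bits. The function names `capacity` and `bits_required` are illustrative, not taken from the course material.

```python
def capacity(n_bits: int) -> int:
    """Number of distinct values that n_bits can represent (2**n_bits)."""
    return 2 ** n_bits


def bits_required(n_symbols: int) -> int:
    """Smallest number of bits that can label n_symbols distinct symbols."""
    if n_symbols < 1:
        raise ValueError("need at least one symbol")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1
    # and avoids floating-point rounding problems.
    return (n_symbols - 1).bit_length()


print(capacity(8))        # 256
print(bits_required(26))  # 5 bits cover the 26 letters (2**5 = 32)
```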
Integer Representation
Unsigned: direct binary
Sign-magnitude: MSB = sign; two zeros ⇒ rarely used
Two’s complement (standard)
Range: −2^(n−1) … 2^(n−1) − 1 for n bits
Negate: invert bits + 1
Addition/subtraction as normal; overflow when result outside range
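The two's-complement rules can be sketched as follows, assuming an 8-bit width by default. The helper names (`to_twos_complement`, `negate`, `add`) are hypothetical and chosen only for this illustration.

```python
def to_twos_complement(value: int, bits: int = 8) -> str:
    """Return the two's-complement bit pattern of value as a string."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lo <= value <= hi:
        raise OverflowError(f"{value} does not fit in {bits} bits")
    return format(value & ((1 << bits) - 1), f"0{bits}b")


def negate(pattern: str) -> str:
    """Negate a bit pattern: invert every bit, then add 1."""
    bits = len(pattern)
    mask = (1 << bits) - 1
    inverted = ~int(pattern, 2) & mask
    return format((inverted + 1) & mask, f"0{bits}b")


def add(a: int, b: int, bits: int = 8) -> tuple:
    """Add as fixed-width hardware would; return (wrapped result, overflow flag)."""
    raw = (a + b) & ((1 << bits) - 1)
    # Reinterpret the raw pattern as signed: a set MSB means negative.
    result = raw - (1 << bits) if raw >> (bits - 1) else raw
    return result, result != a + b


print(to_twos_complement(-5))  # 11111011
print(negate("00000101"))      # 11111011
print(add(100, 100))           # (-56, True): 200 is outside -128 … 127
```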
Fixed-Size Int Ranges (signed / unsigned)
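8-bit: −128 … 127 / 0 … 255
16-bit: −32,768 … 32,767 / 0 … 65,535
32-bit: −2,147,483,648 … 2,147,483,647 / 0 … 4,294,967,295
64-bit: −9,223,372,036,854,775,808 … 9,223,372,036,854,775,807 / 0 … 18,446,744,073,709,551,615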
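The limits for each size follow directly from the bit count, so they need not be memorised. A short sketch, where `int_ranges` is an illustrative name:

```python
def int_ranges(bits: int) -> tuple:
    """Return ((signed_min, signed_max), (unsigned_min, unsigned_max))."""
    signed = (-(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    unsigned = (0, 2 ** bits - 1)
    return signed, unsigned


for bits in (8, 16, 32, 64):
    signed, unsigned = int_ranges(bits)
    print(f"{bits}-bit: {signed[0]:,} … {signed[1]:,} / 0 … {unsigned[1]:,}")
```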
Real Numbers & Floating Point
Binary fractions use positions 2^−1, 2^−2, 2^−3, … (1/2, 1/4, 1/8, …)
Floating-point form: sign × mantissa × 2^exponent
IEEE 754
Single (float): 1 sign, 8 biased exponent (bias 127), 23 fraction bits
Double: 1 sign, 11 exponent (bias 1023), 52 fraction bits
Fixed-point: fixed number of digits right of the radix point; used in accounting
Conversion algorithms: repeated divide (integer part) / multiply (fractional part)
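A sketch of two ideas from this section, under stated assumptions. `to_binary` applies repeated division by 2 to the integer part and repeated multiplication by 2 to the fractional part; the `max_frac_bits` cut-off is an assumption added because many fractions (such as 0.1) never terminate in binary. `float32_fields` uses Python's standard `struct` module to split an IEEE 754 single into its sign, biased exponent, and fraction fields. Both function names are illustrative.

```python
import struct


def to_binary(value: float, max_frac_bits: int = 12) -> str:
    """Convert a non-negative decimal number to a binary string."""
    if value < 0:
        raise ValueError("sketch handles non-negative values only")
    whole = int(value)
    frac = value - whole

    int_digits = ""
    while whole > 0:
        whole, remainder = divmod(whole, 2)   # repeated divide
        int_digits = str(remainder) + int_digits

    frac_digits = ""
    while frac > 0 and len(frac_digits) < max_frac_bits:
        frac *= 2                             # repeated multiply
        bit = int(frac)
        frac_digits += str(bit)
        frac -= bit

    return (int_digits or "0") + ("." + frac_digits if frac_digits else "")


def float32_fields(x: float) -> tuple:
    """Return (sign, biased exponent, fraction) of x stored as an IEEE 754 single."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    return raw >> 31, (raw >> 23) & 0xFF, raw & 0x7FFFFF


print(to_binary(10.625))     # 1010.101
print(float32_fields(-6.5))  # (1, 129, 5242880): exponent 129 - 127 = 2
```

Here −6.5 is −1.101 × 2^2 in binary, so the stored exponent is 2 + 127 = 129 and the fraction field holds the bits 101 followed by zeros.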
Scientific Notation
Keeps one non-zero digit left of point; e.g. 12345 = 1.2345 × 10^4
Text Encoding
Map characters → bit patterns
Standards: ASCII (7-bit), EBCDIC, ISO-8859-1, Unicode (UTF-8/16/32, >143,000 chars)
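To see the character → bit pattern mapping concretely, Python's built-in `str.encode` can be used. `bit_pattern` is a hypothetical helper name; the encodings shown are standard codec names.

```python
def bit_pattern(text: str, encoding: str = "utf-8") -> str:
    """Return the encoded bytes of text as space-separated 8-bit groups."""
    return " ".join(f"{byte:08b}" for byte in text.encode(encoding))


print(bit_pattern("A"))             # 01000001 (code 65, same in ASCII and UTF-8)
print(bit_pattern("é"))             # 11000011 10101001 (two bytes in UTF-8)
print(bit_pattern("é", "latin-1"))  # 11101001 (one byte in ISO-8859-1)
```

The same character can occupy a different number of bytes depending on the standard, which is why the encoding must be known to decode text correctly.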
Colour Representation
RGB triplet, usually 8 bits/channel (24-bit colour)
Example brown: (150,75,0) → #964B00
Other models: HSV, CMY, CMYK
Colour depth = bits used per pixel (24 for 8 bits/channel RGB); sometimes quoted per channel
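The hex form of an RGB triplet is just each 8-bit channel written as two hexadecimal digits. A small sketch (function names are illustrative) that reproduces the brown example:

```python
def rgb_to_hex(r: int, g: int, b: int) -> str:
    """Format an 8-bit-per-channel RGB triplet as #RRGGBB."""
    for channel in (r, g, b):
        if not 0 <= channel <= 255:
            raise ValueError("each channel must be in 0 … 255")
    return f"#{r:02X}{g:02X}{b:02X}"


def hex_to_rgb(code: str) -> tuple:
    """Parse #RRGGBB back into an (r, g, b) triplet."""
    code = code.lstrip("#")
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))


print(rgb_to_hex(150, 75, 0))  # #964B00
print(hex_to_rgb("#964B00"))   # (150, 75, 0)
```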
Images
Pixel = coloured dot; Resolution = pixel count
Raster (BMP, GIF, JPEG, PNG, TIFF): store every pixel; scaling ⇒ pixelation
Vector (SVG): store shapes; scale cleanly; small for line art
JPEG: lossy; transform to frequency domain & discard high-frequency components
Audio
Sound digitised by PCM: sample & quantise
Nyquist: sample rate ≥ 2 × highest frequency; CD: 44,100 Hz, 16-bit stereo
Formats: Uncompressed (WAV, AIFF), Lossless (FLAC, ALAC), Lossy (MP3, AAC, Ogg)
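A rough sketch of PCM, assuming a pure sine tone as the analog input: `pcm_samples` measures the signal at regular intervals (sampling) and rounds each measurement to a signed integer of the chosen bit depth (quantising). `pcm_bit_rate` gives the resulting uncompressed data rate. Both names are hypothetical.

```python
import math


def pcm_samples(freq_hz: float, sample_rate: int, n_samples: int,
                bit_depth: int = 16) -> list:
    """Sample a full-scale sine tone and quantise it to signed integers."""
    max_amp = 2 ** (bit_depth - 1) - 1  # 32767 for 16-bit audio
    return [
        round(max_amp * math.sin(2 * math.pi * freq_hz * n / sample_rate))
        for n in range(n_samples)
    ]


def pcm_bit_rate(sample_rate: int, bit_depth: int, channels: int) -> int:
    """Uncompressed PCM data rate in bits per second."""
    return sample_rate * bit_depth * channels


print(pcm_samples(11025, 44100, 4))  # [0, 32767, 0, -32767]
print(pcm_bit_rate(44100, 16, 2))    # 1411200 bits/s for CD audio
```

A 44,100 Hz rate satisfies the Nyquist condition for frequencies up to 22,050 Hz, which covers the roughly 20,000 Hz upper limit of human hearing.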
Video
Codec = encoder/decoder, usually lossy, uses temporal + spatial compression
Formats: MPEG-2, MPEG-4, AVI, WebM, Matroska
Data Compression Fundamentals
Compression ratio = compressed size / original size (smaller = better)
Lossless vs Lossy (text → lossless; media often lossy)
Lossless Techniques
Run-length: replace a run of repeated symbols with the symbol and a count, e.g. AAAAAAA → A7
Keyword: replace frequent words with single tokens
Huffman coding: variable-length prefix codes built from symbol frequencies; optimal among prefix codes
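A compact sketch of two of the lossless techniques, using only the Python standard library. `rle_encode` stores each run as a (symbol, run length) pair, which is one of several possible run-length output formats. `huffman_codes` builds prefix codes by repeatedly merging the two least frequent groups; the 0/1 assignment on each merge is arbitrary, so only the code lengths are significant. Function names are illustrative.

```python
import heapq
from collections import Counter
from itertools import groupby


def rle_encode(text: str) -> list:
    """Run-length encode text as a list of (symbol, run length) pairs."""
    return [(symbol, len(list(run))) for symbol, run in groupby(text)]


def huffman_codes(text: str) -> dict:
    """Build a Huffman code table {symbol: bit string} from symbol frequencies."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:  # a single symbol still needs one bit
        return {symbol: "0" for symbol in freq}

    # Heap entries: (weight, tie-breaker, {symbol: code built so far}).
    # The unique tie-breaker keeps Python from ever comparing the dicts.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)

    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)   # two least frequent groups
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1

    return heap[0][2]


print(rle_encode("AAAABBBCCD"))  # [('A', 4), ('B', 3), ('C', 2), ('D', 1)]
print(huffman_codes("aaaabbc"))  # 'a' gets a 1-bit code, 'b' and 'c' 2-bit codes
```

Frequent symbols end up with shorter codes, and because no code is a prefix of another, the encoded bit stream can be decoded without separators.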
Storage Example
24-bit RGB image 1920×1080: uncompressed ≈ 6.2 MB (1920 × 1080 × 3 bytes = 6,220,800 bytes) → compression essential
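The storage figure is simple arithmetic: width × height × bytes per pixel. A short sketch, with `raw_image_bytes` as an illustrative name:

```python
def raw_image_bytes(width: int, height: int, bits_per_pixel: int = 24) -> int:
    """Uncompressed size in bytes of a raster image (rounded up to whole bytes)."""
    return (width * height * bits_per_pixel + 7) // 8


size = raw_image_bytes(1920, 1080)
print(size)                        # 6220800 bytes
print(f"{size / 1_000_000:.1f}")   # 6.2 (megabytes)
```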