# Data

### Storage Units and Binary Numbers

Binary Basics

• Binary is a base-2 numbering system used in computers, where each digit can be either 0 or 1.

• Binary is ideal for representing electronic devices' on/off states.

• Binary digits are called "bits," and they form the foundation of digital data.

Bytes and Storage Units

• A byte is composed of 8 bits, and it's the basic unit for data storage and processing in computers.

• 1 Kilobyte (KB) is 1024 bytes in the binary convention used here (the SI kilobyte is exactly 1000 bytes). Kilobytes are commonly used to measure small data files.

• 1 Megabyte (MB) is 1024 KB, or 1,048,576 bytes.

• 1 Gigabyte (GB) is 1024 MB, or 1,073,741,824 bytes.

• 1 Terabyte (TB) is 1024 GB, or 1,099,511,627,776 bytes.
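The unit sizes above all follow from repeated multiplication by 1024. A minimal sketch, assuming the binary convention from this list; the names `UNITS`, `bytes_in` and `human_readable` are made up for illustration:

```python
# Binary-convention storage units, each 1024 times larger than the previous one.
UNITS = ["bytes", "KB", "MB", "GB", "TB"]


def bytes_in(unit: str) -> int:
    """Return the number of bytes in one of the given unit (1 KB = 1024 bytes)."""
    return 1024 ** UNITS.index(unit)


def human_readable(n_bytes: int) -> str:
    """Express a byte count in the largest unit that keeps the value below 1024."""
    value = float(n_bytes)
    for unit in UNITS:
        if value < 1024 or unit == UNITS[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024


print(bytes_in("MB"))             # 1048576
print(human_readable(5_000_000))  # 4.77 MB
```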

Binary Numbers

• Understanding binary arithmetic is essential. Binary numbers are based on powers of 2.

• For example, 2^0 (2 to the power of 0) is 1, 2^1 is 2, 2^2 is 4, and 2^3 is 8.

• These power values serve as the foundation for the binary number system.

• This knowledge is critical for tasks like converting between binary and decimal numbers.

• Binary addition works like decimal addition, but a carry occurs whenever a column adds up to 2 or more.

• Binary subtraction borrows from the next column when needed, just as decimal subtraction does.

• These operations are fundamental in all computer calculations.
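The conversion and addition rules above can be sketched in a few lines of Python. This is an illustrative sketch rather than a standard routine; the function names are hypothetical and the inputs are assumed to be strings containing only 0s and 1s:

```python
def binary_to_decimal(bits: str) -> int:
    """Add up the power of 2 for every position that holds a 1."""
    total = 0
    for position, bit in enumerate(reversed(bits)):
        total += int(bit) * 2 ** position
    return total


def decimal_to_binary(n: int) -> str:
    """Repeatedly divide by 2; the remainders, read backwards, are the bits."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % 2))
        n //= 2
    return "".join(reversed(digits))


def add_binary(a: str, b: str) -> str:
    """Add two binary strings column by column, carrying when a column reaches 2."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry = 0
    result = []
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        column = int(bit_a) + int(bit_b) + carry
        result.append(str(column % 2))
        carry = column // 2
    if carry:
        result.append("1")
    return "".join(reversed(result))


print(binary_to_decimal("1011"))   # 11
print(decimal_to_binary(11))       # 1011
print(add_binary("1011", "0110"))  # 10001 (11 + 6 = 17)
```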

Binary Multiplication and Division

• Binary multiplication is a digit-by-digit process and is analogous to decimal multiplication.

• Binary division uses a repetitive subtraction technique.

• Both are crucial for core computational tasks.

Hexadecimal

• Hexadecimal (hex) is a base-16 numbering system.

• It consists of digits 0-9 and letters A-F, where A represents 10, B represents 11, and so on.

• Each hex digit maps to a 4-bit binary nibble, making it an efficient way to represent long binary numbers in a more human-readable form.
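The nibble-to-hex mapping described above can be shown with a short sketch. The helper names are hypothetical, and the binary input is padded with leading zeros (an assumption of this sketch) so that it splits evenly into groups of 4 bits:

```python
HEX_DIGITS = "0123456789ABCDEF"


def binary_to_hex(bits: str) -> str:
    """Split the bits into 4-bit nibbles and map each nibble to one hex digit."""
    padded_length = (len(bits) + 3) // 4 * 4  # round up to a multiple of 4
    padded = bits.zfill(padded_length)
    nibbles = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    return "".join(HEX_DIGITS[int(nibble, 2)] for nibble in nibbles)


def hex_to_binary(hex_string: str) -> str:
    """Expand every hex digit into its 4-bit binary nibble."""
    return "".join(format(HEX_DIGITS.index(d), "04b") for d in hex_string.upper())


print(binary_to_hex("11111111"))  # FF
print(binary_to_hex("101101"))    # 2D
print(hex_to_binary("2D"))        # 00101101
```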

### Binary Shifts and Two's Complement

Binary Shifts

• Binary shifts are operations where bits are moved to the left or right.

• A left shift by one place multiplies a binary number by 2, whereas a right shift by one place divides it by 2 and discards any remainder. Shifting by n places multiplies or divides by 2^n.

• Shift operations are fundamental in computer programming and data manipulation.
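Python's built-in `<<` and `>>` operators show the effect of shifts directly. The sketch below assumes an 8-bit register for the left shift, so bits pushed past the left-hand end are dropped; the function names and the `width` parameter are illustrative choices, not part of the notes:

```python
def shift_left(value: int, places: int, width: int = 8) -> int:
    """Shift left and keep only the lowest `width` bits, like a fixed-size register."""
    return (value << places) & ((1 << width) - 1)


def shift_right(value: int, places: int) -> int:
    """Shift right; bits that fall off the right-hand end are discarded."""
    return value >> places


n = 0b00010110  # 22
print(format(shift_left(n, 1), "08b"))           # 00101100 (44 = 22 * 2)
print(format(shift_right(n, 1), "08b"))          # 00001011 (11 = 22 / 2)
print(shift_right(0b00010111, 1))                # 11 (23 / 2 with the remainder dropped)
print(format(shift_left(0b11000000, 1), "08b"))  # 10000000 (the top bit is lost)
```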

Two's Complement

• Two's complement is a system used to represent negative numbers in binary.

• In this system, the leftmost bit represents the sign, with 0 indicating positive and 1 indicating negative.

• To find the two's complement of a binary number (that is, to negate it), you invert all the bits (changing 0s to 1s and vice versa) and then add 1 to the result.

• This system is essential for accurately representing and performing calculations with negative numbers in digital systems.
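The invert-and-add-1 rule can be checked with a small sketch. It assumes fixed-width bit strings (8 bits in the example), and the function names are hypothetical:

```python
def twos_complement(bits: str) -> str:
    """Negate a fixed-width binary string: invert every bit, then add 1."""
    width = len(bits)
    inverted = "".join("1" if b == "0" else "0" for b in bits)
    # Wrap around at 2**width so the result keeps the same number of bits.
    value = (int(inverted, 2) + 1) % (2 ** width)
    return format(value, f"0{width}b")


def to_signed(bits: str) -> int:
    """Read a bit string as a two's complement number (leftmost bit is the sign)."""
    value = int(bits, 2)
    if bits[0] == "1":
        value -= 2 ** len(bits)
    return value


print(twos_complement("00000101"))  # 11111011
print(to_signed("11111011"))        # -5
print(to_signed("00000101"))        # 5
```

Applying `twos_complement` twice returns the original bit string, which is a quick way to check a hand calculation.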

### ASCII

ASCII Basics

• ASCII, or the American Standard Code for Information Interchange, is a character encoding scheme.

• It assigns a unique 7-bit binary code to each of 128 characters, including uppercase and lowercase letters, numbers, symbols, and control characters.

• This standard ensures that computers can communicate and store text-based data consistently.

Applications

• ASCII plays a crucial role in data storage, data processing, and data transmission, particularly for text-based information.

• It is a fundamental component of communication systems, including email, chat applications, and programming languages.

• Extended ASCII uses 8 bits to represent 256 different characters, expanding the range of supported symbols. The table below lists the standard ASCII codes 32 to 127, written as 8-bit binary values.

| Decimal | Binary | Character | Decimal | Binary | Character | Decimal | Binary | Character |
|---|---|---|---|---|---|---|---|---|
| 32 | 00100000 | space | 64 | 01000000 | @ | 96 | 01100000 | ` |
| 33 | 00100001 | ! | 65 | 01000001 | A | 97 | 01100001 | a |
| 34 | 00100010 | " | 66 | 01000010 | B | 98 | 01100010 | b |
| 35 | 00100011 | # | 67 | 01000011 | C | 99 | 01100011 | c |
| 36 | 00100100 | $ | 68 | 01000100 | D | 100 | 01100100 | d |
| 37 | 00100101 | % | 69 | 01000101 | E | 101 | 01100101 | e |
| 38 | 00100110 | & | 70 | 01000110 | F | 102 | 01100110 | f |
| 39 | 00100111 | ' | 71 | 01000111 | G | 103 | 01100111 | g |
| 40 | 00101000 | ( | 72 | 01001000 | H | 104 | 01101000 | h |
| 41 | 00101001 | ) | 73 | 01001001 | I | 105 | 01101001 | i |
| 42 | 00101010 | * | 74 | 01001010 | J | 106 | 01101010 | j |
| 43 | 00101011 | + | 75 | 01001011 | K | 107 | 01101011 | k |
| 44 | 00101100 | , | 76 | 01001100 | L | 108 | 01101100 | l |
| 45 | 00101101 | - | 77 | 01001101 | M | 109 | 01101101 | m |
| 46 | 00101110 | . | 78 | 01001110 | N | 110 | 01101110 | n |
| 47 | 00101111 | / | 79 | 01001111 | O | 111 | 01101111 | o |
| 48 | 00110000 | 0 | 80 | 01010000 | P | 112 | 01110000 | p |
| 49 | 00110001 | 1 | 81 | 01010001 | Q | 113 | 01110001 | q |
| 50 | 00110010 | 2 | 82 | 01010010 | R | 114 | 01110010 | r |
| 51 | 00110011 | 3 | 83 | 01010011 | S | 115 | 01110011 | s |
| 52 | 00110100 | 4 | 84 | 01010100 | T | 116 | 01110100 | t |
| 53 | 00110101 | 5 | 85 | 01010101 | U | 117 | 01110101 | u |
| 54 | 00110110 | 6 | 86 | 01010110 | V | 118 | 01110110 | v |
| 55 | 00110111 | 7 | 87 | 01010111 | W | 119 | 01110111 | w |
| 56 | 00111000 | 8 | 88 | 01011000 | X | 120 | 01111000 | x |
| 57 | 00111001 | 9 | 89 | 01011001 | Y | 121 | 01111001 | y |
| 58 | 00111010 | : | 90 | 01011010 | Z | 122 | 01111010 | z |
| 59 | 00111011 | ; | 91 | 01011011 | [ | 123 | 01111011 | { |
| 60 | 00111100 | < | 92 | 01011100 | \ | 124 | 01111100 | \| |
| 61 | 00111101 | = | 93 | 01011101 | ] | 125 | 01111101 | } |
| 62 | 00111110 | > | 94 | 01011110 | ^ | 126 | 01111110 | ~ |
| 63 | 00111111 | ? | 95 | 01011111 | _ | 127 | 01111111 | del |
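Python's built-in `ord` and `chr` functions convert between characters and their code numbers, so the table can be checked directly. A minimal sketch with hypothetical helper names, assuming the text contains only ASCII characters:

```python
def text_to_ascii_binary(text: str) -> list:
    """Return the 8-bit binary code for each character in the text."""
    return [format(ord(ch), "08b") for ch in text]


def ascii_binary_to_text(codes: list) -> str:
    """Turn a list of binary codes back into the characters they represent."""
    return "".join(chr(int(code, 2)) for code in codes)


print(ord("A"))                                          # 65
print(text_to_ascii_binary("Hi"))                        # ['01001000', '01101001']
print(ascii_binary_to_text(["01001000", "01101001"]))    # Hi
```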

### Images

Pixels

• Digital images are constructed from individual picture elements or pixels. These pixels are the smallest units in an image.

• More pixels result in higher resolution, allowing for more detailed and clearer images.

• The colour of each pixel is commonly stored as a combination of red, green, and blue (RGB) values, which together can produce the full spectrum of colours in an image.

Colour Depth

• Colour depth represents the number of bits assigned to each pixel in an image. It determines how many different colours can be displayed.

• For example, an image with 8 bits of colour depth can display 256 different colours, while 24-bit colour depth can show over 16 million colours.

• Higher colour depth leads to more accurate and vibrant colour representation, but it also increases the file size.
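The colour counts above follow from the number of bit patterns available: n bits give 2^n colours. The sketch below also estimates the size of an uncompressed image as width × height × colour depth; that estimate ignores file headers and metadata, which is an assumption of this example, and the function names are illustrative:

```python
def colours_available(bits_per_pixel: int) -> int:
    """Each extra bit doubles the number of colours a pixel can take."""
    return 2 ** bits_per_pixel


def uncompressed_image_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Estimate raw image size in bytes, rounding up to a whole byte."""
    total_bits = width * height * bits_per_pixel
    return (total_bits + 7) // 8


print(colours_available(8))                        # 256
print(colours_available(24))                       # 16777216
print(uncompressed_image_bytes(1920, 1080, 24))    # 6220800
```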

Image Formats

• Different image formats are used for various purposes:

• JPEG (Joint Photographic Experts Group): Uses lossy compression to balance image quality and file size, making it suitable for photographs.

• PNG (Portable Network Graphics): Utilises lossless compression and supports transparency, making it ideal for graphics with sharp edges and clear backgrounds.

• GIF (Graphics Interchange Format): Supports simple animations and transparency. GIFs use lossless compression but are limited to a palette of 256 colours.

### Sound

Digital Audio

• Sound is typically represented in digital systems through the process of digitisation.

• Sound waves are sampled, with each sample being assigned a binary value to create a digital representation.

• The sampling rate, measured in Hertz (Hz), defines the number of samples taken per second.
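Sampling can be illustrated by measuring a sine wave at regular intervals and storing each measurement as a binary value. This is a simplified sketch: the sine wave stands in for a real sound, and the `bit_depth` parameter (the number of bits per sample) is an assumption of the example rather than something defined in these notes:

```python
import math


def sample_sine(frequency_hz: float, sample_rate_hz: int,
                duration_s: float, bit_depth: int = 8) -> list:
    """Sample a sine wave and return each sample as a binary string."""
    levels = 2 ** bit_depth
    n_samples = round(sample_rate_hz * duration_s)
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz                                # time of this sample
        amplitude = math.sin(2 * math.pi * frequency_hz * t)  # between -1 and 1
        level = round((amplitude + 1) / 2 * (levels - 1))     # between 0 and levels - 1
        samples.append(format(level, f"0{bit_depth}b"))
    return samples


# A 1 Hz wave sampled 4 times per second for 1 second gives 4 samples.
for sample in sample_sine(1, 4, 1):
    print(sample)
```

A higher sampling rate produces more samples per second, so the digital version follows the original wave more closely at the cost of more data.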

Audio Formats

• Different audio formats are used for various applications:

• WAV (Waveform Audio File Format): An uncompressed audio format, providing high quality but consuming more storage space. Commonly used for audio recording.

• MP3 (MPEG-1 Audio Layer III): Utilises lossy compression to significantly reduce file size while maintaining acceptable audio quality. Widely used for music storage and streaming.

### Compression

Lossless vs. Lossy

• Data compression is essential for optimising storage space and speeding up data transmission.

• Two primary compression types are used:

• Lossless compression (e.g., ZIP): Retains all original data, making it ideal for text and program files where every bit of data must be preserved.

• Lossy compression (e.g., JPEG for images, MP3 for audio): Sacrifices some data in exchange for higher compression rates. It is commonly used for multimedia, web content, and network transmission.
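The difference between the two types can be seen in a short sketch. The lossless half uses Python's standard `zlib` module, which implements DEFLATE, the algorithm commonly used inside ZIP files. The lossy half is a deliberately crude stand-in (rounding values down to a multiple of `step`) and is not how JPEG or MP3 actually work; `lossy_quantise` is a hypothetical name:

```python
import zlib

# Lossless: the decompressed data is identical to the original.
original = b"AAAAABBBBBCCCCC" * 100
compressed = zlib.compress(original)
restored = zlib.decompress(compressed)
print(len(original), len(compressed))  # the compressed size is much smaller
print(restored == original)            # True


def lossy_quantise(samples: list, step: int = 16) -> list:
    """Toy lossy step: round every value down to a multiple of `step`.

    Fewer distinct values compress better, but the detail that was
    rounded away cannot be recovered.
    """
    return [s // step * step for s in samples]


print(lossy_quantise([0, 7, 200, 255]))  # [0, 0, 192, 240]
```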

Run Length Encoding (RLE)

• RLE looks at the data in a file for consecutive runs of the same data. These runs are stored as one item of data, instead of many.

• Example: The data for a file is 00 00 00 11 11 11 11 00 00 00, which is ten data values of two characters each, giving 20 characters in total. RLE finds each consecutive run of the same value (0 or 1) and records how many times the value repeats, followed by the value itself. These count-and-value pairs are stored instead of the original data, making the file smaller without any loss of quality.

• This is a form of lossless compression because no data is lost; repeated values are only grouped into runs.

So:

• 00000011111111000000 - 20 characters - becomes

• 608160 - 6 characters (six 0s, eight 1s, six 0s)
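The worked example can be reproduced with a minimal sketch of RLE. It stores each run as a count followed by the value, as in the example above. To keep decoding unambiguous it caps each run at 9, so that every pair is exactly two characters; that cap and the function names are assumptions of this sketch, not part of any standard:

```python
def rle_encode(data: str) -> str:
    """Replace each run of a repeated character with its count and the character."""
    if not data:
        return ""
    encoded = []
    current, count = data[0], 1
    for ch in data[1:]:
        if ch == current and count < 9:  # cap runs at 9 so counts stay one digit
            count += 1
        else:
            encoded.append(f"{count}{current}")
            current, count = ch, 1
    encoded.append(f"{count}{current}")
    return "".join(encoded)


def rle_decode(encoded: str) -> str:
    """Expand each count-and-value pair back into the original run."""
    pairs = [encoded[i:i + 2] for i in range(0, len(encoded), 2)]
    return "".join(value * int(count) for count, value in pairs)


data = "00000011111111000000"
packed = rle_encode(data)
print(packed)                      # 608160
print(rle_decode(packed) == data)  # True
```

Note that RLE only helps when the data contains long runs; data that alternates frequently, such as 010101, would come out longer than the original.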