Last time, we started a new set of slides on cryptography and its applications.
We defined some terms and discussed cryptographic keys.
We explored the transposition encryption method with an example:
Plain text: "security begins with you"
Keyword: "teacups"
We demonstrated how to encrypt the plain text into ciphertext using the keyword.
We also showed how to reverse the process to recover the plain text from the ciphertext using the same keyword.
Two basic encryption methods:
Transposition method (covered last time)
Substitution cipher (to be covered now)
Example:
Plain text: "the butler did it"
Cipher: Rotation 13 (Rot13)
Basic Method:
Write out the first half of the alphabet (A-M) on one row.
Write out the second half of the alphabet (N-Z) on the row beneath it.
Substitute letters in the plain text with the letter above or below in the two-row alphabet.
Alphabet setup:
A B C D E F G H I J K L M
N O P Q R S T U V W X Y Z
Plain text: "the butler did it"
Encryption:
T becomes G
H becomes U
E becomes R
B becomes O
U becomes H
T becomes G
L becomes Y
E becomes R
R becomes E
D becomes Q
I becomes V
D becomes Q
I becomes V
T becomes G
The ciphertext is "GU RE OHTE QVG".
Decryption:
Reverse the process using the same two-row table.
G becomes T, U becomes H, R becomes E, and so on.
Shift ciphers can be broken using frequency analysis.
In the English language, letters do not appear with equal frequency.
e appears 12.7% of the time.
z appears 0.1% of the time.
A shift cipher involves shifting each letter by k places.
If k = 3, A becomes D, and so on.
Frequency analysis:
Plot a bar chart of letter frequencies in the ciphertext.
Compare it to the known frequency distribution of letters in the English language.
Look for a shifted version of the standard distribution.
Exhaustive search:
Try every possible shift (k = 1 to 26).
Ciphertext's frequency analysis might not exactly match the expected distribution due to the limited size and specific content of the text.
The most frequent letter in the English language is E.
Identify the most frequent letter in the ciphertext.
Determine the shift required to align the most frequent ciphertext letter with E.
Also consider plain text letter A and possible shifts based on analysis to help narrow down possibilities.
By comparing the two graphs, we can estimate the shift amount.
If G is the highest column and is considered the letter E, then the shift is 13.
If a shift of k = 13 yields plain English, the encryption is cracked.
Example: The plain text obtained is "to be or not to be that is the question."
A different encryption method.
Key is used to shift the alphabet a certain number of places, wrapping around if needed.
Write out the alphabet and number each letter (starting from 0).
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
Keyword: "math"
Plain text: "make it happen"
Determine numerical values for each letter in the keyword:
M = 12
A = 0
T = 19
H = 7
Repeat the keyword's numerical sequence beneath the plain text:
make it happen
12 0 19 7 12 0 19 7 12 0 19
Shift each letter in the plain text by the corresponding number:
M shifted 12 places becomes Y.
A shifted 0 places remains A.
K shifted 19 places becomes D.
E shifted 7 places becomes L.
(and so on…)
Ciphertext: "YA DL UT AHB P XU"
Ciphertext: "I know it when i see it"
Keyword: "find"
Determine numerical values for each letter in the keyword:
F = 5
I = 8
N = 13
D = 3
Repeat the keyword's numerical sequence beneath the ciphertext:
I know it when i see it
5 8 13 3 5 8 13 3 5 8 13 3 5 8
Shift each letter in the ciphertext backwards by the corresponding number to recover the plain text.
I shifted back 5 places becomes D.
S shifted back 8 places becomes O.
Decoding continues
A system that has not been broken is considered strong.
Ciphertext should appear random to standard statistical tests to be secure.
The number of possible keys is 26.
Encryption can be broken using an exhaustive search (trying every possible shift).
An alternative method involves using a random key permutation for each letter in the alphabet.
Number of possible keys increases dramatically to 26 factorial (26!).
More difficult to attack via brute force but can still be broken using frequency statistics and facts of the English language.
Mary Queen of Scots' plot was unmasked via codebook analysis.
Elizabeth I's agents decoded messages by guessing contents and testing hypotheses.
Agents identified and arrested plotters.
Mary was convicted of high treason based on decoded letters and later executed.
Encryption requires a key to be used in conjunction with the cipher.
Two possibilities:
Symmetric Key Cryptography: The same key is used for encryption and decryption.
Asymmetric Key Cryptography: A key pair is used, with one key for encryption and a different key for decryption.
Used in ATM machines.
Examples:
Data Encryption Standard (DES): 64-bit key.
Triple DES: 128-bit key.
Advanced Encryption Standard (AES): supersedes DES, using 128-bit key (or larger).
Block ciphers: plain text is divided into blocks, and each block is encrypted separately.
Encryption key is different from the decryption key.
Requires a public key and a private key.
The private key must be kept secret.
The public key can be shared with intended recipients.
Ensures messages are from their advertised source.
Can be used to form digitally signed communications.
Digital signing means the communication is verified to be from who it claims to be from and has not been altered.
Hashing involves transforming data into a distilled form called a message digest (or hash).
Hashing is not reversible (one-way process).
Typical algorithms:
Secure Hashing Algorithm (SHA), e.g., SHA-1.
Message Digest 5 (MD5).
Fixed length: the hash output is always the same length, regardless of the input data size.
Unique to the data: different messages produce different hashes (with very high probability).
Impossible to recover the original message from the message digest.
Creating digitally signed documents.
Verifying downloaded code:
Calculate the hash of the downloaded software.
Compare it with the hash provided by the software supplier.
Different hashes indicate corrupted data.
Forensics investigation:
Calculate the MD5 hash of the evidential data and its bit-for-bit copy.
Ensure the hashes are identical.
Third parties can replicate the analysis by verifying the hash of the data they're working on.
Using SHA-1, the probability of two different messages producing the same hash is 1 in 10^{48}, which is practically zero.
Fixed Length
Data uniqueness
Irreversible
Last time, we started a new set of slides on cryptography and its applications.
We defined some terms and discussed cryptographic keys. Key size matters because larger keys provide more possible combinations, making it exponentially harder for attackers to guess the key through brute force methods. For example, doubling the key length from n bits to 2n bits increases the number of possible keys from 2^n to 2^{2n}, a massive increase in security.
We explored the transposition encryption method with an example: - Plain text: "security begins with you"
Keyword: "teacups"
We demonstrated how to encrypt the plain text into ciphertext using the keyword. Transposition ciphers rearrange the order of characters in the plaintext to form the ciphertext. The key determines the specific rearrangement. This method's strength depends on the complexity of the transposition algorithm and key length.
We also showed how to reverse the process to recover the plain text from the ciphertext using the same keyword. Decryption involves reversing the transposition using the same key. If the key is lost or compromised, decrypting the message becomes difficult or impossible.
Two basic encryption methods: - Transposition method (covered last time)
Substitution cipher (to be covered now)
Example: - Plain text: "the butler did it"
Cipher: Rotation 13 (Rot13)
Basic Method: - Write out the first half of the alphabet (A-M) on one row.
Write out the second half of the alphabet (N-Z) on the row beneath it.
Substitute letters in the plain text with the letter above or below in the two-row alphabet.
Alphabet setup:
A B C D E F G H I J K L M
N O P Q R S T U V W X Y Z
Plain text: "the butler did it"
Encryption:
T becomes G
H becomes U
E becomes R
B becomes O
U becomes H
T becomes G
L becomes Y
E becomes R
R becomes E
D becomes Q
I becomes V
D becomes Q
I becomes V
T becomes G
The ciphertext is "GU RE OHTE QVG".
Decryption:
Reverse the process using the same two-row table.
G becomes T, U becomes H, R becomes E, and so on.
Decryption is the reverse process of encryption. In Rot13, the same algorithm and key are used for both encryption and decryption, making it a reciprocal cipher.
Shift ciphers can be broken using frequency analysis. Frequency analysis involves examining the frequency of letters in the ciphertext and comparing it to the known frequency distribution of letters in the English language. For example, in English texts:
e appears 12.7% of the time.
z appears 0.1% of the time.
A shift cipher involves shifting each letter by k places. - If k = 3, A becomes D, and so on.
Frequency analysis: - Plot a bar chart of letter frequencies in the ciphertext.
Compare it to the known frequency distribution of letters in the English language.
Look for a shifted version of the standard distribution.
Exhaustive search: - Try every possible shift (k = 1 to 26).
Ciphertext's frequency analysis might not exactly match the expected distribution due to the limited size and specific content of the text.
The most frequent letter in the English language is E.
Identify the most frequent letter in the ciphertext. If the ciphertext is long enough, the most frequent letter is likely the encryption of 'E.'
Determine the shift required to align the most frequent ciphertext letter with E. This shift value is a potential key.
Also consider plain text letter A and possible shifts based on analysis to help narrow down possibilities. Analyzing other common letters (T, A, O, I, N, S, H, R) and their frequencies can help narrow down the possible shift values.
By comparing the two graphs, we can estimate the shift amount.
If G is the highest column and is considered the letter E, then the shift is 13.
If a shift of k = 13 yields plain English, the encryption is cracked. Shift the entire ciphertext by k = 13. If the result is readable, the code is cracked.
Example: The plain text obtained is "to be or not to be that is the question."
A different encryption method. Vigenère cipher is a method of encrypting alphabetic text by using a polyalphabetic substitution based on a keyword.
Key is used to shift the alphabet a certain number of places, wrapping around if needed. Each letter of the plaintext is shifted differently based on the keyword.
Write out the alphabet and number each letter (starting from 0).
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
Keyword: "math"
Plain text: "make it happen"
Determine numerical values for each letter in the keyword:
M = 12
A = 0
T = 19
H = 7
Repeat the keyword's numerical sequence beneath the plain text:
make it happen
12 0 19 7 12 0 19 7 12 0 19
Shift each letter in the plain text by the corresponding number:
M shifted 12 places becomes Y.
A shifted 0 places remains A.
K shifted 19 places becomes D.
E shifted 7 places becomes L.
(and so on…)
Ciphertext: "YA DL UT AHB P XU"
Ciphertext: "I know it when i see it"
Keyword: "find"
Determine numerical values for each letter in the keyword:
F = 5
I = 8
N = 13
D = 3
Repeat the keyword's numerical sequence beneath the ciphertext:
I know it when i see it
5 8 13 3 5 8 13 3 5 8 13 3 5 8
Shift each letter in the ciphertext backwards by the corresponding number to recover the plain text.
I shifted back 5 places becomes D.
S shifted back 8 places becomes O.
Decoding continues
A system that has not been broken is considered strong. If a cipher resists all known cryptanalytic attacks, it's considered robust.
Ciphertext should appear random to standard statistical tests to be secure. Randomness ensures that there are no discernible patterns that can be exploited.
The number of possible keys is 26. Since there are only 26 possible shifts for English alphabet.
Encryption can be broken using an exhaustive search (trying every possible shift). Trying all possible keys to decrypt the ciphertext.
An alternative method involves using a random key permutation for each letter in the alphabet. Involves randomly assigning each letter of the alphabet to another letter.
Number of possible keys increases dramatically to 26 factorial (26!). Specifically, 26! = 4.03 \times 10^{26}, making brute-force attacks infeasible.
More difficult to attack via brute force but can still be broken using frequency statistics and facts of the English language. Despite the large key space, frequency analysis can still reveal information about the key.
Mary Queen of Scots' plot was unmasked via codebook analysis. Codebooks list codewords for phrases and names, making communication efficient but also vulnerable if the codebook is compromised.
Elizabeth I's agents decoded messages by guessing contents and testing hypotheses. Agents like Sir Francis Walsingham made educated guesses and looked for patterns.
Agents identified and arrested plotters. Key individuals involved in the conspiracy were apprehended.
Mary was convicted of high treason based on decoded letters and later executed. The decoded messages provided critical evidence against her.
Encryption requires a key to be used in conjunction with the cipher. Key management is critical in maintaining security.
Two possibilities: - Symmetric Key Cryptography: The same key is used for encryption and decryption. Both sender and receiver use the same key.
Asymmetric Key Cryptography: A key pair is used, with one key for encryption and a different key for decryption. One key is public, and one is private.
Used in ATM machines. ATM machines use symmetric encryption to protect PINs and transaction data.
Examples: - Data Encryption Standard (DES): 64-bit key. An early symmetric key algorithm.
Triple DES: 128-bit key. A more secure version of DES.
Advanced Encryption Standard (AES): supersedes DES, using 128-bit key (or larger). AES supports various key sizes (128, 192, 256 bits) offering different levels of security.
Block ciphers: plain text is divided into blocks, and each block is encrypted separately. Common block sizes are 64 bits (e.g., DES) or 128 bits (e.g., AES).
Encryption key is different from the decryption key. The encryption key is public, while the decryption key is private.
Requires a public key and a private key. The public key is used to encrypt messages, and the private key is used to decrypt them.
The private key must be kept secret. Compromising the private key compromises the entire system.
The public key can be shared with intended recipients. Anyone can encrypt a message using the public key, but only the holder of the private key can decrypt it.
Ensures messages are from their advertised source. By signing messages with a private key, recipients can verify the sender's identity using the corresponding public key.
Can be used to form digitally signed communications. Digital signatures provide integrity and non-repudiation.
Digital signing means the communication is verified to be from who it claims to be from and has not been altered. Provides assurance that the message hasn't been tampered with.
Hashing involves transforming data into a distilled form called a message digest (or hash). A hash function takes an input and produces a fixed-size string of characters.
Hashing is not reversible (one-way process). It's computationally infeasible to derive the original input from the hash value.
Typical algorithms: - Secure Hashing Algorithm (SHA), e.g., SHA-1. SHA-1 produces a 160-bit hash value.
Message Digest 5 (MD5). MD5 produces a 128-bit hash value.
Fixed length: the hash output is always the same length, regardless of the input data size. Ensures uniformity and predictability.
Unique to the data: different messages produce different hashes (with very high probability). Small changes in the input data result in significant changes in the hash value.
Impossible to recover the original message from the message digest. Hashing is designed to be a one-way function.
### Uses of Hashing
Creating digitally signed documents. Hash the document, then encrypt the hash with the private key.
Verifying downloaded code: - Calculate the hash of the downloaded software.
Compare it with the hash provided by the software supplier.
Different hashes indicate corrupted data. Ensures the integrity of the downloaded file.
Forensics investigation: - Calculate the MD5 hash of the evidential data and its bit-for-bit copy.
Ensure the hashes are identical.
Third parties can replicate the analysis by verifying the hash of the data they're working on. Maintains the chain of custody and ensures data integrity.
Using SHA-1, the probability of two different messages producing the same hash is 1 in 10^{48}, which is practically zero. Although