Cryptography and System Security - Hashes Message Digests and Digital Certificates
Hashes, Message Digests and Digital Certificates
Cryptographic Hash Functions
A hash function maps digital data of arbitrary size to digital data of a fixed size. The values returned are called hash values, hash codes, or simply hashes.
Where:
is a hash function.
is a message.
is the authentication tag (message digest).
It is considered practically impossible to invert a cryptographic hash function.
Data Integrity
Cryptographic hash functions ensure data integrity. For example, Bob can verify if .
Alice sends Bob the message and hash .
Properties of Cryptographic Hash Functions
Computationally infeasible to find data mapping to specific hash (one-way property).
Computationally infeasible to find two data to the same hash (collision-free property).
Properties of Hash Functions
Ideal cryptographic hash functions have these properties:
Easy to compute the hash value for any given message.
Infeasible to generate a message from its hash (one-way commitment).
Infeasible to modify a message without changing the hash.
Infeasible to find two different messages with the same hash.
These properties ensure that a malicious adversary cannot replace or modify the input data without changing its digest. If two strings have the same digest, one can be very confident that they are identical.
Requirements
Compression: The output length of is fixed and small, regardless of the size of input .
Efficiency: It must be easy to compute for any input . Computational effort grows with the length of , but should not grow too fast.
Security Properties
A cryptographic hash function must withstand known cryptanalytic attacks and have the following properties:
Pre-image resistance: Given a hash value , it should be difficult to find any message such that . This is related to a one-way function.
Second pre-image resistance (Weak collision resistance): Given and , it’s infeasible to find with , such that .
Strong Collision resistance: It should be difficult to find two different messages and , with , such that . Such a pair is called a cryptographic hash collision.
Hash Function Uses
Message Integrity Check (MIC): Send the hash of the message (digest). MIC always encrypts the message optionally.
Message Authentication Code (MAC): Send a keyed hash of the message. The message can be optionally encrypted.
Digital Signature (non-repudiation): Encrypt the hash with a private (signing) key and verify with a public (verification) key.
Message Authentication Code (MAC)
Message authentication is achieved using a message authentication code (MAC), also known as a keyed hash function. MACs are used between two parties that share a secret key to authenticate information exchanged between those parties.
A MAC function takes as input a secret key and a data block, producing a hash value (MAC) associated with the protected message. An attacker altering the message will be unable to alter the associated MAC value without knowing the secret key.
is a function of a variable-length message and a secret key , producing a fixed-size output secure against an opponent who does not know the secret key.
Hash Functions & Message Authentication
Symmetric Key - Unkeyed Hash
a) Message encrypted
b) Message unencrypted
Symmetric Key - Keyed Hash
c) Message unencrypted
d) Message encrypted
Digital Signatures
The operation of a digital signature is similar to that of the MAC. The hash value of a message is encrypted with a user’s private key. Anyone who knows the user’s public key can verify the integrity of the message associated with the digital signature. An attacker needs to know the private key to alter the message.
The hash code is encrypted using public-key encryption with the sender’s private key, providing authentication and a digital signature. Only the sender could have produced the encrypted hash code.
If confidentiality and a digital signature are desired, the message plus the private-key-encrypted hash code can be encrypted using a symmetric secret key.
Hash Functions & Digital Signatures - PKCS
(a) Source A sends message to Destination B with digital signature. The hash of message is encrypted with source A's private key () and sent along with the message. Destination B decrypts the encrypted hash with source A's public key () and compares it with the hash of the received message.
(b) Source A sends message encrypted with symmetric key along with the hash of message encrypted with source A's private key (). Destination B decrypts the symmetric key encrypted data using the symmetric key and decrypts the encrypted hash with source A's public key () and compares it with the hash of the received message.
Other Hash Function Uses
Information security applications:
Verifying the integrity of files or messages
Password verification: store the hash of the password, not the actual password
Proof-of-work
File or data identifier
For intrusion detection and virus detection: keep & check hash of files on system
Pseudorandom function (PRF):
Generate session keys
Produce key from password
Derive keys from master key cooperatively
Pseudorandom number generator (PRNG):
Vernam Cipher/OTP
S/Key, proof of “what you have” via messages
To create a one-way password file:
Store hash of password not actual password (e.g., Unix, Windows NT, etc.)
Salt to deter pre-computation attacks
*Rainbow tables (is a precomputed table for reversing cryptographic hash functions, usually for cracking password hashes. Tables are usually used in recovering a password (or credit card numbers, etc.) up to a certain length consisting of a limited set of characters.
For intrusion detection and virus detection:
Keep & check hash of files on system (e.g., Tripwire - Is a free software security and data integrity tool for monitoring and alerting on specific file change(s) on a range of systems.)
Applications
Information security applications:
Authentication: Digital signatures, Message authentication codes (MACs).
Ordinary hash functions: to index data in hash tables.
Fingerprinting: to detect duplicate data or uniquely identify files.
Checksums: to detect accidental data corruption.
Cryptographic hash values are sometimes called (digital) fingerprints, checksums, or just hash values
Hash Function Requirements
Requirement | Description |
|---|---|
Variable input size | can be applied to a block of data of any size. |
Fixed output size | produces a fixed-length output. |
Efficiency | is relatively easy to compute for any given , making both hardware and software implementations practical. |
Preimage resistant (one-way property) | For any given hash value , it is computationally infeasible to find such that . |
Second preimage resistant | For any given block , it is computationally infeasible to find with . |
Collision resistant | It is computationally infeasible to find any pair such that . |
Pseudorandomness | Output of meets standard tests for pseudorandomness |
Attacks on Hash Functions
Brute-force attacks: simplest method to gain access by trying various combinations of usernames and passwords.
Cryptanalysis: attacks exploit some property of the algorithm so faster than exhaustive search.
A preimage or second preimage attack: Find such that equals a given hash value.
Collision resistance: Find two messages & with the same hash so . Hence value determines strength of hash code against brute-force attacks. 128-bits inadequate, 160-bits suspect.
Birthday Attacks
The probability that all 23 people have different birthdays is
Therefore, the probability of at least two having the same birthday is
More generally, suppose we have objects, where is large. There are people, and each chooses an object. Then
Choosing , we find that if , then the probability is 50% that at least two people choose the same object.
If there are possibilities and we have a list of length , then there is a good chance of a match.
If we want to increase the chance of a match, we can make a list of length of a constant times .
Hash Function Cryptanalysis
Cryptanalytic attacks exploit some property of the algorithm, making them faster than exhaustive search. Hash functions use iterative structure to process messages in blocks (including length). Attacks focus on collisions in function .
Block Ciphers as Hash Functions
Can use block ciphers as hash functions. Using and zero-pad of the final block. Compute:
and use the final block as the hash value. The resulting hash is too small (64-bit), both due to direct birthday attack and “meet-in-the-middle” attack. Other variants are also susceptible to attack.
Block cipher key length
Pad Message to a multiple of
Break padded into blocks
Use blocks of as keys in the block cipher, iteratively encrypting the state value starting with constant , resulting in a hash value.
MD5: Message Digest
Designed by Ronald Rivest in 1991.
Latest in a series of MD2, MD4.
It takes an input of arbitrary length.
Produces a 128-bit message digest.
The input message is processed in 512-bit blocks.
MD5 has become a popular hashing tool through PHP.
The most widely used hash algorithm is MD5.
MD5 is not collision-resistant.
MD5 hash is a 32-digit hexadecimal number.
The MD5 algorithm is intended for digital signature applications, where a large file must be "compressed" securely before being encrypted with a private (secret) key under a public-key cryptosystem such as RSA.
Long messages need an integrity check before encryption.
MD5 Algorithm Steps
Append padding bits
Append length
Initialize MD buffer
Process messages in 512-bit (16 words of 32-bit) blocks
Output (message digest)
MD5 Implementation Steps
Step 1: Append padding bits
The input message is
/