Lecture Notes on Digital Signature Schemes and Cryptographic Hash Functions
Problems with Signature Schemes
- Basic Model:
- Input file divided into blocks. Each block is a plaintext element P. The file is a sequence of elements x<em>1,x</em>2,…,xn.
- Each x<em>i is encrypted separately to create y</em>i.
- The same model is used for signature schemes, leading to problems.
- Problems
- Length of the signature.
- RSA Signature Scheme
- Size of plaintext space is the same as the size of the signature space (both are Zn).
- If one plaintext element is 512 bits, the signature is also 512 bits.
- If the input file is 10 megabytes, the signature is also 10 megabytes.
- The signed file becomes 20 megabytes.
- Elgamal Signature Scheme
- Plaintext space is Z<em>p<em>, but the signature space is Z</em>p</em>×Zp−1 .
- The length of the signature is approximately two times the size of the message.
- If the file is 10 megabytes, the signature is 20 megabytes, and the signed file becomes 30 megabytes.
- This requires a lot of bandwidth to transmit the file, which is inefficient.
- Amount of computation.
- The time required to create the signature is too long.
- Public key cryptosystems are slow.
- Hybrid encryption is used to solve this problem in encryption, but signature creation still relies on public key cryptosystems.
- The problem is exacerbated because a signature must be created for each block of the plaintext.
- Cut and Paste Attack.
- If each plaintext block is signed separately, an attacker can reassemble the components to create a different message.
- The attacker can move signed components around.
- The signature remains verifiable on each component, so the sender cannot deny the altered message.
- Example of Cut and Paste Attack
- A note saying, "I owe you $100 and not $1,000,000 as you said."
- The attacker has the signature on "$100", and the signature on "$1,000,000."
- The attacker can rearrange the message to read, "I owe you $1,000,000 and not $100 as you said."
- Solution: Hash Functions
- Using hash functions solves all three problems.
Cryptographic Hash Functions
- Definition
- A hash function is a function that maps an arbitrary-length message to a fixed-size output.
- Denoted as h.
- The co-domain is typically a bit string of a specific length, e.g., 160 bits as in the Digital Signature Standard.
- h:0,1∗→0,1160, where 0,1∗ represents an arbitrary string of arbitrary length.
- The New Model
- Instead of signing each plaintext element separately, the entire message is hashed first.
- The output of the hash function is called the message digest (z).
- The message digest is then signed, and the signature is attached to the message.
- Alice creates her signature on the message digest z and attaches the signature y to the message x.
- The pair x, y is transmitted over the channel
- Signing and Verification Process
- Alice takes a long message $x$ and computes z=h(x).
- Alice signs z to get y=sig(z).
- Alice sends (x,y).
- Bob receives (x,y).
- Bob computes z′=h(x).
- Bob verifies the signature y on z′ using Alice's public key.
- verify(pkA,z′,y)==True
- Bob computes z=h(x) because the hash function is public.
- Bob checks if y is a valid signature of Alice on the message digest z.
- Benefits of Using Hash Functions
- Solves the problem of long signature lengths.
- If a file is 10 megabytes long, hashing compresses it to a 160-bit message digest.
- Using RSA, the signature is 160 bits, so the total file size is 10 megabytes + 160 bits.
- Using Elgamal, the signature would be 320 bits (twice the message digest size), resulting in 10 megabytes + 320 bits.
- The overhead is independent of the size of the original file.
- Reduces computation time.
- The signing algorithm is used only once on the message digest instead of on each plaintext block.
- Prevents cut and paste attacks.
- The entire message is signed holistically.
- Attacks and Properties of Hash Functions
- Hash functions must be carefully designed to avoid attacks.
- A good cryptographic hash function should have specific properties to ensure security.
Security Properties of Cryptographic Hash Functions
- Infinite Domain to Finite Range
- Hash functions map an infinite domain and codomain.
- Because the domain is larger than the co-domain, collisions are unavoidable.
- First Attack: Second Preimage Problem
- Goal: Given a hash function $h$ and a message x<em>1, find another message x</em>2!=x<em>1 such that h(x</em>1)=h(x2).
- Attack Scenario:
- Alice sends (x<em>1,y</em>1), where y<em>1=sig(h(x</em>1)).
- Oscar observes (x<em>1,y</em>1).
- Oscar computes x<em>2 such that h(x</em>2)=h(x1).
- Oscar sends (x<em>2,y</em>1) to Bob, posing as Alice.
- Bob verifies the signature, and it appears valid because h(x<em>2)=h(x</em>1).
- Second Preimage Resistance
- Definition: A hash function is second preimage resistant if, given h and x<em>1, it is computationally infeasible to find x</em>2!=x<em>1 such that h(x</em>1)=h(x2).
- Second Attack: Collision Problem
- Goal: Find two distinct messages x<em>1 and x</em>2 such that h(x<em>1)=h(x</em>2).
- Attack Scenario:
- Oscar finds x<em>1 and x</em>2 such that h(x<em>1)=h(x</em>2)=z.
- Oscar convinces Alice to sign x<em>1, resulting in (x</em>1,y), where y=sig(h(x1)).
- Oscar sends (x2,y) to Bob.
- Bob verifies the signature, and it appears valid because h(x<em>2)=h(x</em>1).
- Collision Resistance:
- Definition: A hash function is collision-resistant if it is computationally infeasible to find distinct messages x<em>1 and x</em>2 such that h(x<em>1)=h(x</em>2).
- Third Attack: Preimage Problem (One-Way Hash Function)
- Exploits the ability to sign random messages in signature schemes.
- Attack Scenario:
- Oscar computes a signature y and constructs a value z such that verify(z,y)==True.
- Oscar finds a message x such that h(x)=z.
- Oscar sends (x,y) to Bob.
- Bob verifies the signature, and it appears valid because h(x)=z.
- Preimage Resistance (One-Way Hash Function):
- Definition: A hash function is preimage resistant (one-way) if, given h and z, it is computationally infeasible to find x such that h(x)=z.
- Computing h(x) should be easy, but computing x given h(x) should be hard.
Implications and Relationships:
- Collision resistance is the most important property. If a hash function is collision-resistant, it is also one-way and second-preimage resistant.
- If a hash function is not second preimage resistant, it is not collision resistant.
- If it is easy to find a secondary image, then you have found a collision.
Hash Function Constructions and Standards
- Provably Secure Hash Functions
- In 1991, David Chaum, Juan Hastad, and Bridget Pfitzmann created a provably secure hash function.
- Assuming the discrete log problem is difficult, their hash function is collision-resistant.
- The construction is theoretical and not used in practice.
- Merkle-Damgård Construction
- Uses a compression function to compress a long bit string to a short string.
- Iterates the compression function to create a hash function from an infinite domain.
MD Series - Ron Rivest created a sequence of algorithms: MD0, MD1, MD2, MD3, MD4, and MD5.
- MD4 (1990) and MD5 (1992) are real hash functions.
- Secure Hash Algorithm (SHA) Series:
- NIST created the Secure Hash Algorithm (SHA) standard in 1993.
- SHA-0:
- Based on MD4.
- Vulnerabilities were found in MD4 and MD5.
- SHA-1:
- Collisions were found in 2017.
- SHA-2:
- Multiple sizes: 224, 256, 384, and 512 bits.
- Not broken, but uses the same principles as SHA-1.
- SHA-3:
- Became a standard in 2015.
- Based on the sponge construction.
- Absorbing and squeezing phases.