Hashing, MACs, and Attacks – Study Notes

  • Topics covered: Hashing, vulnerability to MiTM, keyed hashing (MAC), replay attacks, and secure protocols (HTTPS/TLS, IPsec) with a two-phase pattern.

Hashing basics

  • Q1: What is hashing?

    • A hash function takes an input of any number of data bytes and produces a fixed-size hash value (digest).
    • Example sizes:
    • $\text{SHA-1}() \rightarrow 20 \text{ bytes } (160 \text{ bits})$
    • $\text{SHA-256}() \rightarrow 32 \text{ bytes } (256 \text{ bits})$
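
The fixed digest sizes above can be checked with a short sketch. It assumes Python's standard `hashlib` module; the sample inputs and the constant names `SHA1_SIZE` and `SHA256_SIZE` are illustrative and not part of the notes.

```python
import hashlib

# Inputs of very different lengths all map to a digest of the same size.
for message in (b"", b"hello", b"x" * 1_000_000):
    sha1_digest = hashlib.sha1(message).digest()
    sha256_digest = hashlib.sha256(message).digest()
    print(len(message), len(sha1_digest), len(sha256_digest))

# digest_size is reported in bytes.
SHA1_SIZE = hashlib.sha1().digest_size      # 20 bytes = 160 bits
SHA256_SIZE = hashlib.sha256().digest_size  # 32 bytes = 256 bits
```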
  • Q2: Is hashing encryption?

    • No. Hashing is one-way (non-reversible): you cannot recover the original data from the hash.
    • Encryption flow (conceptual):
    • Encryption: data bytes --[key1]→ ciphertext (enciphered data bytes)
    • Decryption: ciphertext --[key2]→ original data bytes (assuming: (a) the ciphertext is unchanged; (b) key2 is correct; (c) the decryption algorithm is correct)
  • Symmetric vs Asymmetric cryptography (Q2):

    • Symmetric cryptography: key1 = key2 (same key for encryption and decryption).
    • Asymmetric cryptography: key1 ≠ key2; the two keys form a matched public/private pair, so data encrypted with one key can only be decrypted with the other.
  • Q2a: How would we benefit from asymmetric crypto?

    • Each participant has a keypair consisting of two keys: a public key (shared) and a private key (secret).
    • Public keys are known to others; private keys are known only to the owner.
    • Scenario: John’s keypair = (pubJ, privJ); Mary’s keypair = (pubM, privM).
    • Public key encryption (confidentiality):
    • Q2a.1: If Mary wants to send a secret message to John, she encrypts with John’s public key to produce ciphertext, and sends it to John. John decrypts with his private key to recover the message.
      • Ciphertext = $E_{\text{pub}_J}(\text{message})$
      • Plaintext = $D_{\text{priv}_J}(\text{ciphertext})$
    • Q2a.2: Would this provide confidentiality? Yes, because John’s private key is known only to John, so only John can decrypt.
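
The Mary-to-John exchange above can be illustrated with a toy, textbook-style RSA sketch. Everything here is an assumption for illustration: the notes do not name an algorithm, the primes are far too small for real use, and there is no padding. The names `encrypt`, `decrypt`, `pub_j`, and `priv_j` are hypothetical.

```python
# Toy RSA keypair for John. INSECURE: tiny primes, no padding; illustration only.
P = 2**61 - 1                        # a known Mersenne prime
Q = 2**89 - 1                        # another known Mersenne prime
N = P * Q                            # modulus, part of both keys
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)

pub_j = (E, N)    # John's public key, known to everyone
priv_j = (D, N)   # John's private key, known only to John


def encrypt(pub, message: bytes) -> int:
    """Compute E_pub(message) as an integer ciphertext."""
    e, n = pub
    m = int.from_bytes(message, "big")
    if m >= n:
        raise ValueError("message too long for this toy modulus")
    return pow(m, e, n)


def decrypt(priv, ciphertext: int) -> bytes:
    """Compute D_priv(ciphertext) and convert back to bytes."""
    d, n = priv
    m = pow(ciphertext, d, n)
    return m.to_bytes((m.bit_length() + 7) // 8, "big")


ciphertext = encrypt(pub_j, b"hi John")   # Mary uses John's public key
plaintext = decrypt(priv_j, ciphertext)   # John uses his private key
print(plaintext)
```

Only the holder of `priv_j` can run the decryption step, which is the source of the confidentiality described in Q2a.2.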
  • Digital signatures (concept introduced in Page 1):

    • A signer uses their private key to transform the data (in practice, a hash of the data) into a digital signature.
    • Receiver uses signer’s public key to verify the signature.
    • Digital signatures provide:
    • Data integrity
    • Origin integrity (authenticity of the signer)
    • Non-repudiation (signer cannot deny signing)
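
A minimal sketch of signing and verifying, under the same toy assumptions as textbook RSA: tiny primes, no padding, and a SHA-256 digest reduced modulo the toy modulus. None of these choices come from the notes, and the function names `sign` and `verify` are hypothetical.

```python
import hashlib

# Toy RSA keypair for the signer. INSECURE: illustration only.
P, Q = 2**61 - 1, 2**89 - 1          # known Mersenne primes
N = P * Q
E = 65537                            # public exponent (verification)
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (signing), Python 3.8+


def digest_as_int(data: bytes) -> int:
    # Reduce the hash so it fits the toy modulus; real schemes use padding.
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(data: bytes) -> int:
    """Signer: apply the private key to the digest of the data."""
    return pow(digest_as_int(data), D, N)


def verify(data: bytes, signature: int) -> bool:
    """Receiver: apply the public key and compare with a fresh digest."""
    return pow(signature, E, N) == digest_as_int(data)


signature = sign(b"pay Mary $10")
print(verify(b"pay Mary $10", signature))    # original data
print(verify(b"pay Mary $1000", signature))  # altered data
```

Because only the signer holds `D`, a signature that verifies ties the data to that signer, which is where origin integrity and non-repudiation come from.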

Hashing for data integrity and vulnerabilities

  • Q3: Main use cases of hashing

    • Primary use: data integrity verification.
    • Scenario:
    • User A saves a file f and stores its hash hash1 = hf(f).
    • User B retrieves the file, computes hash2 = hf(retrieved_f).
    • If hash2 == hash1, data is trusted; if not, data may be corrupted.
    • Underlying assumption: both users A and B use the same hash function hf.
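
The save-and-verify scenario can be sketched as follows. SHA-256 stands in for `hf` as an assumption (the notes do not fix a hash function), and the file contents are made up.

```python
import hashlib


def hf(data: bytes) -> str:
    """The hash function both users agree on (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


# User A saves the file and records its hash.
f = b"quarterly report v1"
hash1 = hf(f)

# User B retrieves the file and recomputes the hash.
retrieved_f = f
hash2 = hf(retrieved_f)
print("trusted" if hash2 == hash1 else "may be corrupted")

# A single changed byte produces a different digest.
corrupted_f = b"quarterly report v2"
print("trusted" if hf(corrupted_f) == hash1 else "may be corrupted")
```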
  • Q4: Vulnerabilities of hashing

    • A Man-in-the-Middle (MiTM) attack can target keyless hashing.
  • Q5: How MiTM attacks against keyless hashing work

    • Attack idea: attacker changes stored data (dataNew) and generates a new hashNew = hf(dataNew).
    • The attacker swaps out the original data and hash with dataNew and hashNew.
    • When a user retrieves the file and verifies the hash, they may accept dataNew as authentic.
  • Q5a: Prerequisites for the MiTM attack to work

    • The attacker must have access to both the stored file and the stored hash.
  • Practical note: to simulate MiTM against keyless hashing, one could write a program that:

    • Reads the stored file and its hash, modifies the file bytes, computes a new hash for the modified data, and overwrites both the file and hash with the new values.
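
A minimal simulation of the program described above. It assumes an in-memory dictionary as a stand-in for the storage system and SHA-256 as `hf`; the names `storage`, `attacker_tamper`, and `user_verify` are hypothetical.

```python
import hashlib


def hf(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


# Stand-in for the system that stores both the file and its keyless hash.
storage = {"file": b"transfer $10 to Mary"}
storage["hash"] = hf(storage["file"])


def attacker_tamper(store: dict) -> None:
    """Attacker with access to both entries replaces data and hash together."""
    data_new = store["file"].replace(b"$10", b"$9999")
    hash_new = hf(data_new)          # no secret is needed to recompute this
    store["file"], store["hash"] = data_new, hash_new


def user_verify(store: dict) -> bool:
    """Ordinary integrity check performed by the retrieving user."""
    return hf(store["file"]) == store["hash"]


attacker_tamper(storage)
accepted = user_verify(storage)
print(accepted, storage["file"])
```

The check passes on the tampered data because nothing in a keyless hash distinguishes the attacker from the legitimate file saver.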
  • Q6: Keyed hashing (MAC) – how it works

    • A shared secret key K is used by both the data creator and the verifier.
    • Saving process (file saver):
    • Data bytes → $\text{MAC}_K(\text{data})$ → MAC code (the MAC)
    • Store both the data bytes and the MAC into the system.
    • Retrieval process (file retriever):
    • Retrieve data bytes and MAC, and obtain the shared key K.
    • Compute $\text{MAC}_K(\text{data})$ → MAC code 2
    • If MAC code 2 equals the stored MAC, data is trusted; otherwise, do not trust.
    • Notation: $\text{MAC}_K(m)$, where $m$ is the message/data.
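
The saving and retrieval steps can be sketched with Python's standard `hmac` module. HMAC-SHA256 is one specific MAC construction chosen here as an assumption; the notes only say "MAC". The helper names `mac` and `verify` are hypothetical.

```python
import hashlib
import hmac
import os


def mac(key: bytes, data: bytes) -> bytes:
    """MAC_K(data), instantiated here as HMAC-SHA256 (an assumption)."""
    return hmac.new(key, data, hashlib.sha256).digest()


def verify(key: bytes, data: bytes, stored: bytes) -> bool:
    """Recompute the MAC and compare in constant time."""
    return hmac.compare_digest(mac(key, data), stored)


K = os.urandom(32)                      # shared secret key

# File saver: store the data together with its MAC.
data = b"transfer $10 to Mary"
stored_mac = mac(K, data)

# File retriever: recompute with the same key K and compare.
print(verify(K, data, stored_mac))          # unchanged data
print(verify(K, data + b"!", stored_mac))   # modified data
```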
  • Q7: How keyed hashing mitigates MiTM against keyless hashing

    • If the attacker does not possess the shared key K, they cannot produce a valid MAC for the modified data, so tampering is detectable.
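
To make this concrete, the sketch below repeats the tampering attack against MAC-protected data. It assumes HMAC-SHA256 as the MAC and an attacker who can rewrite storage but does not know `K`; all names are illustrative.

```python
import hashlib
import hmac
import os


def mac(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest()


K = os.urandom(32)                          # known to saver and verifier only
data_new = b"transfer $9999 to Mary"        # attacker's modified data

# Attempt 1: the attacker substitutes a keyless hash for the MAC.
forged_keyless = hashlib.sha256(data_new).digest()

# Attempt 2: the attacker computes a MAC under a guessed key.
forged_guess = mac(b"attacker-guess", data_new)

# The verifier recomputes MAC_K(data_new) with the real key.
expected = mac(K, data_new)
keyless_accepted = hmac.compare_digest(expected, forged_keyless)
guess_accepted = hmac.compare_digest(expected, forged_guess)
print(keyless_accepted, guess_accepted)
```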
  • Q8: Replay attacks

    • Definition: attacker copies a message and replays it to the original destination (one or more times).
    • Consequence: potential Denial-of-Service (DoS) due to flooding with repeated messages.
  • Q8a: Defenses against replay attacks (anti-replay)

    • Make each message unique in the protocol.
    • Techniques: include a timestamp, a sequence number, or a nonce (a unique random number) in each message.
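
One way a receiver might use a nonce to reject repeats is sketched below. The `Receiver` class is hypothetical, and the sketch assumes the nonce is itself integrity-protected (for example, covered by a MAC); otherwise an attacker could swap in a fresh nonce.

```python
import os


class Receiver:
    """Remembers every nonce it has accepted and rejects repeats."""

    def __init__(self) -> None:
        self.seen_nonces = set()

    def accept(self, nonce: bytes, payload: bytes) -> bool:
        if nonce in self.seen_nonces:
            return False              # same message seen before: replay
        self.seen_nonces.add(nonce)
        return True


receiver = Receiver()
message = (os.urandom(16), b"unlock door")   # (nonce, payload)

first = receiver.accept(*message)      # original delivery
replayed = receiver.accept(*message)   # attacker re-sends the captured copy
print(first, replayed)
```

Timestamps and sequence numbers follow the same idea but let the receiver keep less state than a full set of seen nonces.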
  • Q9: Is keyless hashing replay-prone?

    • Yes; an attacker can replay a copied data+hash pair to the receiver.
  • Q10: Is keyed hashing replay-prone?

    • Yes. A replay attack can be launched against any communication protocol. A MAC helps detect tampering, but a previously valid message, replayed unchanged together with its valid MAC, still passes verification.
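
The sketch below shows a replayed packet passing the MAC check and being caught only by a sequence number that is covered by the MAC. The packet layout (8-byte sequence number, data, 32-byte HMAC-SHA256 tag) and the names `seal` and `Receiver` are assumptions for illustration, not a real protocol format.

```python
import hashlib
import hmac
import os
import struct

TAG_LEN = 32  # HMAC-SHA256 output size in bytes


def seal(key: bytes, seq: int, data: bytes) -> bytes:
    """Build header || data || MAC_K(header || data)."""
    header = struct.pack(">Q", seq)
    tag = hmac.new(key, header + data, hashlib.sha256).digest()
    return header + data + tag


class Receiver:
    def __init__(self, key: bytes) -> None:
        self.key = key
        self.last_seq = -1

    def open(self, packet: bytes):
        header, data, tag = packet[:8], packet[8:-TAG_LEN], packet[-TAG_LEN:]
        expected = hmac.new(self.key, header + data, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, tag):
            return None               # tampered: MAC mismatch
        (seq,) = struct.unpack(">Q", header)
        if seq <= self.last_seq:
            return None               # valid MAC, but an old sequence number
        self.last_seq = seq
        return data


K = os.urandom(32)
receiver = Receiver(K)
packet = seal(K, 0, b"pay $10")
print(receiver.open(packet))   # first delivery is accepted
print(receiver.open(packet))   # identical replay is rejected
```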
  • Real-world context: Secure network protocols commonly employ anti-replay protections within secure communications (e.g., HTTPS over TLS, IPsec).

  • HTTPS/TLS overview and IPsec (two-phase pattern)

    • HTTPS is HTTP carried over TLS, which secures web traffic; IPsec secures traffic at the Internet Protocol (IP) layer.
    • Two-phase pattern:
    • Phase one: Handshake
      • Authentication: the parties authenticate each other (in typical HTTPS, the server proves its identity with a certificate; client authentication is optional).
      • Key exchange: agreement on a shared session key (symmetric key) for efficient encryption/decryption.
      • Note: Symmetric keys are more economical than public-key operations for ongoing data exchange.
    • Phase two: Secure sessions
      • The session key established in Phase one is used to generate a MAC for messages to verify integrity.
      • The receiver uses the MAC to verify message integrity; if verification passes, the message is trusted.
    • Rationale: Public-key cryptography (asymmetric) is computationally expensive; symmetric cryptography is faster for bulk data.
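
The two-phase pattern can be sketched in miniature. This is a toy under loud assumptions: the handshake uses an unauthenticated Diffie-Hellman exchange over a small prime (chosen here for illustration; real TLS and IPsec handshakes differ substantially and include certificate-based authentication), and the session MAC is HMAC-SHA256. All names are hypothetical.

```python
import hashlib
import hmac
import secrets

# Phase one (toy handshake): agree on a shared session key K_s.
PRIME = 2**127 - 1        # a Mersenne prime; far too small for real use
GENERATOR = 3

client_secret = secrets.randbelow(PRIME - 2) + 1
server_secret = secrets.randbelow(PRIME - 2) + 1
client_public = pow(GENERATOR, client_secret, PRIME)   # sent to the server
server_public = pow(GENERATOR, server_secret, PRIME)   # sent to the client


def derive_session_key(their_public: int, my_secret: int) -> bytes:
    """Each side combines the other's public value with its own secret."""
    shared = pow(their_public, my_secret, PRIME)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()


k_s_client = derive_session_key(server_public, client_secret)
k_s_server = derive_session_key(client_public, server_secret)


# Phase two (secure session): cheap symmetric MACs under K_s.
def mac(key: bytes, message: bytes) -> bytes:
    return hmac.new(key, message, hashlib.sha256).digest()


message = b"GET /account"
tag = mac(k_s_client, message)                         # sender side
verified = hmac.compare_digest(mac(k_s_server, message), tag)
print(k_s_client == k_s_server, verified)
```

The expensive modular exponentiations happen once per session, while every subsequent message needs only a fast symmetric MAC, which mirrors the rationale given above.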
  • Final note: In-class assignment on encryption/decryption

    • Encourages practice with constructing and verifying encryption/decryption workflows using the concepts above.
  • Summary of key ideas

    • Hashing vs encryption: hashing produces one-way, fixed-length digests; encryption is reversible with the correct key.
    • Asymmetric crypto enables confidentiality and digital signatures; public keys enable others to encrypt and verify signatures, while private keys enable decryption and signing.
    • Hashing provides data integrity; MACs (keyed hashing) add authenticity and integrity with a shared secret key, mitigating certain tampering attacks.
    • Replay attacks exploit repeated transmission of a captured message; anti-replay measures (timestamps, nonces, sequence numbers) let the receiver detect and reject repeats.
    • TLS/HTTPS and IPsec use a two-phase approach: handshake to authenticate and derive a session key, then secure sessions with MACs and symmetric encryption.
  • Important numerical/formula references from the notes

    • Hash output sizes:
    • $20 \text{ bytes}$ ($160 \text{ bits}$) for $\text{SHA-1}$
    • $32 \text{ bytes}$ ($256 \text{ bits}$) for $\text{SHA-256}$
    • MAC notation: $\text{MAC}_K(m)$
    • Phase two session key: $K_s$ (shared symmetric session key)
    • Example verification: if $\text{MAC}_K(\text{data}) = \text{MAC}_K(\text{data})'$, then data is trusted; otherwise it is rejected.
  • Practical implications and connections

    • Real-world protocols rely on the combination of hashing/MAC and public-key operations to provide confidentiality, integrity, and authenticity.
    • Anti-replay is a fundamental defense in secure communications, ensuring that old messages cannot be replayed to produce undesired effects.
    • The choice between MACs and public-key signatures depends on the scenario: MACs for efficiency with shared keys; signatures for non-repudiation without shared secrets.