— CH. 1 · DEFINING CORE PROPERTIES —

Cryptographic hash function

  • A cryptographic hash function takes an input string of any length and produces a fixed-length output, called a digest. To be considered secure, it must withstand all known types of cryptanalytic attack. The first requirement is pre-image resistance: given a hash value, it should be computationally infeasible to find any input that produces it. The second is second pre-image resistance: given one message, it should be infeasible to find a different message with the same hash. The third is collision resistance: it should be infeasible to find any two distinct messages with identical hashes. Together these properties ensure that a malicious adversary cannot replace or modify data without changing its digest, so if two strings share the same digest, one can be very confident that they are identical. Collision resistance implies second pre-image resistance but does not, on its own, guarantee pre-image resistance.

  • Ronald Rivest designed MD5 in 1991 to replace an earlier function, MD4. It was specified in RFC 1321 in 1992 and produces a 128-bit digest. SHA-0 emerged from the U.S. Government's Capstone project and was published as FIPS PUB 180 in 1993, but the NSA withdrew it shortly afterwards. The revised version, SHA-1, arrived in 1995 with a 160-bit output. NIST released SHA-2 in 2001; it comprises algorithms such as SHA-256 and SHA-512, which are built on the Merkle-Damgård structure. On the 5th of August 2015, NIST published SHA-3, which is based on the Keccak algorithm developed by Guido Bertoni, Joan Daemen, Michael Peeters, and Gilles Van Assche. BLAKE2 had been announced earlier, on the 21st of December 2012, by Jean-Philippe Aumasson, Samuel Neves, Zooko Wilcox-O'Hearn, and Christian Winnerlein. Jack O'Connor and his co-authors announced BLAKE3 on the 9th of January 2020; it reduces the number of rounds from ten to seven for speed and uses a tree structure for higher parallelism.

  • Collisions against MD5 can now be calculated within seconds, which makes it unsuitable for most cryptographic uses. In August 2004, Xiaoyun Wang, Dengguo Feng, Xuejia Lai, and Hongbo Yu demonstrated collisions in several then-popular functions, including MD5 and the original RIPEMD. Joux et al. produced a collision for full SHA-0 on the 12th of August 2004; the search took about 80,000 CPU hours on a supercomputer with 256 Itanium 2 processors. These results called into question the security of stronger algorithms derived from the weakened functions, in particular SHA-1 (a strengthened SHA-0) and RIPEMD-160 (a strengthened RIPEMD). An attack reported in February 2005 could find SHA-1 collisions in roughly 2^69 hash operations instead of the 2^80 expected for a 160-bit hash function. In 2008, a practical attack broke MD5 as used in certificates for Transport Layer Security. In February 2017, Google announced a practical SHA-1 collision produced with the SHAttered attack.

  • Verifying message integrity means comparing the hash digest computed before transmission with the one computed afterwards; any difference shows that the message changed. Websites often publish MD5 or SHA-1 digests so that users can verify downloaded files, including files retrieved through file-sharing systems. Digital signature schemes require a cryptographic hash to be calculated over the message, so that the signature is computed on the comparatively small digest. Password verification stores only a hash digest of each password rather than the cleartext, which limits the damage if the password file is compromised. Because ordinary cryptographic hashes are fast to compute, systems use key derivation functions such as PBKDF2, scrypt, or Argon2 to slow brute-force attacks on stored password hashes. Proof-of-work systems use partial hash inversions to deter denial-of-service attacks and spam on networks. Bitcoin mining uses the same mechanism: miners unlock rewards by finding a message whose hash begins with a required number of zero bits. Source code management tools such as Git and Mercurial use SHA-1 digests to uniquely identify file content and directory trees.

  • Most classical hash functions, including SHA-1 and MD5, use the Merkle-Damgård construction to process arbitrary-length inputs. The input is broken into equal-sized blocks, which are processed in sequence by a one-way compression function. The last block must be unambiguously length-padded; this is crucial to the security of the construction. A straightforward implementation in which the output size equals the internal state size is called a narrow-pipe design, and it has inherent flaws such as length extension and multicollisions. Most modern functions therefore use wide-pipe constructions with a larger internal state, ranging from tweaks of Merkle-Damgård to new constructions such as the sponge. Keccak, selected as SHA-3, is built on a cryptographic sponge rather than on the block-cipher-like components found in earlier algorithms. BLAKE3 operates internally as a Merkle tree, which supports higher degrees of parallelism than its predecessor BLAKE2. Hash functions based on block ciphers use methods that resemble encryption modes of operation, but the underlying ciphers are built for hashing: they use large keys and blocks and are designed for resistance to related-key attacks.
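
The Merkle-Damgård steps described above (length padding, equal-sized blocks, a chained compression function) can be sketched in a few lines. This is a toy illustration only, not any standardized algorithm: the names `compress`, `pad`, and `md_hash`, the 64-byte block size, the all-zero initial value, and the use of SHA-256 as a stand-in compression function are all assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 64          # bytes per message block (assumed for this toy)
STATE_SIZE = 32          # bytes of chaining state (equals the output size)
IV = bytes(STATE_SIZE)   # all-zero initial value, chosen only for illustration


def compress(state: bytes, block: bytes) -> bytes:
    """Toy one-way compression function; SHA-256 stands in for a real one."""
    return hashlib.sha256(state + block).digest()


def pad(message: bytes) -> bytes:
    """Append 0x80, zero bytes, then the message bit length as 8 big-endian bytes."""
    bit_length = len(message) * 8
    padded = message + b"\x80"
    # Zero-fill so that the total length, including the 8-byte length field,
    # is a multiple of BLOCK_SIZE.
    padded += bytes(-(len(padded) + 8) % BLOCK_SIZE)
    return padded + bit_length.to_bytes(8, "big")


def md_hash(message: bytes) -> bytes:
    """Chain the compression function over the padded message blocks."""
    state = IV
    padded = pad(message)
    for offset in range(0, len(padded), BLOCK_SIZE):
        state = compress(state, padded[offset:offset + BLOCK_SIZE])
    return state


print(md_hash(b"abc").hex())
```

Because the final chaining state is returned directly as the digest, this sketch is a narrow-pipe design in the sense described above, so it would share the length-extension weakness of such designs.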

Common questions

What are the three main security requirements of a cryptographic hash function?

A cryptographic hash function must provide pre-image resistance, second pre-image resistance, and collision resistance. These properties ensure that malicious adversaries cannot replace or modify data without changing its digest.
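
As a small illustration of the fixed-length output and of a modification changing the digest, the following sketch uses Python's standard `hashlib` module with SHA-256. The helper name `sha256_hex` and the sample messages are made up for the example.

```python
import hashlib


def sha256_hex(message: bytes) -> str:
    """Return the SHA-256 digest of message as a hexadecimal string."""
    return hashlib.sha256(message).hexdigest()


short = sha256_hex(b"a")
long_ = sha256_hex(b"a" * 1_000_000)
original = sha256_hex(b"transfer 100 to alice")
tampered = sha256_hex(b"transfer 900 to alice")

# The digest has the same length (64 hex characters, 256 bits)
# whether the input is one byte or one million bytes.
print(len(short), len(long_))

# Changing a single character of the message yields a different digest.
print(original != tampered)
```

The sketch only demonstrates the observable behaviour; it does not, of course, prove any of the three resistance properties.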

When was MD5 designed by Ronald Rivest and what output size does it produce?

Ronald Rivest designed MD5 in 1991 to replace an earlier function called MD4. It produced a 128-bit digest and was specified as RFC 1321 in 1992.

Who developed SHA-3 and when did NIST publish it?

NIST published SHA-3 on the 5th of August 2015. It is based on the Keccak algorithm developed by Guido Bertoni, Joan Daemen, Michael Peeters, and Gilles Van Assche.

Why is MD5 considered unsuitable for most cryptographic uses after 2004?

Collisions against MD5 can be calculated within seconds, which makes it unsuitable for most cryptographic uses. In August 2004, researchers found collisions in several then-popular functions, including MD5 and the original RIPEMD.
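
Digest length also limits collision resistance in general: a generic birthday search on an n-bit digest needs roughly 2^(n/2) evaluations, which is about 2^80 for a 160-bit digest. The sketch below shows this on a deliberately weakened toy hash, SHA-256 truncated to 24 bits, where a collision typically appears after a few thousand attempts. It is not the cryptanalytic attack on MD5, which exploits structural weaknesses; the names `truncated_digest` and `find_collision` are hypothetical.

```python
import hashlib
from itertools import count


def truncated_digest(message: bytes, n_bytes: int = 3) -> bytes:
    """First n_bytes of SHA-256: a deliberately weakened toy hash."""
    return hashlib.sha256(message).digest()[:n_bytes]


def find_collision(n_bytes: int = 3) -> tuple[bytes, bytes, int]:
    """Hash distinct messages until two share a truncated digest.

    Returns the two colliding messages and the number of hashes computed.
    """
    seen: dict[bytes, bytes] = {}
    for i in count():
        message = str(i).encode()
        digest = truncated_digest(message, n_bytes)
        if digest in seen:
            return seen[digest], message, i + 1
        seen[digest] = message


first, second, attempts = find_collision()
print(first, second, attempts)
```

Each extra byte of digest multiplies the expected work by about 16, which is why the same search is hopeless against a full-length modern digest.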

How do Bitcoin mining systems use hash functions to unlock rewards?

Proof-of-work systems use partial hash inversions to deter denial-of-service attacks and spam on networks. Bitcoin mining applies the same mechanism: miners unlock rewards by finding a message whose hash begins with a required number of zero bits.
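
A partial hash inversion of this kind can be sketched as a search for a nonce. This is a simplified illustration, not Bitcoin's actual procedure (which hashes a block header with double SHA-256 and compares the result against a numeric target); the function names, the 8-byte nonce encoding, and the 16-bit difficulty are assumptions chosen so the example runs quickly.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def mine(data: bytes, difficulty_bits: int) -> int:
    """Return the first nonce whose SHA-256(data + nonce) has enough leading zero bits."""
    for nonce in count():
        digest = hashlib.sha256(data + nonce.to_bytes(8, "big")).digest()
        if leading_zero_bits(digest) >= difficulty_bits:
            return nonce


nonce = mine(b"example block", 16)
proof = hashlib.sha256(b"example block" + nonce.to_bytes(8, "big")).hexdigest()
print(nonce, proof)
```

Finding the nonce takes about 2^16 hash evaluations on average at this difficulty, while checking it takes a single evaluation; that asymmetry is what makes the scheme usable as proof of work.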