What is MD5?
MD5 (Message Digest Algorithm 5) is a hash function designed by cryptographer Ronald Rivest in 1991. It takes any input — a word, a file, an entire database — and produces a fixed 128-bit digest displayed as 32 hexadecimal characters.
Security status: Broken. Practical collision attacks against MD5 have existed since 2004. Do not use MD5 for digital signatures, certificates, password storage, or any security-sensitive purpose.
MD5 at a Glance
| Property | Value |
|---|---|
| Digest size | 128 bits |
| Hex length | 32 chars |
| Block size | 512 bits |
| Status | Broken |
The Avalanche Effect
Change a single character and the entire digest changes completely — this is the avalanche effect, a hallmark of good hash design.
| Input | MD5 Digest (32 hex chars) |
|---|---|
| hello | 5d41402abc4b2a76b9719d911017c592 |
| hello world | 5eb63bbbe01eeed093cb22bb8f5acdc3 |
| Hello | 8b1a9953c4611296a827abf8c47804d7 |
Notice: changing h → H (one character) produces a completely unrelated 32-character digest. No relationship between the inputs is visible in the outputs.
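The table above can be reproduced with a few lines of code. This is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_hex` and `bit_difference` are made up for illustration. It also counts how many of the 128 output bits differ between two digests, which for a well-mixed hash should be somewhere around half.

```python
import hashlib

def md5_hex(text: str) -> str:
    """Return the MD5 digest of a UTF-8 string as 32 hex characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def bit_difference(hex_a: str, hex_b: str) -> int:
    """Count the bits that differ between two equal-length hex digests."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")

# Reproduce the rows of the table.
for word in ("hello", "hello world", "Hello"):
    print(f"{word!r:15} {md5_hex(word)}")

# Compare the digests of two inputs that differ in one character.
changed = bit_difference(md5_hex("hello"), md5_hex("Hello"))
print(f"hello vs Hello: {changed} of 128 bits differ")
```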
How MD5 Works
MD5 processes input in 512-bit blocks, running each block through four rounds of 16 operations.
1. Padding
A single 1 bit is appended, then 0 bits until the length ≡ 448 (mod 512), then the original message length in bits as a 64-bit little-endian integer. The padded input is always a multiple of 512 bits.
2. Init Buffers
Four 32-bit variables A, B, C, D initialised to fixed constants.
3. Process Blocks
Each 512-bit block runs 64 operations built from AND, OR, XOR, NOT, addition modulo 2^32, and left bit rotations.
4. Output Digest
A, B, C, D concatenated (each in little-endian byte order) → 128-bit output displayed as 32 hex characters.
The full specification is defined in RFC 1321.
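The four steps can be sketched in plain Python. This is an unoptimised illustration for learning only, loosely following the structure of the RFC 1321 reference description; the names `md5_pad`, `md5_sketch`, `SHIFTS`, and `SINES` are chosen here for illustration and do not come from the specification. In real code, use a library implementation such as `hashlib`, which appears below only as a cross-check.

```python
import hashlib
import math
import struct

MASK = 0xFFFFFFFF  # keep every intermediate value within 32 bits

# Left-rotation amounts: four rounds of 16 operations each.
SHIFTS = ([7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4
          + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4)
# Additive constants: floor(2**32 * abs(sin(i + 1))) for i = 0..63.
SINES = [int(abs(math.sin(i + 1)) * 2**32) & MASK for i in range(64)]

def rotl(x: int, n: int) -> int:
    """Rotate a 32-bit value left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK

def md5_pad(message: bytes) -> bytes:
    """Step 1: append a 1 bit, zeros, then the 64-bit little-endian bit length."""
    bit_length = (len(message) * 8) & 0xFFFFFFFFFFFFFFFF
    zeros = (56 - (len(message) + 1) % 64) % 64
    return message + b"\x80" + b"\x00" * zeros + struct.pack("<Q", bit_length)

def md5_sketch(message: bytes) -> str:
    # Step 2: initialise the four 32-bit buffers to their fixed constants.
    a0, b0, c0, d0 = 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476
    padded = md5_pad(message)
    # Step 3: process each 512-bit (64-byte) block as 16 little-endian words.
    for offset in range(0, len(padded), 64):
        words = struct.unpack("<16I", padded[offset:offset + 64])
        a, b, c, d = a0, b0, c0, d0
        for i in range(64):
            if i < 16:
                f, g = (b & c) | (~b & d), i
            elif i < 32:
                f, g = (d & b) | (~d & c), (5 * i + 1) % 16
            elif i < 48:
                f, g = b ^ c ^ d, (3 * i + 5) % 16
            else:
                f, g = c ^ (b | ~d), (7 * i) % 16
            f = (f + a + SINES[i] + words[g]) & MASK
            a, d, c, b = d, c, b, (b + rotl(f, SHIFTS[i])) & MASK
        a0, b0 = (a0 + a) & MASK, (b0 + b) & MASK
        c0, d0 = (c0 + c) & MASK, (d0 + d) & MASK
    # Step 4: concatenate A, B, C, D in little-endian order and show as hex.
    return struct.pack("<4I", a0, b0, c0, d0).hex()

# Cross-check the sketch against the library implementation.
print(md5_sketch(b"hello"))
print(md5_sketch(b"hello") == hashlib.md5(b"hello").hexdigest())
```

The padding helper shows why the length target is 448 bits: the final 64 bits of the last block are reserved for the message length, so a 56-byte message needs a whole extra block.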
MD5 Output Format
| Property | Value |
|---|---|
| Digest size | 128 bits |
| Hex length | 32 chars |
| Character set | 0–9, a–f |
| Deterministic | Yes |
Why MD5 is Broken
Collision Attacks
A collision is when two different inputs produce the same digest. This should be computationally infeasible — for MD5 it became trivially achievable.
- 1991
MD5 published by Ronald Rivest
Designed as a secure replacement for MD4.
- 1996
Compression-function collision found
Dobbertin finds a collision in MD5's compression function, building on pseudo-collisions reported by den Boer & Bosselaers in 1993.
- 2004
Practical full collision attack published
Wang et al. generate full MD5 collisions in about an hour on an IBM p690 cluster; within a few years, refinements cut this to under a minute on an ordinary PC.
- 2008
Rogue SSL certificates forged using MD5
Researchers use an MD5 collision to create a rogue CA certificate that all major browsers would trust.
- 2012
MD5 fully banned for TLS certificates
CA/Browser Forum prohibits MD5 in all certificate signatures.
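A collision attack lets an attacker craft two files with the same MD5 hash: one harmless, one malicious. Once the harmless file has been reviewed and trusted, the malicious one can be swapped in. The victim verifies the hash, it matches, and the victim ends up running malware.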
Rainbow Table Attacks on Passwords
Because MD5 is extremely fast, a modern GPU can compute billions of MD5 hashes per second. Precomputed lookup tables (including rainbow tables) covering billions of common passwords are freely available online. After a breach, most weak or common passwords in a database of plain, unsalted MD5 hashes can be recovered within minutes.
MD5 was designed for speed, which is exactly the wrong property for password hashing. Password algorithms need to be deliberately slow, salted, and ideally memory-hard. Use Argon2id, bcrypt, or scrypt instead.
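As a rough illustration of the slow, salted approach, here is a minimal sketch using scrypt from Python's standard library (`hashlib.scrypt`, which requires a Python build linked against OpenSSL 1.1 or later). The function names and the cost parameters in `SCRYPT_PARAMS` are assumptions chosen for the example, not a tuned recommendation; a production system would more likely use a maintained password-hashing library and store the parameters alongside each hash.

```python
import hashlib
import hmac
import os

# Illustrative cost parameters (assumed for this example, not a recommendation).
SCRYPT_PARAMS = dict(n=2**14, r=8, p=1, dklen=32)

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) for a new password, using a fresh random salt."""
    salt = os.urandom(16)
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, digest

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(candidate, expected)

salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))
print(verify_password("password123", salt, stored))
```

The per-password random salt means identical passwords produce different stored digests, which defeats precomputed lookup tables; the cost parameters make each guess far more expensive than a single MD5 call.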
MD5 vs Modern Alternatives
| Use Case | MD5 | Better Choice |
|---|---|---|
| Password storage | ❌ Never | ✅ Argon2id or bcrypt |
| File integrity check | ⚠️ OK (non-security) | ✅ SHA-256 |
| Digital signatures | ❌ Never | ✅ SHA-256 or SHA-512 |
| SSL/TLS certificates | ❌ Banned since 2012 | ✅ SHA-256 |
| Cache keys / dedup IDs | ✅ Fine | ✅ MD5 is acceptable here |
| Database sharding | ✅ Fine | ✅ MD5 is acceptable here |
When MD5 Is Still Acceptable
MD5 isn’t completely useless — it simply cannot be trusted for security. It remains appropriate where speed matters and collision-resistance doesn’t.
File Deduplication
Quickly identify identical files in storage systems that hold trusted data. Speed matters here; resistance to deliberately crafted collisions doesn't.
Cache Keys
Hash a URL or query into a short cache key. Accidental collision risk is negligible.
Legacy Systems
Existing pipelines where re-engineering the hash infrastructure is impractical or costly.
DB Sharding
Distribute rows across partitions via hash(id) % num_shards — fast and uniform.
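The cache-key and sharding uses above might look like the following sketch. The function names and the `page:` prefix are hypothetical. One reason to reach for a fixed hash here is that Python's built-in `hash()` is randomised per process for strings, so it cannot be relied on to give the same shard across restarts or machines.

```python
import hashlib

def stable_hash(key: str) -> int:
    """Map a string to a 64-bit integer that is the same in every process."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def shard_for(key: str, num_shards: int) -> int:
    """Pick a partition as hash(id) % num_shards."""
    return stable_hash(key) % num_shards

def cache_key(url: str) -> str:
    """Turn an arbitrarily long URL into a short fixed-length cache key."""
    return "page:" + hashlib.md5(url.encode("utf-8")).hexdigest()

for user_id in ("user-1", "user-2", "user-3"):
    print(user_id, "-> shard", shard_for(user_id, 8))
print(cache_key("https://example.com/search?q=md5&page=2"))
```

Note that plain modulo sharding remaps most keys whenever `num_shards` changes; that is a property of the modulo scheme, not of MD5.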
Do and Don’t
Do
- Use SHA-256 for file integrity verification and checksums
- Use Argon2id or bcrypt for all password hashing
- Use MD5 freely for non-security purposes: cache keys, dedup IDs, sharding
- Verify downloaded files against SHA-256 checksums published by the author
Don’t
- Never store passwords as plain MD5 hashes
- Never use MD5 for digital signatures or code signing
- Never use MD5 for SSL/TLS certificates
- Never assume MD5 equality proves that files from an untrusted source are identical, since collisions are practical
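Checking a download against a published SHA-256 checksum can be sketched as follows. The function names are illustrative, and the file path and published value in the usage lines are placeholders. The file is read in chunks so that large downloads do not need to fit in memory.

```python
import hashlib
import hmac

def sha256_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 digest of a file as 64 hex characters."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()

def matches_published(path: str, published_hex: str) -> bool:
    """Compare a file's digest with a checksum copied from the author's site."""
    return hmac.compare_digest(sha256_file(path), published_hex.strip().lower())

# Usage (placeholder path and checksum):
# ok = matches_published("download.iso", "<checksum published by the author>")
```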
Computing MD5 Hashes
echo -n "hello" | md5sum
# 5d41402abc4b2a76b9719d911017c592
md5sum filename.txt # file checksum
Or use the Hash Generator for instant browser-based MD5 computation: no installation needed, and nothing leaves your device.
Key Takeaways
- MD5 produces a 128-bit digest, shown as 32 hexadecimal characters, from any input
- It is one-way — digests cannot be mathematically reversed
- It is cryptographically broken: practical collision attacks have existed since 2004
- It is dangerously fast for password storage — billions of guesses per second on modern hardware
- Use SHA-256 for file integrity; Argon2id or bcrypt for passwords
- MD5 is still fine for non-security uses: cache keys, deduplication, sharding