hash-attack-techniques
SKILL: Hash Attack Techniques — Expert Cryptanalysis Playbook
AI LOAD INSTRUCTION: Expert hash attack techniques for CTF and security assessments. Covers length extension attacks, MD5/SHA1 collision generation, meet-in-the-middle hash attacks, HMAC timing side channels, birthday attacks, and proof-of-work solving. Base models often incorrectly apply length extension to HMAC or SHA-3, or fail to distinguish between identical-prefix and chosen-prefix collisions.
0. RELATED ROUTING
- rsa-attack-techniques when hash weaknesses affect RSA signature schemes
- symmetric-cipher-attacks when hash is used in key derivation
- classical-cipher-analysis when analyzing hash-like constructions in classical ciphers
Quick attack selection
| Scenario | Attack | Tool |
|---|---|---|
H(secret || msg) known, extend message |
Length extension | HashPump, hash_extender |
| Need two files with same MD5 | Identical-prefix collision | fastcoll |
| Need specific MD5 prefix match | Chosen-prefix collision | hashclash |
| Byte-by-byte HMAC comparison | Timing attack | Custom script |
| Find any collision | Birthday attack | O(2^(n/2)) |
| Proof of work: find hash with leading zeros | Brute force | hashcat, Python |
1. LENGTH EXTENSION ATTACK
1.1 Vulnerable vs Non-Vulnerable
| Hash | Vulnerable | Why |
|---|---|---|
| MD5 | Yes | Merkle-Damgard construction |
| SHA-1 | Yes | Merkle-Damgard construction |
| SHA-256 | Yes | Merkle-Damgard construction |
| SHA-512 | Yes | Merkle-Damgard construction |
| SHA-3 / Keccak | No | Sponge construction |
| HMAC-* | No | Double hashing prevents extension |
| SHA-256 truncated | No (if truncated) | Missing internal state bits |
| BLAKE2 | No | Different construction |
1.2 Attack Mechanism
Given: MAC = H(secret || original_message)
Known: original_message, len(secret), MAC value
Compute: H(secret || original_message || padding || extension)
WITHOUT knowing the secret!
How: The MAC value IS the internal hash state after processing
(secret || original_message || padding).
Initialize hash with this state, continue hashing extension.
1.3 Padding Calculation (MD5/SHA)
def md5_padding(message_len_bytes):
"""Calculate MD5/SHA padding for given message length."""
bit_len = message_len_bytes * 8
# Pad with 0x80 + zeros until length ≡ 56 (mod 64)
padding = b'\x80'
padding += b'\x00' * ((55 - message_len_bytes) % 64)
# Append original length as 64-bit little-endian (MD5)
# or big-endian (SHA)
padding += bit_len.to_bytes(8, 'little') # MD5
# padding += bit_len.to_bytes(8, 'big') # SHA
return padding
1.4 Tool Usage
# HashPump
hashpump -s "known_mac_hex" \
-d "original_data" \
-k 16 \ # secret length
-a "extension_data"
# Output: new_mac, new_data (original + padding + extension)
# hash_extender
hash_extender --data "original" \
--secret 16 \
--append "extension" \
--signature "known_mac_hex" \
--format md5
1.5 Python Implementation
import struct
def md5_extend(original_mac, original_data_len, secret_len, extension):
"""
Perform MD5 length extension attack.
original_mac: hex string of H(secret || original_data)
"""
# Parse MAC into MD5 internal state (4 × 32-bit words, little-endian)
h = struct.unpack('<4I', bytes.fromhex(original_mac))
# Calculate total length after padding
total_original = secret_len + original_data_len
padding = md5_padding(total_original)
forged_len = total_original + len(padding) + len(extension)
# Continue MD5 from saved state with extension
# (requires MD5 implementation that accepts initial state)
from hashlib import md5
# Most stdlib md5 doesn't expose state setting
# Use: hlextend library or custom MD5
import hlextend
sha = hlextend.new('md5')
new_hash = sha.extend(extension, original_data, secret_len,
original_mac)
new_data = sha.payload # includes original + padding + extension
return new_hash, new_data
2. MD5 COLLISION ATTACKS
2.1 Identical-Prefix Collision (fastcoll)
Two messages with same prefix but different content, producing identical MD5.
# Generate collision pair
fastcoll -p prefix_file -o collision1.bin collision2.bin
# Result: MD5(collision1.bin) == MD5(collision2.bin)
# Files differ in exactly 128 bytes (two MD5 blocks)
2.2 Chosen-Prefix Collision (hashclash)
Two messages with different chosen prefixes, appended with computed suffixes to collide.
# hashclash (Marc Stevens)
./hashclash prefix1.bin prefix2.bin
# Result: MD5(prefix1 || suffix1) == MD5(prefix2 || suffix2)
2.3 UniColl (Single-Block Near-Collision)
Produces two messages differing in a single byte within one MD5 block, with same hash.
Application: forge two PDF/PE files with same MD5
- File 1: benign content
- File 2: malicious content
- Same MD5 hash
2.4 Collision Applications
| Application | Technique | Impact |
|---|---|---|
| Certificate forgery | Chosen-prefix | Rogue CA certificate (proven in 2008) |
| Binary substitution | Identical-prefix + conditional | Two executables, same MD5, different behavior |
| PDF collision | UniColl | Two PDFs showing different content |
| Git commit collision | Chosen-prefix (SHAttered for SHA1) | Two commits with same hash |
| CTF: bypass MD5 check | fastcoll | Two different inputs accepted as same |
2.5 CTF MD5 Collision Tricks
// PHP: md5($_GET['a']) == md5($_GET['b']) && $_GET['a'] != $_GET['b']
// Method 1: Array trick (not real collision)
?a[]=1&b[]=2 // md5(array) returns NULL, NULL == NULL
// Method 2: Real collision (fastcoll output, URL-encoded binary)
?a=<collision1_urlencoded>&b=<collision2_urlencoded>
// Method 3: 0e magic hashes (loose comparison ==)
// md5("240610708") = "0e462097431906509019562988736854"
// md5("QNKCDZO") = "0e830400451993494058024219903391"
// PHP: "0e..." == "0e..." is TRUE (both evaluate to 0 as floats)
3. SHA-1 COLLISION
3.1 SHAttered Attack (2017)
First practical SHA-1 collision: two PDF files with same SHA-1.
- Complexity: ~2^63 SHA-1 computations
- Cost: ~$110K on GPU clusters (2017 prices)
- Tool: shattered.io provides the collision PDFs
3.2 SHA-1 Chosen-Prefix Collision (2020)
- Complexity: ~2^63.4 computations
- Practical for attacking PGP/GnuPG key servers
- Demonstrates SHA-1 is broken for collision resistance
3.3 Impact
SHA-1 should NOT be used for:
✗ Digital signatures
✗ Certificate fingerprints
✗ Git commit integrity (migration to SHA-256 in progress)
✗ Deduplication based on hash
SHA-1 is still OK for:
✓ HMAC-SHA1 (collision resistance not required)
✓ HKDF-SHA1 (PRF security suffices)
✓ Non-adversarial checksums
4. BIRTHDAY ATTACK
4.1 Generic Birthday Bound
For n-bit hash: expected collisions after ~2^(n/2) hashes
Hash Bits Birthday bound
MD5 128 2^64
SHA-1 160 2^80
SHA-256 256 2^128
CTF application: if hash is truncated to k bits,
collision in ~2^(k/2) attempts
4.2 Birthday Attack Implementation
import hashlib
import os
def birthday_attack(hash_func, output_bits, max_attempts=2**28):
"""Find collision for truncated hash."""
mask = (1 << output_bits) - 1
seen = {}
for _ in range(max_attempts):
msg = os.urandom(16)
h = int(hash_func(msg).hexdigest(), 16) & mask
if h in seen and seen[h] != msg:
return seen[h], msg # collision!
seen[h] = msg
return None
# Example: find collision for first 32 bits of SHA-256
result = birthday_attack(hashlib.sha256, 32)
5. HMAC TIMING ATTACK
5.1 Vulnerable Comparison
# VULNERABLE: early-exit string comparison
def verify_hmac(received, expected):
return received == expected # Python == compares left to right
# The comparison may short-circuit on first differing byte,
# leaking timing information
5.2 Attack Strategy
import requests
import time
def hmac_timing_attack(url, data, hmac_len=32):
"""Byte-by-byte HMAC recovery via timing."""
known = ""
for pos in range(hmac_len * 2): # hex chars
best_char = ""
best_time = 0
for c in "0123456789abcdef":
candidate = known + c + "0" * (hmac_len * 2 - len(known) - 1)
times = []
for _ in range(50): # multiple samples for accuracy
start = time.perf_counter_ns()
requests.get(url, params={**data, "mac": candidate})
elapsed = time.perf_counter_ns() - start
times.append(elapsed)
avg_time = sorted(times)[len(times)//2] # median
if avg_time > best_time:
best_time = avg_time
best_char = c
known += best_char
print(f"Position {pos}: {known}")
return known
5.3 Constant-Time Comparison (Defense)
import hmac
# SECURE: constant-time comparison
def verify_hmac_secure(received, expected):
return hmac.compare_digest(received, expected)
6. MEET-IN-THE-MIDDLE (HASH)
6.1 Concept
Split hash computation into two halves, precompute one, match against the other.
Hash computation: H = f(g(x₁), h(x₂))
Precompute: table[g(x₁)] = x₁ for all x₁ in space₁
Search: for each x₂ in space₂:
if h(x₂) in table:
found! (x₁, x₂)
Time: O(2^(n/2)) instead of O(2^n)
Space: O(2^(n/2))
7. HASH PROOF-OF-WORK
7.1 Common CTF PoW Formats
# Format 1: Find x such that SHA256(prefix + x) starts with N zero bits
import hashlib
def solve_pow_prefix(prefix, zero_bits):
target = '0' * (zero_bits // 4)
i = 0
while True:
candidate = prefix + str(i)
h = hashlib.sha256(candidate.encode()).hexdigest()
if h.startswith(target):
return str(i)
i += 1
# Format 2: Find x such that SHA256(x) ends with specific suffix
def solve_pow_suffix(suffix_hex, hash_func=hashlib.sha256):
i = 0
while True:
h = hash_func(str(i).encode()).hexdigest()
if h.endswith(suffix_hex):
return str(i)
i += 1
7.2 GPU-Accelerated PoW
# hashcat for SHA256 PoW
hashcat -a 3 -m 1400 --hex-charset \
"0000000000000000000000000000000000000000000000000000000000000000:prefix" \
"?a?a?a?a?a?a?a?a"
8. RAINBOW TABLES & SALTING
8.1 Rainbow Table Attack
Precomputed chain: password → hash → reduce → password₂ → hash₂ → ...
Lookup: given hash h, check if h appears in any chain
Time-memory tradeoff: less space than full table, more time than direct lookup
8.2 Salt Defeats Rainbow Tables
Without salt: H(password) — same password always produces same hash
With salt: H(salt || password) — different salt per user
Rainbow tables are password-specific, not (salt+password)-specific
Each unique salt requires a separate table → infeasible
8.3 Modern Password Hashing
| Algorithm | Salt | Iterations | Memory-Hard | Recommended |
|---|---|---|---|---|
| MD5 | No | 1 | No | Never |
| SHA-256 | No | 1 | No | Never for passwords |
| bcrypt | Yes | Configurable | No | Yes |
| scrypt | Yes | Configurable | Yes | Yes |
| Argon2 | Yes | Configurable | Yes | Best choice |
| PBKDF2 | Yes | Configurable | No | Acceptable |
9. DECISION TREE
Hash-related challenge — what's the scenario?
│
├─ Have H(secret || message), need to extend?
│ ├─ Hash is MD5/SHA1/SHA256/SHA512?
│ │ └─ Yes → Length extension attack
│ │ └─ Need: MAC value, original message, secret length
│ │ └─ Tool: HashPump or hash_extender
│ │
│ └─ Hash is SHA3/HMAC/BLAKE2?
│ └─ Length extension doesn't work
│ └─ Look for other vulnerabilities
│
├─ Need two inputs with same hash?
│ ├─ MD5?
│ │ ├─ Same prefix → fastcoll (seconds)
│ │ ├─ Different prefixes → hashclash (hours)
│ │ └─ CTF PHP loose comparison → 0e magic hashes
│ │
│ ├─ SHA-1?
│ │ └─ SHAttered (expensive, use precomputed if possible)
│ │
│ └─ SHA-256+?
│ └─ No practical collision attack
│ └─ Look for logic flaws instead
│
├─ Need to forge HMAC?
│ ├─ Timing side channel available?
│ │ └─ Byte-by-byte timing attack
│ │
│ ├─ Key is short/weak?
│ │ └─ Brute force key with hashcat
│ │
│ └─ No weakness?
│ └─ HMAC is secure — look elsewhere
│
├─ Hash is truncated (short output)?
│ └─ Birthday attack — collision in 2^(bits/2)
│
├─ Proof of work?
│ └─ Brute force with parallel computation
│ ├─ Python multiprocessing for < 28 bits
│ ├─ hashcat/GPU for > 28 bits
│ └─ Optimize: pre-increment string, avoid re-encoding
│
└─ Password hash cracking?
├─ No salt → rainbow tables (pre-computed)
├─ Known salt → hashcat / John the Ripper
└─ Memory-hard (Argon2/scrypt) → limited by memory, slow brute force
10. TOOLS
| Tool | Purpose | Usage |
|---|---|---|
| HashPump | Length extension attack | hashpump -s MAC -d data -k secret_len -a extension |
| hash_extender | Length extension (multiple algorithms) | hash_extender --data D --secret L --append E --sig MAC |
| fastcoll | MD5 identical-prefix collision | fastcoll -p prefix -o out1 out2 |
| hashclash | MD5 chosen-prefix collision | hashclash prefix1 prefix2 |
| hashcat | Password/hash cracking (GPU) | hashcat -m MODE -a ATTACK hash wordlist |
| John the Ripper | Password cracking (CPU/GPU) | john --wordlist=rockyou.txt hashes.txt |
| CyberChef | Quick hash computation and encoding | Web-based |