db-wal-recovery
SQLite WAL Recovery
This skill provides guidance for recovering data from SQLite Write-Ahead Log (WAL) files that appear corrupted, encrypted, or otherwise inaccessible.
Core Principle
Recovery means finding existing data, not generating new data. Never fabricate or guess data based on patterns. The goal is to locate and extract actual data from the source files.
Initial Assessment
Verify WAL Mode and File Presence
-
Check if the database is in WAL mode:
PRAGMA journal_mode; -
Identify all database-related files (typically
main.db,main.db-wal,main.db-shm) -
Critical: If a file appears in one listing method but not another, this is a significant clue requiring investigation, not dismissal
Document Initial Observations
Record all file listings from different tools/methods. Discrepancies between tools often indicate:
- Special characters in filenames
- Permission issues
- Different file encodings
- Hidden or mounted files
- Tool-specific behavior differences
Investigation Strategies
When Files Appear to Be Missing
If a WAL file is listed by one tool but inaccessible via another:
- Investigate the discrepancy thoroughly - do not dismiss it
- Check for special characters:
ls -la | cat -A - Examine permissions:
ls -la,stat <filename> - Try different access methods: direct path, glob patterns, hex representation
- Check if file is a symlink:
file <filename>,readlink <filename> - Examine parent directory permissions
- Consider if the file might be in a different mount or namespace
When Files Appear Corrupted or Encrypted
Step 1: Examine Raw File Structure
xxd <filename> | head -50
hexdump -C <filename> | head -50
- Valid SQLite WAL files start with specific magic bytes
- Look for recognizable patterns vs random-looking data
- Document what you observe before concluding encryption
Step 2: Systematic Decryption Attempts
If encryption is suspected:
- Single-byte XOR: Try common keys (0x00-0xFF)
- Multi-byte XOR: Try common patterns, repeating keys
- Look for key hints: Check other files in the directory for encryption keys
- Examine headers: WAL headers may contain encryption method clues
- Try known SQLite encryption libraries: SQLCipher, SEE, etc.
Step 3: Check for Compression
Data may be compressed rather than encrypted:
- Look for compression signatures (gzip:
1f 8b, zlib:78 9c, etc.) - Try decompression before assuming encryption
When Database Schema Exists but Data is Missing
- Query existing data to understand current state
- Examine database pages directly with hex tools
- Look for data in freelist pages
- Check if WAL contains uncommitted transactions
Verification Strategies
Before Concluding Data is Unrecoverable
- Investigated all tool discrepancies thoroughly
- Tried multiple access methods for inaccessible files
- Examined raw bytes of all relevant files
- Attempted systematic decryption with multiple methods
- Checked for compression
- Looked for encryption keys in related files
- Revisited initial observations for missed clues
Before Submitting Any Solution
- Solution contains only data actually recovered from files
- No data was fabricated based on patterns or guesses
- Can trace each piece of recovered data to its source
- If data appears to follow a pattern, verified pattern against actual file contents
Common Pitfalls to Avoid
1. Dismissing Tool Discrepancies
Wrong: "The LS tool shows the file but ls doesn't find it, so it doesn't exist."
Right: Investigate why different tools show different results - this often reveals the key to accessing the file.
2. Pattern-Based Data Fabrication
Wrong: "Records 1-5 are apple=100, banana=200... so records 6-10 must be fig=600, grape=700..."
Right: Only include data that can be traced to actual file contents. If data cannot be recovered, report that fact rather than guessing.
3. Reframing "Encrypted" as "Missing"
Wrong: "The WAL is 'encrypted' means the data is missing, so I'll generate likely values."
Right: "Encrypted" means there IS data that needs to be decrypted. Persist with decryption attempts.
4. Premature Abandonment
Wrong: Giving up after simple single-byte XOR doesn't work.
Right: Systematically try multiple decryption approaches before concluding data is unrecoverable.
5. Ignoring Initial Observations
Wrong: Getting stuck and trying new approaches without revisiting earlier clues.
Right: When stuck, return to initial observations - especially any discrepancies or anomalies noted early on.
Recovery Workflow Summary
- Document: Record all initial observations and file listings
- Verify: Confirm WAL mode and identify all database files
- Investigate: Thoroughly explore any discrepancies in file access
- Analyze: Examine raw file contents before concluding corruption/encryption
- Decrypt: Systematically attempt multiple decryption methods
- Validate: Ensure all recovered data traces to actual file contents
- Report: Clearly distinguish recovered data from unrecoverable data
When Recovery Fails
If data truly cannot be recovered after exhaustive attempts:
- Document all methods attempted
- Explain why each method failed
- Report what data IS available vs what remains unrecoverable
- Never substitute guessed data for unrecovered data