NYC
skills/letta-ai/skills/git-leak-recovery

git-leak-recovery

SKILL.md

Git Leak Recovery

Overview

This skill guides the process of recovering sensitive data (secrets, credentials, API keys) that have been removed from Git history through history-rewriting operations, extracting the data, and then securely cleaning the repository to ensure complete removal.

Key Concepts

Git Object Persistence

When commits are "removed" via operations like git reset, git rebase, or git commit --amend, the underlying Git objects are not immediately deleted. They become "unreachable" but persist in the repository until garbage collection occurs. This behavior enables recovery but also means secrets remain accessible until explicit cleanup.

Common Hiding Places for Secrets

When searching for removed secrets, check these locations in order of likelihood:

  1. Reflog - Most common location for rewritten history (git reflog)
  2. Dangling commits - Commits with no branch reference (git fsck --unreachable)
  3. Stashes - Often overlooked (git stash list)
  4. Other branches - May contain the original commits
  5. Tags - May reference old commits
  6. Git notes - Annotations attached to commits

Workflow

Phase 1: Reconnaissance

Before attempting recovery, gather information about the repository state:

# View current commit history
git log --all --oneline

# Check reflog for all references
git reflog show --all

# Find unreachable objects
git fsck --unreachable

# List stashes
git stash list

# List all branches
git branch -a

# List tags
git tag -l

Phase 2: Recovery

Once a target commit is identified:

# View commit contents without checking out
git show <commit-hash>

# View specific file from commit
git show <commit-hash>:<path/to/file>

# Extract file to working directory
git show <commit-hash>:<path/to/file> > recovered_file.txt

Phase 3: Cleanup

To completely remove secrets from the repository, perform cleanup in this specific order:

# Step 1: Expire all reflog entries
git reflog expire --expire=now --all

# Step 2: Run aggressive garbage collection
git gc --prune=now --aggressive

The order matters: reflog must be expired first, otherwise GC will not remove the objects since they are still referenced.

Phase 4: Verification

Verify cleanup was successful using multiple approaches:

# Attempt to access the old commit (should fail)
git show <old-commit-hash>

# Search for secret patterns in repository
grep -r "secret_pattern" . .git 2>/dev/null

# Check for unreachable objects
git fsck --unreachable

# Count loose objects (should decrease after GC)
find .git/objects -type f | wc -l

# Verify working tree is clean
git status

Verification Strategies

Confirming Recovery Success

  • Verify the recovered data matches expected format/content
  • Write recovered data to the designated output location
  • Confirm existing commits and history remain intact after recovery

Confirming Cleanup Success

  • Old commit hash should return error when accessed via git show
  • grep -r for secret patterns should return no matches
  • git fsck --unreachable should show no objects containing the secret
  • Compare object count before and after GC to confirm removal

Common Pitfalls

Investigation Pitfalls

  1. Only checking recent history - Use git reflog show --all not just git reflog to see all references
  2. Forgetting stashes - Stashes are a common place for accidentally stored secrets
  3. Missing other branches - Always check all branches with git branch -a

Cleanup Pitfalls

  1. Wrong order of operations - Always expire reflog before running GC
  2. Missing the --all flag - git reflog expire --expire=now without --all only affects HEAD
  3. Using --prune without =now - Default prune time is 2 weeks, use --prune=now for immediate effect
  4. Not using --aggressive - Standard GC may not remove all unreachable objects

Verification Pitfalls

  1. Only checking working directory - Secrets in .git directory require explicit checks
  2. Not verifying object removal - Always confirm the commit hash is inaccessible
  3. Incomplete grep patterns - Search for multiple variations of the secret pattern

Decision Tree

Is the task about recovering lost data from Git?
├── Yes → Check reflog first (git reflog show --all)
│   ├── Found in reflog → Use git show <hash> to view/extract
│   └── Not in reflog → Check fsck, stashes, branches, tags
└── Is the task about cleaning up secrets from Git?
    ├── Yes → Follow cleanup sequence:
    │   1. git reflog expire --expire=now --all
    │   2. git gc --prune=now --aggressive
    │   3. Verify with multiple methods
    └── Both recovery AND cleanup needed?
        → Complete recovery first, verify data saved,
          then proceed with cleanup

Important Considerations

  • Backup first: Before any cleanup operations, ensure recovered data is saved outside the repository
  • Remote repositories: This cleanup only affects the local repository; if secrets were pushed to a remote, additional steps are needed
  • Cloned copies: Any cloned copies of the repository may still contain the secrets
  • Credential rotation: After recovering exposed secrets, rotate them immediately regardless of cleanup success
Weekly Installs
20
Repository
letta-ai/skills
First Seen
Jan 19, 2026
Installed on
claude-code14
gemini-cli13
codex13
opencode13
antigravity12
cursor10