Detect Duplicate Files
SKILL.md
SKILL-014: Detect Duplicate Files
Overview
Scans the workspace for identical files (by content, not name) to detect redundancy, copy-paste errors, or accidental forks. Generates a report suggesting deduplication actions.
Trigger Phrases
- find duplicates
- check for duplicate files
- scan redundancy
Inputs
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `--workspace-path` | string | No | Current directory | Root to scan |
| `--min-size` | int | No | 0 | Minimum file size in bytes to check |
| `--exclude` | string[] | No | node_modules, .git, bin, obj | Directories to ignore |
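As an illustration of how these parameters and defaults fit together, here is a minimal Python sketch of an equivalent command-line interface. It is not the skill's actual entry point (that is the PowerShell script named under Implementation); `build_parser` is a hypothetical helper, and passing `--exclude` as space-separated values is an assumption.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the Inputs table; not the real script's interface."""
    parser = argparse.ArgumentParser(description="Detect duplicate files by content.")
    parser.add_argument("--workspace-path", default=".", help="Root to scan")
    parser.add_argument("--min-size", type=int, default=0,
                        help="Minimum file size in bytes to check")
    # Assumption: excluded directory names are given as space-separated values.
    parser.add_argument("--exclude", nargs="*",
                        default=["node_modules", ".git", "bin", "obj"],
                        help="Directories to ignore")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

With no arguments, every parameter falls back to the default listed in the table.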
Outputs
1. DUPLICATE_REPORT.md
Summary of found duplicates:
# Duplicate File Report
**Total Duplicates:** 12
**Wasted Space:** 4.5 MB
## Group 1 (Hash: a1b2...)
- `src/utils/math.ts` (Original?)
- `src/legacy/math_copy.ts`
## Group 2 (Hash: c3d4...)
- `config/settings.json`
- `deploy/settings.prod.json`
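The report layout above could be produced along the following lines. This is a Python sketch under stated assumptions, not the skill's own code: `render_report` is a hypothetical helper, "Total Duplicates" is assumed to count every file after the first in each group, and "Wasted Space" is assumed to be the combined size of those extra copies in decimal megabytes. The `(Original?)` annotation is omitted because the source does not say how the original is chosen.

```python
def render_report(groups, sizes):
    """Render a Markdown report.

    groups: {hash: [paths]} for hashes shared by two or more files.
    sizes:  {hash: size in bytes of a single file in that group}.
    """
    # Assumption: every file after the first in a group counts as one duplicate.
    extra = sum(len(paths) - 1 for paths in groups.values())
    # Assumption: wasted space is the size of the redundant copies only.
    wasted = sum(sizes[h] * (len(paths) - 1) for h, paths in groups.items())
    lines = [
        "# Duplicate File Report",
        f"**Total Duplicates:** {extra}",
        f"**Wasted Space:** {wasted / 1_000_000:.1f} MB",
    ]
    for index, (digest, paths) in enumerate(sorted(groups.items()), start=1):
        # Show only a short hash prefix, as in the sample report.
        lines.append(f"## Group {index} (Hash: {digest[:4]}...)")
        lines.extend(f"- `{path}`" for path in paths)
    return "\n".join(lines) + "\n"
```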
Implementation
Script: find_duplicates.ps1
- Recurses through the workspace directory, skipping excluded directories.
- Calculates the SHA-256 hash of every file that meets the minimum size.
- Groups files by hash.
- Discards groups with fewer than two files.
- Generates the Markdown report.
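The contents of find_duplicates.ps1 are not shown here, so the steps above are sketched below in Python rather than PowerShell. Treat it as one possible reading of the steps, not the script itself: `sha256_of` and `find_duplicates` are hypothetical names, and skipping symbolic links is an added assumption.

```python
import hashlib
import os
from collections import defaultdict

DEFAULT_EXCLUDES = ("node_modules", ".git", "bin", "obj")


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large assets are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root=".", min_size=0, exclude=DEFAULT_EXCLUDES):
    """Return {hash: [relative paths]} for every hash shared by two or more files."""
    excluded = set(exclude)
    groups = defaultdict(list)
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames if d not in excluded]
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Assumption: symbolic links are skipped; files below min_size are ignored.
            if os.path.islink(path) or os.path.getsize(path) < min_size:
                continue
            groups[sha256_of(path)].append(os.path.relpath(path, root))
    # Keep only hashes that occur at least twice.
    return {h: sorted(paths) for h, paths in groups.items() if len(paths) >= 2}
```

Hashing every file is the simplest approach and matches the steps as listed. Grouping by file size first and hashing only files whose sizes collide would reduce I/O on large workspaces, but the source does not say whether the script does this.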
Use Cases
- Cleanup: Reducing repo size by removing accidental copies of large assets.
- Refactoring: Finding code that was copy-pasted instead of shared.
Repository
smithery/ai
First Seen
9 days ago