Code Cleanup
Structural cleanup, code-level cleanup, and refactoring — without breaking functionality.
Golden Rules
- Preserve behavior — Every change must produce the same observable behavior. If you can't verify this, don't make the change.
- Small incremental steps — One change at a time. Commit between steps. Never combine a refactor with a feature change.
- Check existing conventions first — Before changing anything, read the codebase's existing coding guidelines, linting config, naming patterns, and file structure. Match them.
- Test after each change — Run the test suite after every modification. If tests break, revert and try a smaller step.
- Rollback awareness — Commit before starting. Note the hash. If a change chain gets too complex, run `git reset --hard <hash>` and try a different approach.
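One way to make "preserve behavior" checkable is a small characterization harness that runs the old and the new implementation over the same inputs and compares the results, including raised exceptions. The sketch below is illustrative only: `same_behavior`, `total_old`, and `total_new` are hypothetical names, and agreement on a handful of cases is evidence, not proof, of equivalence.

```python
from __future__ import annotations

from typing import Any, Callable, Iterable


def outcome(fn: Callable[..., Any], args: tuple) -> tuple:
    """Run fn and capture either its return value or the type of exception it raised."""
    try:
        return ("ok", fn(*args))
    except Exception as exc:  # failures are part of observable behavior too
        return ("raised", type(exc).__name__)


def same_behavior(old: Callable[..., Any], new: Callable[..., Any],
                  cases: Iterable[tuple]) -> list[tuple]:
    """Return the argument tuples on which old and new disagree."""
    return [args for args in cases if outcome(old, args) != outcome(new, args)]


# Hypothetical before/after pair for a small refactor.
def total_old(prices, tax):
    result = 0
    for p in prices:
        result = result + p
    return result + result * tax


def total_new(prices, tax):
    return sum(prices) * (1 + tax)


cases = [([], 0.0), ([10, 20], 0.5), ([5], 0.0), (None, 0.5)]
mismatches = same_behavior(total_old, total_new, cases)
print(mismatches)  # prints [] because the two versions agree on every case
```

If the list is not empty, revert the step and try a smaller one.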
Triage
Determine scope before starting. These parts can be used independently or combined.
| User intent | Part to use |
|---|---|
| "Reorganize files", "remove dead code", "clean up repo structure" | Part 1: Structural |
| "Remove AI slop", "clean up PR", "fix code smells" | Part 2: Code-Level |
| "Refactor this", "extract this", "redesign this module" | Part 3: Refactoring |
| "Clean up the codebase" (broad) | Part 1 → Part 2 → Part 3 in order |
Part 1: Structural Cleanup
Reorganize files, remove junk, and clean up project structure.
Phase 1: Analysis
Before touching anything, understand the codebase.
- Map the directory structure

  ```bash
  # List source files. The -name tests are grouped so that -type f applies to all of them.
  find . -type f \( -name "*.py" -o -name "*.js" -o -name "*.ts" -o -name "*.jsx" -o -name "*.tsx" \) | head -100

  # Show the directory tree, falling back to find when tree is not installed.
  tree -L 3 -I 'node_modules|.git|__pycache__|.next|dist|build|venv|.venv' || find . -maxdepth 3 -type d | grep -v node_modules | grep -v .git
  ```

- Identify entry points and config files
  - package.json, pyproject.toml, requirements.txt, Cargo.toml
  - Main entry files (index.*, main.*, app.*, server.*)
  - Config files (*.config.js, .env, tsconfig.json)
- Run `scripts/analyze_codebase.py` to generate a dependency report
- Ask clarifying questions if needed
  - What is the primary tech stack?
  - Are there specific directories to preserve?
  - Any files that look dead but are actually used?
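The analyzer script is not shown in this guide. As a rough illustration of the kind of data a dependency report for Python sources could be built from, the sketch below uses the standard `ast` module to list the top-level modules a file imports. `imported_modules` is a hypothetical helper, not the interface of `scripts/analyze_codebase.py`.

```python
from __future__ import annotations

import ast


def imported_modules(source: str) -> set[str]:
    """Return the top-level module names imported by a piece of Python source."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level > 0 marks a relative import (from . import x); skipped in this sketch
            modules.add(node.module.split(".")[0])
    return modules


sample = """
import os.path
import json as j
from collections import OrderedDict
from . import sibling
"""
print(sorted(imported_modules(sample)))  # ['collections', 'json', 'os']
```

Static scans like this miss imports built from strings at runtime, which is why the clarifying question about files that only look dead still matters.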
Phase 2: Identify Cleanup Targets
Safe to remove without verification:
- Empty directories
- .DS_Store, Thumbs.db, desktop.ini
- Duplicate package lock files (keep one)
- __pycache__, *.pyc, *.pyo files
- node_modules/.cache
- Coverage reports, test artifacts
- Editor configs if inconsistent (.idea, .vscode with personal settings)
- Backup files (*.bak, *.backup, *~, *.swp)
- Log files (*.log)
- Compiled outputs if source exists
Requires dependency check before removal:
- Unused source files (verify no imports)
- Orphan test files (verify not in test config)
- Unused assets/images
- Old migration files (check if applied)
- Commented-out code blocks
- Unused dependencies in package.json/requirements.txt
Never remove without explicit permission:
- Config files (might be environment-specific)
- Database files or migrations
- CI/CD configs
- License files
- README or docs
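The three tiers above can be sketched as a small path classifier. The patterns and directory names below (`.github`, `migrations`, `docs`, and so on) are illustrative assumptions drawn from the lists, not a complete rule set; the "ask first" tier deliberately wins over the "safe" tier.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Assumed example patterns; extend them to match the project at hand.
SAFE = [".DS_Store", "Thumbs.db", "desktop.ini", "*.pyc", "*.pyo",
        "*.bak", "*.backup", "*~", "*.swp", "*.log"]
ASK_FIRST = ["LICENSE*", "README*", ".env*", "*.config.js", "tsconfig.json"]
ASK_FIRST_DIRS = {".github", "migrations", "docs"}


def classify(path: str) -> str:
    """Sort a repository-relative path into one of the three removal tiers."""
    p = PurePosixPath(path)
    parents = set(p.parts[:-1])
    if ASK_FIRST_DIRS & parents or any(fnmatch(p.name, pat) for pat in ASK_FIRST):
        return "ask-first"
    if "__pycache__" in p.parts or any(fnmatch(p.name, pat) for pat in SAFE):
        return "safe"
    # Everything else needs an import/reference check before removal.
    return "check-dependencies"


for example in [".DS_Store", "db/migrations/0001_init.py", "src/utils/old_helper.py"]:
    print(example, "->", classify(example))
```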
Phase 3: Reorganization Patterns
Only reorganize when explicitly asked. Follow the project's existing structure. If no structure exists, suggest one based on the stack and get confirmation before moving files.
File naming conventions — detect existing patterns first. Common defaults:
- Components: PascalCase (Button.tsx)
- Utilities: camelCase (formatDate.ts)
- Constants: UPPER_SNAKE_CASE or kebab-case file
- Tests: *.test.ts, *.spec.ts, or test_*.py
When to consolidate: directories with fewer than 3 files and no growth path, multiple directories with the same purpose (utils, helpers, lib), deeply nested single-file directories.
When to split: more than 15-20 files in one directory, mixed concerns, files with different lifecycles.
Phase 4: Execute Changes
- Create backup commit before changes

  ```bash
  git add -A && git commit -m "Pre-cleanup snapshot" 2>/dev/null || echo "No git or nothing to commit"
  ```

- Remove safe-to-delete files first

  ```bash
  find . -name ".DS_Store" -delete 2>/dev/null
  find . -name "*.pyc" -delete 2>/dev/null
  # -prune stops find from descending into directories it is about to remove
  find . -name "__pycache__" -type d -prune -exec rm -rf {} + 2>/dev/null
  # Skip .git so that empty directories inside the repository metadata are left alone
  find . -type d -empty ! -path "./.git/*" -delete 2>/dev/null
  ```

- Reorganize file structure (if applicable)
  - Move files in batches by category
  - Update imports after each batch
  - Run tests or type checks between batches
- Update import paths
  - Grep for remaining references to old paths
  - Verify build still works after import updates
- Remove dead code
  - Check for unused functions/variables with the analyzer script
  - Remove commented-out code that predates the current work
Phase 5: Validation
Run whichever checks exist in the project:
```bash
# Tests
npm test 2>/dev/null || bun test 2>/dev/null || pytest 2>/dev/null || echo "No test runner found"

# Type checking
npx tsc --noEmit 2>/dev/null || echo "TypeScript check skipped"

# Linting
npm run lint 2>/dev/null || npx eslint . 2>/dev/null || echo "No linter found"

# Build
npm run build 2>/dev/null || bun run build 2>/dev/null || echo "Build check skipped"
```
List the manual verification steps needed for features that lack test coverage.
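One caveat with chaining runners through `||`: a test run that fails also falls through to the next runner, so a real failure can be masked. A possible alternative is to pick a single runner from marker files and then run only that one. The sketch below assumes `package.json`, `bun.lockb`/`bun.lock`, and `pyproject.toml`/`pytest.ini`/`requirements.txt` as markers; `pick_test_command` is a hypothetical helper, and the marker list is an assumption to adapt per project.

```python
from __future__ import annotations

import shutil
from pathlib import Path
from typing import Callable, Optional


def tool_on_path(name: str) -> bool:
    """True when an executable with this name is found on PATH."""
    return shutil.which(name) is not None


def pick_test_command(root: Path,
                      has_tool: Callable[[str], bool] = tool_on_path) -> Optional[list[str]]:
    """Choose one test command from marker files, or None if nothing matches."""
    if (root / "package.json").exists():
        uses_bun = any((root / name).exists() for name in ("bun.lockb", "bun.lock"))
        if uses_bun and has_tool("bun"):
            return ["bun", "test"]
        if has_tool("npm"):
            return ["npm", "test"]
    python_markers = ("pyproject.toml", "pytest.ini", "requirements.txt")
    if any((root / name).exists() for name in python_markers) and has_tool("pytest"):
        return ["pytest"]
    return None


# Result depends on the current directory and the tools installed.
print(pick_test_command(Path(".")))
```

The chosen command can then be passed to `subprocess.run(cmd, check=True)` so that a failing suite stops the cleanup instead of being skipped.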
Part 2: Code-Level Cleanup
Remove AI slop, fix code smells, and improve code quality at the file level.
Workflow
- Determine scope:
  - PR cleanup → `git diff main --name-only` to get changed files
  - Specific files → user-specified targets
  - Whole codebase → scan all source files
- For each file, read the surrounding code to understand existing style
- Make edits to remove identified issues
- Report a 1-3 sentence summary of changes
AI Slop Patterns
Comments to remove:
- Obvious/redundant comments explaining what code clearly does
- Comments that don't match the commenting style elsewhere in the file
- Section divider comments when not used elsewhere
Defensive code to remove:
- Try/catch blocks around code that doesn't throw or is already in a trusted path
- Null/undefined checks when callers guarantee valid input
- Type guards that duplicate earlier validation
- Redundant error handling when parent functions already handle it
Type issues to fix:
- Casts to `any` that bypass TypeScript's type system
- Type assertions that hide real type mismatches
- Overly broad generic types when specific types exist
Style inconsistencies:
- Naming conventions that differ from the file
- Spacing/formatting patterns that differ from the file
- Import organization that differs from the file
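A small before/after pair can make the comment and defensive-code patterns concrete. The function names are hypothetical, and the cleanup is only valid under the stated assumption that every caller already passes a validated user dict; if that cannot be verified, the guard stays.

```python
# Before: comments restate the code, and a blanket try/except hides real errors.
def full_name_before(user: dict) -> str:
    # Check if user is None
    if user is None:
        return ""
    try:
        # Get the first name
        first = user["first"]
        # Get the last name
        last = user["last"]
        # Return the full name
        return first + " " + last
    except Exception:
        return ""


# After: assumes callers guarantee a dict with "first" and "last" keys,
# so a missing key now surfaces as a KeyError instead of an empty string.
def full_name_after(user: dict) -> str:
    return f'{user["first"]} {user["last"]}'


user = {"first": "Ada", "last": "Lovelace"}
print(full_name_before(user) == full_name_after(user))  # True
```

The two versions agree on valid input and differ on invalid input, which is exactly the behavior change to confirm against the callers before committing.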
Code Smells — When to Act
Don't fix smells for their own sake. Fix them when the code is actively being worked on and the smell makes the change harder.
| Smell | Act when... | Leave alone when... |
|---|---|---|
| Long method (>30 lines) | You need to modify part of it and can't find what to change | It's a straightforward sequential pipeline |
| Duplicated code | 3+ exact copies exist and one needs a change | 2 copies with different evolution paths |
| Long parameter list (>4 params) | The function is called from many places | It's an internal helper called once |
| Magic numbers | The value's meaning isn't obvious from context | It's a well-known constant (0, 1, -1, 100) |
| Nested conditionals (3+ levels) | You need to add another branch | The nesting maps to domain logic clearly |
| Dead code | Always remove. Version control has the history. | — |
| Primitive obsession | Invalid values cause bugs (e.g., negative userId) | The domain is simple and won't grow |
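For the primitive obsession row, a minimal sketch of replacing a bare integer with a small value type follows. `UserId` and `load_profile` are hypothetical, and the rule that ids must be positive integers is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserId:
    """Wraps an int so that invalid ids fail at construction, not deep in the call stack."""
    value: int

    def __post_init__(self) -> None:
        # bool is a subclass of int, so it is rejected explicitly
        if isinstance(self.value, bool) or not isinstance(self.value, int) or self.value <= 0:
            raise ValueError(f"invalid user id: {self.value!r}")


def load_profile(user_id: UserId) -> str:
    # Placeholder body; a real implementation would query a data store.
    return f"profile:{user_id.value}"


print(load_profile(UserId(7)))  # profile:7

try:
    UserId(-1)
except ValueError as exc:
    print(exc)  # invalid user id: -1
```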
Part 3: Refactoring
Changing internal structure without changing external behavior.
When NOT to Refactor
- No test coverage — You can't verify behavior is preserved. Write tests first.
- Tight deadline — Ship first, refactor later.
- Code that won't change again — If nobody will read or modify it, the investment doesn't pay off.
- During a feature change — Separate commits. Always.
Phased Execution Order
When refactoring touches multiple files, follow this order to minimize breakage:
- Types/interfaces first — Update or create types that define the new structure
- Implementation — Refactor the actual logic to match new types
- Tests — Update tests to match new structure, verify they pass
- Cleanup — Remove old code, dead imports, unused types
Dependency Tracking
For multi-file refactors, map dependencies before starting:
- Which files does this change affect?
- What blocks what? (e.g., "updating the API response type blocks the frontend component")
- What can be done in parallel?
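These three questions map onto a topological sort. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (3.9+) to group files into batches: everything in one batch can be done in parallel, and each batch is blocked by the ones before it. The file names are made up for illustration.

```python
from graphlib import TopologicalSorter

# Each key is blocked by the files in its set (hypothetical example graph).
blocked_by = {
    "api/types.ts": set(),
    "api/client.ts": {"api/types.ts"},
    "ui/UserCard.tsx": {"api/types.ts"},
    "ui/UserCard.test.tsx": {"ui/UserCard.tsx"},
}


def parallel_batches(graph):
    """Return lists of nodes; nodes within a list have no dependencies on each other."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the dependencies are circular
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


for number, batch in enumerate(parallel_batches(blocked_by), start=1):
    print(number, batch)
# 1 ['api/types.ts']
# 2 ['api/client.ts', 'ui/UserCard.tsx']
# 3 ['ui/UserCard.test.tsx']
```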
Pattern Refactoring
Apply design patterns only when the code has a concrete problem — not prophylactically.
Strategy pattern — When a conditional selects between 3+ distinct behaviors, each more than a few lines, and new behaviors are likely to be added.
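A minimal sketch of the Strategy refactor, using a dictionary of callables. The shipping example and all names are hypothetical, and each behavior is shortened to one line for space; in practice the pattern pays off when the branches are longer than this.

```python
from __future__ import annotations

from typing import Callable


# Before: every new shipping method means editing this conditional.
def shipping_cost_before(method: str, weight_kg: float) -> float:
    if method == "standard":
        return 5.0 + 1.0 * weight_kg
    elif method == "express":
        return 12.0 + 1.5 * weight_kg
    elif method == "pickup":
        return 0.0
    raise ValueError(f"unknown shipping method: {method}")


# After: each behavior is its own function, selected by name.
def standard(weight_kg: float) -> float:
    return 5.0 + 1.0 * weight_kg


def express(weight_kg: float) -> float:
    return 12.0 + 1.5 * weight_kg


def pickup(weight_kg: float) -> float:
    return 0.0


STRATEGIES: dict[str, Callable[[float], float]] = {
    "standard": standard,
    "express": express,
    "pickup": pickup,
}


def shipping_cost(method: str, weight_kg: float) -> float:
    try:
        strategy = STRATEGIES[method]
    except KeyError:
        raise ValueError(f"unknown shipping method: {method}") from None
    return strategy(weight_kg)


print(shipping_cost("express", 2.0))  # 15.0
```

Adding a method becomes a new function plus one dictionary entry, with no change to `shipping_cost`.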
Chain of Responsibility — When multiple checks run in sequence, each may short-circuit, and new checks are frequently added.
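A lightweight take on Chain of Responsibility is an ordered list of check functions, where the first one that returns a reason stops the chain. This is a simplified functional variant rather than the classic linked-handler form, and the request fields and limits below are assumptions for illustration.

```python
from __future__ import annotations

from typing import Callable, Optional

# A check returns a rejection reason, or None to hand over to the next check.
Check = Callable[[dict], Optional[str]]


def require_auth(req: dict) -> Optional[str]:
    return None if req.get("user") else "unauthenticated"


def require_json(req: dict) -> Optional[str]:
    return None if req.get("content_type") == "application/json" else "unsupported media type"


def limit_size(req: dict) -> Optional[str]:
    return None if req.get("size", 0) <= 1_000_000 else "payload too large"


CHECKS: list[Check] = [require_auth, require_json, limit_size]


def first_rejection(req: dict, checks: list[Check] = CHECKS) -> Optional[str]:
    """Run checks in order and short-circuit on the first rejection."""
    for check in checks:
        reason = check(req)
        if reason is not None:
            return reason
    return None


print(first_rejection({"user": "ada", "content_type": "text/plain"}))  # unsupported media type
```

A new check is one more function appended to `CHECKS`; the loop itself does not change.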
Extract Method — When a block has a clear single purpose and the method name would explain intent better than a comment. The most common and safest refactor.
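A sketch of Extract Method on a hypothetical invoice function: the commented block becomes a named function, and the comment is no longer needed. The threshold and rate are invented values, given names in the refactored version.

```python
# Before: a comment explains what the middle block is for.
def invoice_total_before(items: list) -> float:
    subtotal = sum(i["price"] * i["qty"] for i in items)
    # apply bulk discount
    units = sum(i["qty"] for i in items)
    if units >= 10:
        subtotal = subtotal * 0.9
    return round(subtotal, 2)


# After: the block has a name that states its intent.
BULK_UNITS = 10   # assumed threshold for the example
BULK_RATE = 0.9   # assumed multiplier for the example


def bulk_discount_rate(items: list) -> float:
    units = sum(i["qty"] for i in items)
    return BULK_RATE if units >= BULK_UNITS else 1.0


def invoice_total(items: list) -> float:
    subtotal = sum(i["price"] * i["qty"] for i in items)
    return round(subtotal * bulk_discount_rate(items), 2)


small = [{"price": 2.5, "qty": 4}]
large = [{"price": 1.0, "qty": 12}]
print(invoice_total(small), invoice_total(large))  # 10.0 10.8
```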
Common Pitfalls
- Moving files breaks dynamic imports (check for `require()` with variables, `import()`)
- Barrel files (index.ts re-exports) can hide dependency issues
- CSS/SCSS imports may use relative paths
- Asset paths in code may be hardcoded
- Environment-specific configs might reference paths
- Removing "unused" code that's actually used via reflection, dynamic imports, or string-based lookups
- Renaming a widely-used symbol without IDE support leaves broken references