code-slimming
Code Slimming: Read-Only Refactor Opportunity Audit
Find behavior-preserving opportunities to make a codebase smaller, clearer, and less repetitive. This skill reports opportunities only. It does not edit code and does not write tests.
The goal is not fewer lines at any cost. The goal is lower maintenance burden with behavior, performance, readability, and validation made explicit.
When to use
- Finding opportunities to simplify a repository, package, module, or PR
- Finding behavior-preserving refactor opportunities without implementing them
- Auditing duplicated logic, classes, structs, helpers, types, schemas, handlers, or adapters
- Looking for safe centralization candidates before a cleanup/refactor PR
- Reviewing a bot or human cleanup PR that claims to reduce code size
- Ranking maintainability refactors by value, risk, and validation needs
- Handling duplicate code only when the main goal is smaller behavior-preserving structure, not general cleanliness or AI-slop detection
When NOT to use
- Bug-focused reviews, regressions, races, edge cases, or crashes - use code-review
- General cleanup, naming, comments, AI tells, dependency creep, overengineering, or duplicate-code-as-slop without an explicit slimming goal - use anti-slop
- Security vulnerabilities, secret scanning, auth flaws, or exploitability - use security-audit
- Writing, debugging, or adding validation tests for a slimming recommendation - use testing
- Broad quick merge checks - use full-review
- Comprehensive repo audits across all applicable dimensions - use deep-audit
- Direct implementation work - use the relevant language, framework, or domain skill
Routing boundaries
| User intent | Use |
|---|---|
| "Slim this codebase", "find safe deletions", "review LOC deletion" | code-slimming |
| "Clean this up", "does this look AI-written?", "overengineered/verbose" | anti-slop |
| "Review this", "find bugs", "sanity check", "will this break?" | code-review |
| "Write/add/debug tests for this refactor" | testing |
| "Run all checks", "full review", "audit this repo" | full-review or deep-audit |
| "Implement the slimming/refactor" | Relevant language/framework/domain skill |
Do not activate this skill for generic review, cleanup, or audit wording unless the user explicitly asks for slimming, deduplication, deletion, wrapper removal, or behavior-preserving size reduction.
AI Self-Check
Before returning a code-slimming audit, verify:
- Read-only boundary held: no source files were edited and no tests were written
- Behavior preserved: every recommendation names the behavior that must stay identical
- Value explained: every recommendation states why the slimmer shape is better
- Tradeoffs assessed: performance, coupling, readability, bundle size, allocation count, and test brittleness were considered where relevant
- Validation named: each
Do noworDo with testsitem lists concrete validation commands or test coverage needs - No abstraction theater: no vague "make this generic" or "create a base class" advice without the proposed shape
- Duplication judged in context: likely divergence, framework conventions, and explicitness were considered before recommending centralization
- Correctness and security routed: bugs go to code-review; vulnerabilities go to security-audit
- Routing lane held: generic cleanup, slop, correctness, security, test-writing, broad-review, and implementation work were routed instead of reported as code-slimming findings
Performance
- Start with changed files, shared modules, and repeated directory shapes before scanning the whole repo.
- Group repeated examples into one finding with representative paths.
- Prefer cheap structural searches before expensive test suites.
Best Practices
- Treat smaller code as a hypothesis, not a win.
- Prefer deleting wrappers over adding a new abstraction layer only after proving the wrapper has no boundary, policy, observability, compatibility, or lifecycle role.
- Prefer a small well-named helper over a framework-shaped base class.
- Keep domain-specific duplication when the variants are likely to diverge.
- Keep defensive duplication when checks intentionally repeat across trust boundaries, process boundaries, public APIs, persistence layers, or privileged operations. Do not remove a repeated guard just because an upstream layer appears to validate the same condition.
- Avoid centralizing across independently versioned modules, separately owned teams, protocol or API versions, tenant-specific behavior, or plugin/provider boundaries unless the shared contract is explicit and stable.
- Recommend tests before centralization when behavior differences are subtle or undercovered.
Workflow
Step 1: Determine scope
Pick the narrowest useful scope:
- PR or diff - default when uncommitted changes or a branch diff exists
- Specific path - use when the user names files or directories
- Whole repo - use when the user asks for a repo-wide slimming audit
For git repos, gather cheap preflight context before deciding:
- repo root and branch:
git rev-parse --show-toplevel,git branch --show-current - uncommitted files:
git diff --name-only,git diff --cached --name-only - branch base when available:
git merge-base HEAD @{upstream}; otherwise detect the default branch and usegit merge-base HEAD origin/<default-branch> - default branch when needed:
git symbolic-ref refs/remotes/origin/HEAD --short 2>/dev/nullor inspectgit remote show origin - changed files and size:
git diff --name-only <base>...HEAD,git diff --stat <base>...HEAD
Scope precedence:
- User-provided diff or path
- Uncommitted changes
- Current branch against upstream
- Current branch against the default branch
- Whole repo, or ask one concise question when interactive
If the scope is unclear and no diff exists, ask one concise question. In headless contexts, default to whole repo and state the assumption.
Step 2: Gather context
Read project instructions and manifests before judging code shape:
- instruction files:
AGENTS.md,CLAUDE.md,GEMINI.md,.cursor/rules,.windsurfrules - manifests:
package.json,pyproject.toml,go.mod,Cargo.toml,pom.xml,build.gradle,CMakeLists.txt,Makefile - test and check scripts
- existing shared modules, helper directories, framework conventions, and recent related commits
Note the language and framework patterns in the report header. A good centralization in Python may be a bad abstraction in Rust, Java, or C.
Step 3: Identify validation without over-running it
Running existing read-only validation commands is allowed unless the user forbids command execution or the commands have side effects. Separate validation into three passes:
- Early baseline validation - cheap checks that describe current repo or diff health:
- lint or format checks when fast and obvious
- type or compile checks when fast and local
- documented project validation commands
- Candidate coverage evidence - after finding a slimming candidate, inspect existing tests, callers, snapshots, fixtures, or examples that exercise the behavior to preserve.
- Implementation validation - commands or tests someone must run if they implement the recommendation.
Do not run expensive full suites before candidate discovery unless the user asked for PR validation or the project makes the command cheap and standard. If a check is missing, slow, flaky, noisy, unavailable, external-service-dependent, or exits zero while printing warnings, report that as a validation gap. Do not call an opportunity safe when validation is absent.
Passing current checks does not by itself make a proposed slimming safe. Do now requires
behavior-specific coverage evidence, or a trivial mechanical change whose invariant is directly
verifiable. Generic lint, type, and build commands alone are not enough for behavioral
deduplication or centralization.
For future work, identify the validation an implementation must pass. Do not write tests in this skill.
Step 4: Search for candidates
Use structural and textual searches to find:
- near-duplicate files, classes, structs, functions, methods, hooks, handlers, or components
- repeated type, interface, schema, DTO, record, enum, or data container shapes
- repeated request parsing, query construction, pagination, validation, mapping, serialization, or error handling
- wrappers with little behavior beyond forwarding to another object or function
- oversized
utils,helpers,common,shared, ormiscmodules - parallel provider, client, repository, service, or adapter implementations with the same skeleton
- dependencies or helpers duplicating standard library or framework features
- generated-looking copy-paste that survived human maintenance, after checking whether a generator, schema, template, or vendored source owns it
Useful search tactics:
- list changed files, then read full files plus sibling files in the same role
- search repeated function, class, type, schema, route, and option names across the repo
- compare same-role directories such as
providers/*,clients/*,services/*, andrepositories/* - find large generic modules named
utils,helpers,common,shared, ormisc - inspect nearby tests to see whether the behavior contract is already captured
Concrete discovery recipe:
- Build a candidate map from changed files, same-role siblings, repeated basenames, large generic modules, and repeated exported symbols, routes, schemas, DTOs, handlers, parsers, validators, mappers, and serializers.
- Use cheap searches before manual reading: compare same-role trees such as
providers/*,clients/*,services/*,repositories/*,handlers/*,routes/*, andadapters/*; search repeated declarations and one-line wrappers that only forward to another call. - For each candidate, read the full candidate files and at least one nearby caller or test before classifying.
Skip or de-prioritize generated files, vendored dependencies, lockfiles, snapshots, fixtures, minified bundles, protobuf/OpenAPI generated clients, and build artifacts unless the user scopes them. When duplication appears in generated output, recommend changing the generator, schema, or template, or exclude it from slimming findings. Do not recommend hand-editing generated output.
Read surrounding code before judging. Similar shape is not enough. The question is whether one shared behavior path would be clearer, safer, and easier to validate.
Step 5: Classify opportunities
Use these labels:
- Do now - small, obvious, low risk, and already covered by meaningful validation
- Do with tests - likely worthwhile, but needs focused tests before implementation
- Defer - valid but too broad, risky, or low-value for current churn
- Leave alone - duplication is clearer, faster, intentional, or likely to diverge
These labels are audit recommendations only. Even Do now does not authorize this skill to edit
code; it means the proposed change appears safe for a separate implementation pass.
Every classified item must cite:
- representative file paths and line ranges
- the repeated behavior or wrapper behavior observed
- the behavior invariant that must remain identical
- existing validation evidence or the missing validation gap
- why similar-looking cases are included, excluded, or left alone
Most findings are not merge blockers. Say so clearly.
For Do now, cite behavior-specific validation evidence: exact test files or cases that exercise
the preserved behavior, or explain why the change is purely local, mechanical, and directly
inspectable. Generic lint, type, and build commands alone are not enough for behavioral
deduplication or centralization. If you cannot cite behavior-specific evidence, classify as
Do with tests or Defer.
Step 6: Evaluate tradeoffs
For every recommendation, answer:
- What behavior must remain identical?
- What gets smaller: LOC, concept count, duplicated call sites, public API surface, dependency count, file count, or test surface?
- What might get worse: runtime performance, bundle size, allocation count, readability, coupling, test brittleness, or onboarding clarity?
- Why is the proposed centralization better than the current duplication?
- What validation would prove the change is safe?
- Are duplicated checks intentionally defensive at separate boundaries, and what fails closed if one layer is bypassed?
If a slimmer implementation affects hot paths, rendering loops, query batching, serialization,
startup, memory layout, build output, or binary size, require performance-sensitive validation
before classifying it as Do now.
Step 7: Report
It is acceptable and often correct to return zero high-value opportunities. Do not manufacture a
slimming recommendation to fill the report. Prefer a well-justified Leave alone finding over a
low-confidence abstraction.
Use this format:
## Code Slimming Audit: [scope]
Context:
- Languages/frameworks: [detected]
- Baseline validation run: [commands and results; implementation validation not run because this audit is read-only]
- Validation gaps: [missing, noisy, skipped, or unavailable checks]
### High-Value Opportunities
**Do with tests** `services/*/list-items.*` - Centralize repeated pagination and filter parsing.
Affected files: `services/users/list-items.*`, `services/projects/list-items.*`
Evidence: `services/users/list-items.ts:24-58`, `services/projects/list-items.ts:19-55`
Current duplication: both modules parse the same page, limit, sort, and filter parameters.
Refactor shape: extract a shared parser with endpoint-specific allowlists.
Behavior invariant: page and limit defaults, max-limit handling, sort allowlists, and error messages stay identical.
Call-site impact: 2 endpoint handlers, no public import path changes.
Why better: one behavior path for defaults and validation, with fewer divergent call sites.
Tradeoffs: one shared helper couples list endpoints to a common pagination contract.
Risk: medium
Validation needed: add boundary tests for page and limit values, then run lint/type/build/test commands.
### Removed-Code Safety Review
Include this section only when reviewing a diff or PR that removed code.
**Needs evidence** `[area]`
Removed behavior: [code path, wrapper, branch, fallback, type, validation, or dependency removed]
Replacement path: [what now handles it]
Behavior invariant: [what must still happen]
Evidence checked: [diff lines, call sites, tests, type checks]
Risk: [low/medium/high]
Validation needed: [specific command/test/case]
### Low-Value Or Risky Opportunities
**Leave alone** `integrations/*` - Duplication is likely to diverge per provider.
Why not: each provider already has different retry, auth, pagination, and error semantics.
### Summary
- High-value opportunities: 1
- Low-value or risky opportunities: 1
- Merge blockers: none from this audit lens
- Residual risk / skipped areas: [large dirs, generated files, expensive checks, external services]
- Net recommendation: [slim / defer / leave mostly unchanged], based on risk-adjusted maintenance value, not LOC delta
If no useful slimming opportunities are found, say so explicitly:
### High-Value Opportunities
None found within scope.
### Search Coverage
- Scope inspected: [diff/path/repo areas]
- Patterns checked: [wrappers, duplicate schemas, repeated parsers, adapters, utils]
- Files/directories skipped: [generated/vendor/tests/etc.]
- Validation checked: [commands/tests found or unavailable]
### Why no action is recommended
- Existing duplication appears intentional because: [...]
- Thin wrappers are retained because: [...]
- Shared abstraction would likely worsen: [...]
### Low-Value Or Risky Opportunities
[optional leave-alone observations]
### Summary
- High-value opportunities: 0
- Low-value or risky opportunities: N
- Merge blockers: none from this audit lens
- Residual risk: [what was not inspected]
- Net recommendation: leave mostly unchanged
Keep the report concise. Show the refactor shape, not a lecture.
Common Patterns
Repeated boundary parsing
Request parsing, CLI argument normalization, env var parsing, and config loading often duplicate defaulting and validation rules. Centralize only when the same boundary contract really applies.
Near-twin adapters
Provider/client/repository adapters often start identical and then diverge. Recommend centralization only when the shared part is stable and the provider-specific differences stay explicit.
Duplicate data shapes
Repeated DTOs, schemas, records, structs, or interfaces can be centralized when they represent the same contract. Keep separate shapes when they describe different lifecycle stages or trust boundaries. Do not merge inbound untrusted request shapes, internal/domain shapes, persistence entities, queue/event payloads, and outbound response shapes merely because fields overlap. Shared field lists are not shared contracts; centralize only the truly common validated subset, or keep explicit mappers.
Wrapper layers
Thin wrappers that only forward calls usually add concept count without value. Prefer deleting or inlining them unless they isolate an external dependency, provide a stable public contract, or make testing materially easier. Leave them alone when they enforce validation, auth/authorization, tenant isolation, retries, idempotency, transactions, caching, rate limits, logging, tracing, metrics, feature flags, compatibility shims, dependency inversion, or fault isolation.
Oversized helper modules
Large utils, helpers, common, shared, or misc modules are often junk drawers. Recommend
splitting by domain concern or moving helpers closer to their only caller.
Performance-sensitive slimming
Shorter code can be slower. Centralized generic code can add allocation, dynamic dispatch, reflection,
bundle weight, cache misses, or indirect calls. In hot paths, require measurement or classify as
Defer.
Output Contract
See skills/_shared/output-contract.md for the full contract.
- Skill name: CODE-SLIMMING
- Deliverable bucket:
audits - Mode: always-on. Every invocation emits the full contract -- boxed inline header, body summary inline plus per-finding detail in the deliverable file, boxed conclusion, conclusion table.
- Deliverable path:
docs/local/audits/code-slimming/<YYYY-MM-DD>-<slug>.md - Severity scale:
P0 | P1 | P2 | P3 | info(see shared contract).
Related Skills
- anti-slop - code quality audit for AI-like patterns, over-abstraction, noisy comments, hallucinated APIs, and test theater.
- code-review - correctness audit for bugs, regressions, races, edge cases, and broken contracts.
- testing - writes and debugs tests required before implementing a slimming recommendation.
- security-audit - reviews security-sensitive code where "defensive" duplication may be necessary.
- full-review - quick four-pass merge-safety audit; does not include code-slimming by default.
- deep-audit - comprehensive repo audit; includes code-slimming as a Wave 2 maintainability pass.
Rules
- Do not edit code. This skill reports opportunities only.
- Do not write tests. Name missing tests and route test implementation to testing.
- Do not chase LOC alone. Smaller code that is slower, more coupled, or harder to understand is not automatically better.
- Require a concrete refactor shape. Every recommendation must describe what would be extracted, deleted, moved, or centralized.
- Explain why. Every recommendation must state the maintenance benefit and the behavior that must remain unchanged.
- Validate before calling it safe. A
Do nowrecommendation needs behavior-specific validation evidence, or a purely local mechanical invariant that was directly inspected. - Respect language idioms. Generic helpers, inheritance, macros, templates, reflection, and dynamic dispatch have different costs across Python, JavaScript/TypeScript, Java, Rust, C/C++, shell, and infrastructure code.
- Keep bugs and vulnerabilities in their lanes. Route correctness findings to code-review and security findings to security-audit.