workleap-skill-optimizer
Skill Optimizer
Reduce skill token cost without losing coverage. Every token in SKILL.md body is paid per conversation — references/ files are loaded on-demand.
Optimization Workflow
Phase 1: Analyze
Measure the current skill before changing anything.
- Count SKILL.md body lines (exclude frontmatter) and estimate tokens (~4.5 tokens/line for mixed code/prose).
- Count description characters.
- List every
references/file with line counts. - Identify duplication: for each body section (at any heading level), check if the same concept or procedure is also covered in a reference file. Count those body lines and divide by total body lines for the overlap percentage.
- List nouns from the description that appear verbatim in a sibling skill's description — these need domain-qualifying in Phase 2.
If the skill has no references/ directory, optimization may require creating reference files first. See the playbook for guidance on this edge case.
Output a table:
| Metric | Current |
|---------------------|---------|
| Description chars | ??? |
| Body lines | ??? |
| Body tokens (est.) | ??? |
| Duplication % | ??? |
| Reference files | ??? |
Phase 2: Plan
Decide what stays in the body, what moves to references, and what gets compressed.
Body retention criteria — keep a section in the body ONLY if it meets at least one:
- Complex multi-step pattern requiring coordination across multiple sections or files
- Non-obvious logic, parameters, or decision rules that agents frequently get wrong without inline guidance
- A concept unique to this skill with no external documentation
- Primary use case the skill exists for (the thing agents reach for most often)
Everything else belongs in the appropriate references/ file. See the playbook decision tree for concrete examples of what typically stays vs. moves.
Description compression rules:
- Lead with the package/tool name and a one-line identity
- Replace enumerations of 4+ specific names (APIs, checks, steps) with category-based phrasing (e.g., "hooks for auth, sessions, tokens" instead of listing each hook name)
- Qualify generic keywords with the skill's domain to reduce false positives (e.g., "MyLib integrations with Redis" not "Redis integration")
- Merge items that share a theme into a single line (e.g., "error handling" + "retry logic" → "Error handling and retry logic in MyLib")
- Verify every original trigger category maps 1:1 to the compressed version — no categories dropped
Plan the Reference Guide section — for each reference file, write a one-line description of when to read it. This section is load-bearing: it tells agents which file to consult.
Target metrics:
- Body: under ~250 lines
- Description: under ~700 characters
- Duplication with references: 0%
Phase 3: Execute
Apply the plan. Work in this order:
- Compress the description — rewrite the YAML
descriptionfield. Keep all trigger categories; do not remove any "when to use" signals. - Remove duplicate sections from body — delete sections already covered in references.
- Add the Reference Guide section — add explicit pointers to each reference file with descriptions. See the playbook for the recommended format.
- Add a Maintenance Note — add a note at the bottom of the body with: (a) the body-line budget (~250 lines), (b) a pointer to the ADR if one exists, and (c) a one-sentence rationale for the split. See the playbook template.
- Bump version — increment
metadata.versionminor if the skill uses versioning.
Do NOT:
- Move "when to use" triggers from description to body (description is the only field read for triggering)
- Remove code examples from retained body sections (they are the value)
- Create new reference files just to move content — use existing files when possible
- Add content that duplicates what is already in references
Phase 4: Validate
Use the Task tool to spawn a subagent (opus model) to challenge coverage. Provide it:
- The full SKILL.md (body + frontmatter)
- All reference files
- A list of 15-25 questions the skill must answer (provided by the user, or derived from trigger categories — see the playbook for derivation rules)
The subagent evaluates each question:
- From SKILL.md alone: YES / PARTIAL / NO
- From SKILL.md + references: YES / PARTIAL / NO
- Gap: content missing from ALL files
Pass criteria:
- 0 regressions (nothing answerable before that isn't answerable after)
- All trigger categories in the description still present
- Body under ~250 lines
If gaps are found, determine whether they are pre-existing (never covered) or regressions (lost during optimization). Only regressions require fixes — restore or rewrite the missing content in the body or appropriate reference file, then re-evaluate only the affected questions.
Fallback: If subagent spawning is unavailable, self-evaluate: for each question, attempt to answer it using only the optimized files and rate confidence as HIGH / MEDIUM / LOW. Any LOW-confidence answer on a question that was previously answerable is a regression.
Output
After validation, produce a summary table:
| Metric | Before | After | Change |
|-------------------|--------|--------|--------|
| Description chars | ??? | ??? | -??% |
| Body lines | ??? | ??? | -??% |
| Body tokens (est.)| ??? | ??? | -??% |
| Duplication % | ??? | 0% | -??% |
| Regressions | n/a | 0 | |
Reference
For detailed checklists, before/after examples, and the full validation methodology, see optimization-playbook.md.
Maintenance Note
Body budget: ~120 lines (general target for optimized skills: ~250). The optimization workflow and decision rules are the core value and stay in the body; expanded examples, checklists, and the decision tree live in the playbook reference.