Skill Optimizer

Reduce skill token cost without losing coverage. Every token in the SKILL.md body is paid in every conversation, while references/ files are loaded only on demand.

Optimization Workflow

Phase 1: Analyze

Measure the current skill before changing anything.

Count SKILL.md body lines (exclude frontmatter) and estimate tokens (~4.5 tokens/line for mixed code/prose).
Count description characters.
List every references/ file with line counts.
Identify duplication: for each body section (at any heading level), check whether the same concept or procedure is also covered in a reference file. Sum the lines of those duplicated sections and divide by total body lines to get the duplication percentage.
List nouns from the description that appear verbatim in a sibling skill's description — these need domain-qualifying in Phase 2.
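
The last two steps above reduce to simple arithmetic and a set intersection. The sketch below is illustrative only: the function names, section names, and stopword list are hypothetical, deciding which sections count as duplicated is still a manual judgment, and matching shared words is only a rough stand-in for identifying nouns.

```python
import re

# Small illustrative stopword list; extend as needed (assumption, not from the skill).
STOPWORDS = {"a", "an", "and", "for", "in", "of", "or", "the", "to", "with"}


def duplication_percent(section_lines: dict[str, int], duplicated: set[str]) -> float:
    """Percentage of body lines sitting in sections also covered by a reference file."""
    total = sum(section_lines.values())
    if total == 0:
        return 0.0
    overlap = sum(n for name, n in section_lines.items() if name in duplicated)
    return round(100 * overlap / total, 1)


def shared_terms(description: str, sibling_description: str) -> list[str]:
    """Words appearing verbatim in both descriptions (an approximation of shared nouns)."""
    def words(text: str) -> set[str]:
        return set(re.findall(r"[a-z][a-z0-9-]+", text.lower()))

    return sorted((words(description) & words(sibling_description)) - STOPWORDS)


# Hypothetical section names and line counts, for illustration only.
sections = {"Setup": 40, "API reference": 120, "Troubleshooting": 40}
print(duplication_percent(sections, {"API reference"}))  # 60.0
print(shared_terms("Redis integration and caching for MyLib",
                   "Redis integration for OtherLib"))  # ['integration', 'redis']
```

Terms returned by `shared_terms` are candidates for domain-qualifying, not a final list; a human or agent still has to decide which ones are genuinely ambiguous.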

If the skill has no references/ directory, optimization may require creating reference files first. See the playbook for guidance on this edge case.

Output a table:

| Metric              | Current |
|---------------------|---------|
| Description chars   | ???     |
| Body lines          | ???     |
| Body tokens (est.)  | ???     |
| Duplication %       | ???     |
| Reference files     | ???     |
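
The raw counts for this table can be gathered with a short script. This is a minimal sketch under several assumptions: the frontmatter is delimited by `---` lines, the description is a single-line `description:` field (a multi-line YAML value would need a real YAML parser), blank lines count as body lines, and reference files are `.md` files directly under references/. The function names are hypothetical.

```python
from pathlib import Path

TOKENS_PER_LINE = 4.5  # the per-line estimate quoted in Phase 1


def split_frontmatter(text: str) -> tuple[list[str], list[str]]:
    """Return (frontmatter_lines, body_lines), assuming '---' delimiters."""
    lines = text.splitlines()
    if lines and lines[0].strip() == "---":
        for i in range(1, len(lines)):
            if lines[i].strip() == "---":
                return lines[1:i], lines[i + 1:]
    return [], lines  # no frontmatter found: treat everything as body


def measure(text: str) -> dict[str, int]:
    """Description length, body line count, and estimated body tokens."""
    frontmatter, body = split_frontmatter(text)
    description = ""
    for line in frontmatter:
        if line.startswith("description:"):
            description = line.split(":", 1)[1].strip().strip("\"'")
    return {
        "description_chars": len(description),
        "body_lines": len(body),
        "body_tokens_est": round(len(body) * TOKENS_PER_LINE),
    }


def reference_line_counts(skill_dir: Path) -> dict[str, int]:
    """Line count per reference file; empty if there is no references/ directory."""
    refs = skill_dir / "references"
    if not refs.is_dir():
        return {}
    return {
        p.name: len(p.read_text(encoding="utf-8").splitlines())
        for p in sorted(refs.glob("*.md"))
    }
```

A typical call would be `measure(Path("SKILL.md").read_text(encoding="utf-8"))` followed by `reference_line_counts(Path("."))`, run from the skill's directory.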

Phase 2: Plan

Decide what stays in the body, what moves to references, and what gets compressed.

Body retention criteria — keep a section in the body ONLY if it meets at least one:

Complex multi-step pattern requiring coordination across multiple sections or files
Non-obvious logic, parameters, or decision rules that agents frequently get wrong without inline guidance
A concept unique to this skill with no external documentation
Primary use case the skill exists for (the thing agents reach for most often)

Everything else belongs in the appropriate references/ file. See the playbook decision tree for concrete examples of what typically stays vs. moves.

Description compression rules:

Lead with the package/tool name and a one-line identity
Replace enumerations of 4+ specific names (APIs, checks, steps) with category-based phrasing (e.g., "hooks for auth, sessions, tokens" instead of listing each hook name)
Qualify generic keywords with the skill's domain to reduce false positives (e.g., "MyLib integrations with Redis" not "Redis integration")
Merge items that share a theme into a single line (e.g., "error handling" + "retry logic" → "Error handling and retry logic in MyLib")
Verify every original trigger category maps 1:1 to the compressed version — no categories dropped
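
The final verification rule can be partly automated once each original trigger category has been paired by hand with the phrase that now covers it. The sketch below assumes such a hand-written mapping; `check_description` and the sample categories are hypothetical, and a substring match only confirms that the phrase is present, not that it still triggers as well as the original.

```python
def check_description(
    compressed: str,
    category_phrases: dict[str, str],
    max_chars: int = 700,
) -> list[str]:
    """Return a list of problems; an empty list means the description passes."""
    problems = []
    lowered = compressed.lower()
    # Every original category must be represented by its phrase in the new text.
    for category, phrase in category_phrases.items():
        if phrase.lower() not in lowered:
            problems.append(f"dropped category: {category}")
    # The skill's target is a description under roughly 700 characters.
    if len(compressed) >= max_chars:
        problems.append(
            f"description is {len(compressed)} chars, target is under {max_chars}"
        )
    return problems


# Hypothetical mapping from original trigger category to its compressed phrase.
original_categories = {
    "auth hooks": "hooks for auth",
    "retry logic": "error handling and retry logic",
}
compressed = (
    "MyLib: hooks for auth, sessions, tokens. "
    "Error handling and retry logic in MyLib."
)
print(check_description(compressed, original_categories))  # []
```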

Plan the Reference Guide section — for each reference file, write a one-line description of when to read it. This section is load-bearing: it tells agents which file to consult.

Target metrics:

Body: under ~250 lines
Description: under ~700 characters
Duplication with references: 0%

Phase 3: Execute

Apply the plan. Work in this order:

Compress the description — rewrite the YAML description field. Keep all trigger categories; do not remove any "when to use" signals.
Remove duplicate sections from body — delete sections already covered in references.
Add the Reference Guide section — add explicit pointers to each reference file with descriptions. See the playbook for the recommended format.
Add a Maintenance Note — add a note at the bottom of the body with: (a) the body-line budget (~250 lines), (b) a pointer to the ADR if one exists, and (c) a one-sentence rationale for the split. See the playbook template.
Bump version — increment metadata.version minor if the skill uses versioning.

Do NOT:

Move "when to use" triggers from description to body (description is the only field read for triggering)
Remove code examples from retained body sections (they are the value)
Create new reference files just to move content — use existing files when possible
Add content that duplicates what is already in references

Phase 4: Validate

Use the Task tool to spawn a subagent (opus model) to challenge coverage. Provide it with:

The full SKILL.md (body + frontmatter)
All reference files
A list of 15-25 questions the skill must answer (provided by the user, or derived from trigger categories — see the playbook for derivation rules)

The subagent evaluates each question:

From SKILL.md alone: YES / PARTIAL / NO
From SKILL.md + references: YES / PARTIAL / NO
Gap: content missing from ALL files

Pass criteria:

0 regressions (every question that was answerable before optimization is still answerable after)
All trigger categories in the description still present
Body under ~250 lines

If gaps are found, determine whether they are pre-existing (never covered) or regressions (lost during optimization). Only regressions require fixes — restore or rewrite the missing content in the body or appropriate reference file, then re-evaluate only the affected questions.

Fallback: If subagent spawning is unavailable, self-evaluate: for each question, attempt to answer it using only the optimized files and rate confidence as HIGH / MEDIUM / LOW. Any LOW-confidence answer on a question that was previously answerable is a regression.
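
Separating regressions from pre-existing gaps is a comparison of per-question ratings before and after optimization. The sketch below assumes ratings are recorded as YES / PARTIAL / NO for "SKILL.md + references", and it treats any drop in rating (including YES to PARTIAL) as a regression, which is a stricter reading than the pass criteria require. The names `RANK` and `classify_gaps` are hypothetical.

```python
RANK = {"NO": 0, "PARTIAL": 1, "YES": 2}


def classify_gaps(before: dict[str, str], after: dict[str, str]) -> dict[str, list[str]]:
    """Split questions into regressions and gaps that existed before optimization."""
    result: dict[str, list[str]] = {"regressions": [], "pre_existing_gaps": []}
    for question, old in before.items():
        new = after.get(question, "NO")  # a question with no rating counts as unanswered
        if RANK[new] < RANK[old]:
            result["regressions"].append(question)
        elif new == "NO":
            result["pre_existing_gaps"].append(question)
    return result


# Hypothetical ratings, for illustration only.
before = {"q1": "YES", "q2": "NO", "q3": "PARTIAL"}
after = {"q1": "PARTIAL", "q2": "NO", "q3": "PARTIAL"}
print(classify_gaps(before, after))
# {'regressions': ['q1'], 'pre_existing_gaps': ['q2']}
```

Only the questions listed under `regressions` need a fix and a re-evaluation; the others were never covered and fall outside the optimization.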

Output

After validation, produce a summary table:

| Metric            | Before | After  | Change |
|-------------------|--------|--------|--------|
| Description chars | ???    | ???    | -??%   |
| Body lines        | ???    | ???    | -??%   |
| Body tokens (est.)| ???    | ???    | -??%   |
| Duplication %     | ???    | 0%     | -??%   |
| Regressions       | n/a    | 0      |        |
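
The Change column is the relative difference between the Before and After values. A minimal sketch of that arithmetic follows; the function names are hypothetical and the numbers are placeholders, not measured results.

```python
def percent_change(before: float, after: float) -> str:
    """Signed relative change, rounded to a whole percent."""
    if before == 0:
        return "n/a"  # avoid dividing by zero when there was nothing to reduce
    return f"{round(100 * (after - before) / before):+d}%"


def summary_rows(before: dict[str, float], after: dict[str, float]) -> list[str]:
    """Render the comparison as pipe-table rows."""
    rows = ["| Metric | Before | After | Change |", "|---|---|---|---|"]
    for metric, old in before.items():
        new = after[metric]
        rows.append(f"| {metric} | {old} | {new} | {percent_change(old, new)} |")
    return rows


# Placeholder values, for illustration only.
before = {"Description chars": 1000, "Body lines": 400}
after = {"Description chars": 650, "Body lines": 180}
print("\n".join(summary_rows(before, after)))
```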

Reference

For detailed checklists, before/after examples, and the full validation methodology, see optimization-playbook.md.

Maintenance Note

Body budget: ~120 lines (general target for optimized skills: ~250). The optimization workflow and decision rules are the core value and stay in the body; expanded examples, checklists, and the decision tree live in the playbook reference.

workleap-skill-optimizer