skill-creator
This skill uses Claude hooks which can execute code automatically in response to events. Review carefully before installing.
Skill Creator
Create, improve, and audit AI agent skills. Every skill follows 14 proven structural patterns.
Scope: Skills only. NOT for creating agents (wagents new agent), building MCP servers (/mcp-creator), or running existing skills. This repo uses raw SKILL.md format committed directly to skills/.
Dispatch
| $ARGUMENTS | Action | Example |
|---|---|---|
| `create <name>` / `new <name>` | Develop (new) | /skill-creator create my-analyzer |
| `create <name> --from <source>` | Develop (new, from exemplar) | /skill-creator create my-analyzer --from wargame |
| `improve <name>` / `improve <path>` | Develop (existing) | /skill-creator improve add-badges |
| `plan <name>` / `plan <path>` | Plan (existing) | /skill-creator plan honest-review |
| `plan --all` / `plan repo` | Plan (repo-wide) | /skill-creator plan --all |
| `audit <name>` | Audit | /skill-creator audit honest-review |
| `audit --all` | Audit All | /skill-creator audit --all |
| `dashboard` | Dashboard | /skill-creator dashboard |
| `package <name>` / `package --all` | Package | /skill-creator package wargame |
| Natural language skill idea | Auto: Develop (new) | "tool that audits Python type safety" |
| Skill name + modification verb | Auto: Develop (existing) | "refactor the wargame skill" |
| Path to SKILL.md | Auto: Develop (existing) | skills/wargame/SKILL.md |
| "MCP server" / "agent" / "run" | Refuse + redirect | — |
| Empty | Gallery | /skill-creator |
Auto-Detection Heuristic
If no explicit mode keyword is provided:
- Path ending in `SKILL.md` or directory under `skills/` → Develop (existing)
- Existing skill name + modification verb (improve, refactor, enhance, update, fix, rewrite, optimize, polish, revise, change) → Develop (existing)
- `--from <source>` in arguments → Develop (new, from exemplar)
- New capability description ("I want to build...", "tool that...", "skill for...") → Develop (new); derive name, confirm before scaffolding
- "MCP server", "agent", "run" → refuse gracefully and redirect
- Ambiguous → ask the user which mode they want
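The heuristic above can be sketched as a small routing function. This is an illustrative sketch, not the skill's actual implementation: the function name `detect_mode`, the `existing_skills` parameter, and the return labels `"Refuse + redirect"` and `"Ask user"` are assumptions, and so is the choice to check the out-of-scope keywords before the new-capability phrases.

```python
import re

# Verbs and trigger phrases are copied from the heuristic above.
MODIFICATION_VERBS = {
    "improve", "refactor", "enhance", "update", "fix",
    "rewrite", "optimize", "polish", "revise", "change",
}
NEW_SKILL_PHRASES = ("i want to build", "tool that", "skill for")
OUT_OF_SCOPE = re.compile(r"\b(mcp server|agent|run)\b")


def detect_mode(arguments: str, existing_skills: set[str]) -> str:
    """Route arguments that carry no explicit mode keyword (hypothetical helper)."""
    text = arguments.strip()
    lowered = text.lower()
    if not text:
        return "Gallery"  # empty arguments, per the dispatch table
    if text.endswith("SKILL.md") or text.startswith("skills/"):
        return "Develop (existing)"
    words = set(re.findall(r"[a-z0-9][a-z0-9-]*", lowered))
    if words & existing_skills and words & MODIFICATION_VERBS:
        return "Develop (existing)"
    if "--from" in text.split():
        return "Develop (new, from exemplar)"
    # Assumption: out-of-scope requests are refused before the
    # new-capability phrases are considered.
    if OUT_OF_SCOPE.search(lowered):
        return "Refuse + redirect"
    if any(phrase in lowered for phrase in NEW_SKILL_PHRASES):
        return "Develop (new)"
    return "Ask user"  # ambiguous input


print(detect_mode("refactor the wargame skill", {"wargame"}))
```

A real router would also need the list of skill names on disk; here it is passed in so the function stays pure and easy to test.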
Quick Start
wagents new skill <name> # Scaffold from template
wagents validate # Check all skills
wagents eval validate # Check eval manifests after eval changes
uv run python skills/skill-creator/scripts/audit.py skills/<name>/ # Score quality
wagents package <name> --dry-run # Check single-skill portability before packaging
Skill Development
Unified process for creating new skills and improving existing ones. Load references/workflow.md for the full procedure.
| Step | New Skill | Existing Skill |
|---|---|---|
| 1. Understand | Define use cases, scope, patterns | Audit + understand user's intent |
| 2. Plan | Structure, description, frontmatter | Gap analysis + improvement plan (approval gate) |
| 3. Scaffold | `wagents new skill <name>` | Skip |
| 4. Build | Write/edit body, references, scripts, templates, evals | Same |
| 5. Validate | `wagents validate` + `wagents eval validate` + `audit.py` | Same |
| 6. Iterate | Test, identify issues, loop to Step 4 | Same |
Repo-Wide / Multi-Skill Planning
Use `plan <name>` for an existing-skill refinement plan without editing, and `plan --all` or `plan repo` for a ranked repo-wide planning pass.
Required planning output:
- baseline audit summary
- highest-value findings
- explicit file targets
- expected score impact
- approval gate before any edits
For repo-wide planning, produce a ranked queue plus one standalone refinement plan per promoted skill or skill cluster. Do not edit any skill until the user approves the plan.
Load references/refinement-plan.md when producing the standalone refinement-plan packet.
Audit
Score a skill using deterministic analysis + AI review. Load references/audit-guide.md.
Audit All
Comparative ranking of all repository skills. Load references/audit-guide.md § Audit All.
Dashboard
Render visual creation process monitor or audit quality dashboard. Load references/audit-guide.md § Dashboard.
Auto-detects mode from data: phases field → process monitor; skills array → audit overview.
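As a rough illustration of that data-driven detection, the check could look like the sketch below. The field names `phases` and `skills` come from the text; the function name, the returned labels, and the precedence when both fields are present are assumptions.

```python
def dashboard_mode(data: dict) -> str:
    """Pick the dashboard view from the shape of the input data (hypothetical helper)."""
    if "phases" in data:
        return "process monitor"
    if isinstance(data.get("skills"), list):
        return "audit overview"
    raise ValueError("expected a 'phases' field or a 'skills' array")


print(dashboard_mode({"skills": [{"name": "wargame", "score": 91}]}))
```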
Gallery (Empty Arguments)
Present skill inventory with scores and available actions.
Run `uv run python skills/skill-creator/scripts/audit.py --all --format table`, display the results, then offer the mode menu.
Package
Package skills into portable ZIP files for Claude Code Desktop import. Load references/packaging-guide.md for ZIP structure, manifest schema, portability checks, and cross-agent compatibility.
wagents package <name> --dry-run # Check a single skill before emitting a ZIP
wagents package <name> # Single skill → <name>-v<version>.skill.zip
wagents package --all # All skills → dist/ with manifest.json
wagents package --all --dry-run # Check portability without creating ZIPs
Hooks
PreToolUse hooks intercept tool calls during skill execution. The hooks: frontmatter field scopes hooks to this skill only — they activate when the skill is loaded and deactivate when it completes.
Post-edit enforcement for this skill:
- `SKILL.md` edits trigger `uv run wagents validate`
- `evals/*.json` edits trigger `uv run wagents eval validate`
- hook-bearing skill/settings edits trigger `uv run wagents hooks validate`
- failures surface to the agent instead of being swallowed
Stop hook enforcement:
- runs `uv run python skills/skill-creator/scripts/verify.py stop`
- validates dirty skill-definition, eval, and hook surfaces before exit
- exits immediately when hook input has `stop_hook_active: true` to avoid recursive Stop-hook loops
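The `stop_hook_active` guard can be sketched as follows. This is not the contents of `verify.py`, which the document does not show. It assumes the hook input arrives as a JSON object on stdin, and the helper name `should_skip_stop_hook` is hypothetical. Treating malformed input as "not active" is also an assumption.

```python
import json


def should_skip_stop_hook(raw_input: str) -> bool:
    """Return True when the hook input says a Stop hook is already running."""
    try:
        payload = json.loads(raw_input or "{}")
    except json.JSONDecodeError:
        # Assumption: unreadable input does not suppress validation.
        return False
    return isinstance(payload, dict) and payload.get("stop_hook_active") is True


# In a hook script the guard would sit at the very top, before any validation:
#     import sys
#     if should_skip_stop_hook(sys.stdin.read()):
#         sys.exit(0)
print(should_skip_stop_hook('{"stop_hook_active": true}'))
```

Placing the guard before any other work matters: if validation itself triggers another Stop event, the second invocation exits without re-running validation.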
State Management
Creation progress persists at ~/.{gemini|copilot|codex|claude}/skill-progress/<name>.json. Read/write via scripts/progress.py. Survives session restarts. Use --state-dir to override the default location.
Reference File Index
| File | Content | Read When |
|---|---|---|
| `references/workflow.md` | Unified 6-step skill development process for new and existing skills | Develop (new), Develop (existing) |
| `references/refinement-plan.md` | Standalone refinement-plan contract for existing-skill and repo-wide planning output | Plan (existing), Plan (repo-wide) |
| `references/audit-guide.md` | Audit procedure, Audit All, Dashboard rendering, Gallery, grade thresholds | Audit, Audit All, Dashboard, Gallery |
| `references/proven-patterns.md` | 14 structural patterns with examples from repo skills | Step 4 (Build), gap analysis |
| `references/best-practices.md` | Anthropic guide + superpowers methodology + cross-agent awareness | Step 2 (Plan), Step 4 (Build), description writing |
| `references/frontmatter-spec.md` | Full field catalog, invocation matrix, decision tree | Step 3 (Scaffold), frontmatter configuration |
| `references/packaging-guide.md` | ZIP structure, manifest schema, portability checks, import instructions | Package |
| `references/evaluation-rubric.md` | 13 weighted scoring dimensions normalized to 100, grade thresholds, pressure testing | Audit (pressure testing), scoring targets |
Read reference files as indicated by the "Read When" column above. Do not rely on memory or prior knowledge of their contents.
Core Principles
Conciseness is respect — The context window is shared. Every line competes with the agent's working memory. Earn every line or delete it.
Progressive disclosure — Frontmatter for discovery (~100 tokens), body for dispatch (<5K tokens), references for deep knowledge (on demand), scripts/templates for execution (never loaded).
Self-exemplar — This skill follows every pattern it teaches. When in doubt, look at how skill-creator applies it.
Critical Rules
- Run `uv run wagents validate` before declaring any skill complete
- Run `uv run wagents eval validate` after changing evals and before declaring the skill complete
- Run `uv run python skills/skill-creator/scripts/audit.py` after every significant SKILL.md change
- Never create a skill without a dispatch table; it is the routing contract
- Never create a dispatch table without an empty-args handler; unrouted input is a bug
- Every reference file must appear in the Reference File Index; orphan refs are invisible
- Every indexed reference must exist on disk; phantom refs cause agent errors
- Body must stay under 500 lines (below frontmatter); move detail to references
- Description must include "Use when" trigger phrases AND "NOT for" exclusions
- Names must be kebab-case, 2-64 chars, no consecutive hyphens, no reserved words
- Scripts use argparse + JSON to stdout; no custom output formats
- Templates are self-contained HTML with no external dependencies
- Do NOT call `wagents docs generate`; delegate to docs-steward
- Do NOT create agents or MCP servers; refuse gracefully and redirect
- Improving existing skills requires presenting an improvement plan and getting user approval before implementing changes
- Audit mode is read-only; never modify the skill being audited
- Update evals when dispatch behavior or modes change; stale evals are invisible bugs
- `plan <name>` and `plan --all` are read-only planning modes; never edit during planning
- Repo-wide or multi-skill requests require a ranked plan and standalone refinement-plan output before any implementation begins
- Stop hooks must include a `stop_hook_active` guard; recursive hook loops are implementation bugs
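Two of the rules above, the naming constraints and the "argparse + JSON to stdout" script convention, can be illustrated together. The sketch below is a hypothetical script, not one shipped with the skill. The source does not list the reserved words, so they are passed in through an assumed `--reserved` flag. Matching reserved words against hyphen-separated segments is also an assumption.

```python
import argparse
import json
import re

KEBAB = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def validate_name(name: str, reserved=frozenset()) -> list[str]:
    """Return a list of rule violations; an empty list means the name is valid."""
    errors = []
    if not 2 <= len(name) <= 64:
        errors.append("length must be 2-64 characters")
    if "--" in name:
        errors.append("consecutive hyphens are not allowed")
    if not KEBAB.fullmatch(name):
        errors.append("must be kebab-case (lowercase letters, digits, single hyphens)")
    hits = sorted(set(name.split("-")) & set(reserved))
    if hits:
        errors.append(f"contains reserved word(s): {', '.join(hits)}")
    return errors


def main(argv=None) -> int:
    parser = argparse.ArgumentParser(description="Validate a skill name (sketch).")
    parser.add_argument("name")
    parser.add_argument("--reserved", nargs="*", default=[],
                        help="reserved words; the real list is not given in the source")
    args = parser.parse_args(argv)
    errors = validate_name(args.name, args.reserved)
    # Single JSON document on stdout, no custom output format.
    print(json.dumps({"name": args.name, "valid": not errors, "errors": errors}))
    return 0 if not errors else 1


main(["my-analyzer"])
```

Emitting one JSON object keeps the output machine-readable for the agent, and the exit code lets hooks treat a failed check as an error.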
Canonical terms (use these exactly throughout):
- Modes: "Develop (new)", "Develop (existing)", "Audit", "Audit All", "Dashboard", "Package", "Gallery"
- Steps (Development): "Understand", "Plan", "Scaffold", "Build", "Validate", "Iterate"
- Grade scale: "A" (90-100), "B" (75-89), "C" (60-74), "D" (40-59), "F" (<40)
- Patterns: "dispatch-table", "reference-file-index", "critical-rules", "canonical-vocabulary", "scope-boundaries", "classification-gating", "scaling-strategy", "state-management", "scripts", "templates", "hooks", "progressive-disclosure", "body-substitutions", "stop-hooks"
- Audit dimensions: "frontmatter", "description", "dispatch-table", "body-structure", "pattern-coverage", "reference-quality", "critical-rules", "script-quality", "portability", "conciseness", "canonical-vocabulary", "evaluation-coverage", "validation-contract"
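The grade scale above maps directly onto a lookup by lower bound. The sketch below also shows one plausible reading of "weighted dimensions normalized to 100", namely a weighted mean of per-dimension scores that each range from 0 to 100. The actual weights and formula live in `references/evaluation-rubric.md` and are not reproduced here, so `weighted_score` and any weights passed to it are assumptions.

```python
GRADE_FLOORS = ((90, "A"), (75, "B"), (60, "C"), (40, "D"))


def grade(score: float) -> str:
    """Map a 0-100 score to the canonical letter grade."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    for floor, letter in GRADE_FLOORS:
        if score >= floor:
            return letter
    return "F"


def weighted_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted mean of per-dimension scores, so the total stays on a 0-100 scale."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Hypothetical weights, for illustration only.
total = weighted_score({"frontmatter": 100, "description": 50},
                       {"frontmatter": 3, "description": 1})
print(total, grade(total))
```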