better-skill-review
Skill Review
Language
Match user's language: Respond in the same language the user uses.
Overview
Review an agent skill through three layers: automated linting (hard rules), contextual finding evaluation (agent judges with context), and structured semantic review (deep analysis against best practices). Optionally fix issues and verify fixes through independent subagent validation.
Workflow
Progress:
- Step 1: Identify the skill
- Step 2: Review (linting + profile + findings + semantic)
- Step 3: Present findings
- Step 4: Fix gate
- Step 5: Verify & iterate
Step 1: Identify the Skill
Accept the skill location as a path to a directory containing SKILL.md. If no path is given, auto-detect: use the current working directory when it contains a SKILL.md.
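The detection rule above can be sketched in a few lines. This is a minimal illustration, not part of the skill's scripts; the function name `find_skill_dir` is hypothetical.

```python
from pathlib import Path
from typing import Optional


def find_skill_dir(path: Optional[str] = None) -> Optional[Path]:
    """Return the directory that contains SKILL.md, or None if there is none.

    Hypothetical helper: with no explicit path, fall back to the
    current working directory, as Step 1 describes.
    """
    candidate = Path(path) if path else Path.cwd()
    if (candidate / "SKILL.md").is_file():
        return candidate.resolve()
    return None
```

Returning `None` rather than raising leaves it to the caller to ask the user for a path when auto-detection fails.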
Step 2: Review
Run both tools, then evaluate.
2a. Automated linting:
python3 {SKILL_DIR}/scripts/validate.py run --path <skill-path>
Output: checks (hard-rule verdicts → linter grade) + findings (soft detections needing judgment).
2b. Profile extraction:
bash {SKILL_DIR}/scripts/analyze.sh analyze <skill-path>
Note skill level (l0/l0plus/l1) and feature flags.
2c. Contextual findings review:
Judge each finding using its context_hint. See references/semantic_dimensions.md § Finding Judgment Table for decision criteria. Promote real issues to warnings; dismiss the rest with a brief note.
2d. Semantic review:
Read the skill's SKILL.md fully, then score each dimension (0-3). Read references/semantic_dimensions.md for the full checklist per dimension:
| Dimension | What it evaluates |
|---|---|
| 5.1 Description Quality | Trigger phrases, length, voice, specificity |
| 5.2 Workflow Design | Steps, decision points, SKILL.md length |
| 5.3 Runtime Robustness | Preflight, degradation, troubleshooting (L0+/L1 only) |
| 5.4 Script Quality | JSON output, error handling, exit codes (scripts only) |
| 5.5 UX Practices | Language, checklist, completion report (applicability matrix) |
| 5.6 Setup Flow Integrity | Bootstrap safety, live validation, credential security |
Step 3: Present Findings
Format the report:
[Skill Review] <skill-name>
═══ Linter ═══
Grade: <letter> (<pass>/<total> passed, <warn> warnings, <fail> failures)
Failures:
✗ <check_id>: <message> → Fix: <fix>
Warnings:
⚠ <check_id>: <message>
Findings (agent-reviewed):
✓ <finding_id>: dismissed — <reason>
⚠ <finding_id>: promoted to warning — <reason>
═══ Semantic Review ═══
5.1 Description Quality: <score>/3 <one-line assessment>
5.2 Workflow Design: <score>/3 <one-line assessment>
5.3 Runtime Robustness: <score>/3 <one-line assessment>
5.4 Script Quality: <score>/3 <one-line assessment>
5.5 UX Practices: <score>/3 <one-line assessment>
5.6 Setup Flow Integrity: <score>/3 <one-line assessment>
─────────
Semantic Score: <total>/18
═══ Improvement Suggestions ═══
For each dimension scoring < 3, provide:
1. What to change and why
2. Which file to edit
3. A concrete before/after example or specific instruction
4. Priority: High (functionality/UX) / Medium (convention) / Low (polish)
If linter grade is A and semantic score ≥ 15: congratulate and suggest publishing with better-skill-publish.
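The publishing threshold can be expressed as a small check. The sketch below assumes all six dimensions receive a 0-3 score; how non-applicable dimensions (for example 5.3 on an L0 skill) are scored is not specified here, so that case is left out. The names `semantic_total` and `ready_to_publish` are hypothetical.

```python
from typing import Dict

MAX_PER_DIMENSION = 3   # each dimension is scored 0-3
PUBLISH_THRESHOLD = 15  # out of 18, per the rule above


def semantic_total(scores: Dict[str, int]) -> int:
    """Sum the per-dimension scores, rejecting values outside 0-3."""
    for name, score in scores.items():
        if not 0 <= score <= MAX_PER_DIMENSION:
            raise ValueError(f"{name}: score {score} is outside 0-{MAX_PER_DIMENSION}")
    return sum(scores.values())


def ready_to_publish(linter_grade: str, scores: Dict[str, int]) -> bool:
    """True only when the linter grade is A and the semantic score is at least 15."""
    return linter_grade == "A" and semantic_total(scores) >= PUBLISH_THRESHOLD
```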
Step 4: Fix Gate
Ask the user:
- Fix all — Apply all suggested changes, then auto-verify (Step 5)
- Pick and choose — Select specific items, then auto-verify (Step 5)
- None — End here, use the report as reference
After fixing, do NOT self-evaluate. Proceed directly to Step 5.
Step 5: Verify & Iterate
CRITICAL: Verification must be independent. The agent that fixed cannot judge its own work.
5a. Dispatch verification subagent:
Agent(subagent_type: "general-purpose", description: "Verify skill review", prompt: <template>)
Use the prompt template from references/semantic_dimensions.md § Verification Subagent Prompt Template. The verification subagent must:
- Run validate.py and analyze.sh fresh (never trust cached results)
- Read references/semantic_dimensions.md for scoring criteria
- Score all 6 dimensions independently, citing file:line evidence
- Return linter grade + per-dimension scores + specific FAIL items
5b. Process verification result:
- All pass → Report success, done
- Has failures → Fix ONLY the specific FAIL items from the verification report, then dispatch a NEW subagent to verify again
- Max 3 rounds — if issues persist after 3 fix-verify cycles, report remaining issues and let the user decide
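The control flow of 5b can be sketched as a bounded loop. This is an illustration under assumptions: `verify` and `fix` are hypothetical callables standing in for "dispatch a new verification subagent" and "fix the reported FAIL items", and the Step 4 fix plus the first verification is counted as round one.

```python
from typing import Callable, List

MAX_ROUNDS = 3  # "Max 3 rounds" from the rule above


def fix_verify_loop(
    verify: Callable[[], List[str]],
    fix: Callable[[List[str]], None],
    max_rounds: int = MAX_ROUNDS,
) -> List[str]:
    """Return the FAIL items still open after at most max_rounds verifications.

    An empty list means every check passed. Each verify() call is
    assumed to start a fresh subagent, never a reused one.
    """
    failures: List[str] = []
    for round_number in range(1, max_rounds + 1):
        failures = verify()
        if not failures:
            return []
        # Fix only what the report flagged, and only if another
        # verification round is still allowed.
        if round_number < max_rounds:
            fix(failures)
    return failures
```

A non-empty return value is what gets reported to the user for a decision.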
Anti-patterns (never do these):
| Anti-pattern | Why it's wrong | Correct approach |
|---|---|---|
| Self-evaluate after fixing | Blind to own mistakes | Always dispatch subagent |
| Reuse same subagent for re-verify | Already biased by prior assessment | New subagent each round |
| Fix without running tools first | Subjective judgment misses issues | Tools produce objective data |
| Ignore verification report | Discards independent evidence | Fix only reported FAIL items |
Linter Reference
Grading (hard-rule checks only):
- A — All pass, zero warnings
- B — All pass, some warnings
- C — 1–2 failures
- D — 3+ failures
- F — SKILL.md missing or no valid frontmatter
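The grading scale above maps directly onto a small function. This is a sketch of the rules as listed, not the code in validate.py; the name `linter_grade` and its parameters are hypothetical.

```python
def linter_grade(failures: int, warnings: int, skill_md_valid: bool = True) -> str:
    """Map hard-rule check counts to a letter grade.

    skill_md_valid is False when SKILL.md is missing or has no valid
    frontmatter, which yields F regardless of the counts.
    """
    if not skill_md_valid:
        return "F"
    if failures >= 3:
        return "D"
    if failures >= 1:
        return "C"
    # No failures: warnings decide between A and B.
    return "B" if warnings > 0 else "A"
```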
Check categories:
| Category | Checks |
|---|---|
| structure | SKILL.md exists, frontmatter, required fields, directory layout |
| naming | Kebab-case, length, no consecutive hyphens, matches directory |
| content | Description length, body length, heading structure |
| paths | Referenced files exist, scripts executable |
| security | No secrets, no template placeholders |
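As one illustration of a check category, the naming rules in the table could be approximated as below. This is a sketch only: the real checks live in validate.py, the function name `naming_problems` is hypothetical, and the 64-character limit is an assumption because the table does not state the maximum length.

```python
import re
from pathlib import Path
from typing import List

# Lowercase alphanumeric words joined by single hyphens.
KEBAB_CASE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")
MAX_NAME_LENGTH = 64  # assumed limit; not stated in the table above


def naming_problems(name: str, directory: str) -> List[str]:
    """Return a list of naming problems; an empty list means the name passes."""
    problems: List[str] = []
    if not KEBAB_CASE.fullmatch(name):
        problems.append("not kebab-case")
    if "--" in name:
        problems.append("consecutive hyphens")
    if len(name) > MAX_NAME_LENGTH:
        problems.append("too long")
    if Path(directory).name != name:
        problems.append("does not match directory")
    return problems
```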
References
- references/semantic_dimensions.md: Full checklist for each review dimension + finding judgment table + verification prompt template
- references/validation_rules.md: Rationale for each linter check
- references/improvement_patterns.md: Knowledge base of improvement patterns with examples
- references/best_practices.md: Skill design conventions and quick reference