General Skill Refiner

Overview

Purpose: Critical analysis and improvement of agent skills

Approach:

Ruthless critique - identify all issues without sugar-coating
Clear priorities - MUST/SHOULD/NICE to have classification
Concrete solutions - specific fixes, not just complaints
User feedback loop - user decides what to fix
Verify changes - ensure quality after refactoring

Output: Improved skill + change report in .tasks/skill-refactoring-[skill-name]-[date]/

Guidelines

What Makes a Good Skill

Based on Agent Skills Complete Guide:

Description is king - Most important field for skill triggering
Progressive disclosure - SKILL.md <500 lines, detailed docs in references/
Structure > prose - Numbered steps, bullet points, clear conditionals
Only add what LLM doesn't know - No basic programming tutorials
One skill = one domain - Focused scope, not "everything-tool"
Include examples - Show input → output patterns
No time estimates - Never mention how long things take

Common Anti-Patterns to Look For

❌ The Encyclopedia - 5000+ line SKILL.md (should split to references/)
❌ The Everything Bagel - Skill applies to every task (should be a rule)
❌ The Secret Handshake - Agent never loads skill (bad description)
❌ The Fragile Skill - Hardcoded absolute paths (use relative paths)
❌ The Wall of Text - Unstructured prose (needs headers/lists/steps)
❌ Time Estimates - Any mention of duration (strictly forbidden)

Priority Levels

🔴 PRIORITY 1: MUST FIX (breaking issues)

Core guideline violations (time estimates, etc.)
Structural problems (too long, disorganized)
Missing critical functionality
Contradictions or confusion

🟡 PRIORITY 2: SHOULD FIX (quality issues)

Suboptimal workflow
Redundant content
Missing helpful features
Can be simplified

🟢 PRIORITY 3: NICE TO HAVE (enhancements)

Small improvements
Additional examples
Minor clarifications

Examples

Example 1: Identifying Time Estimates Problem

Skill analyzed: code-review

Problem found:

Line 45: "This analysis should take about 5-10 minutes"
Line 89: "Quick scan (2-3 minutes)"

Priority: 🔴 MUST FIX - Core guideline violation

Fix: Remove all time references:

Line 45: "This analysis covers security, performance, and style"
Line 89: "Quick scan of critical issues"
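
The check in this example can be partly automated. The following is a minimal sketch, not part of the skill itself: the regular expression and the function name `find_time_estimates` are illustrative assumptions. It only catches a number followed by a time unit, so vague phrasings such as "a few minutes" still need a manual read, and legitimate durations (for example a retention period) may be flagged by mistake.

```python
import re

# Assumed pattern: a number or range followed by a time unit, e.g. "5-10 minutes".
TIME_ESTIMATE = re.compile(
    r"\b\d+(?:\s*-\s*\d+)?\s*(?:second|minute|hour|day)s?\b",
    re.IGNORECASE,
)


def find_time_estimates(text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs for lines that look like time estimates."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if TIME_ESTIMATE.search(line):
            hits.append((number, line.strip()))
    return hits


sample = "\n".join([
    "This analysis should take about 5-10 minutes",
    "This analysis covers security, performance, and style",
    "Quick scan (2-3 minutes)",
])

# Report each hit with its line number so the fix can be specific.
for number, line in find_time_estimates(sample):
    print(f"Line {number}: {line}")
```

Reporting line numbers keeps the finding specific, which is what the MUST FIX entry needs.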

Example 2: Analysis Report Format

For skill: pdf-processor (1200 lines in SKILL.md)

Analysis presented to user:

## 🔴 MUST FIX

### 1. SKILL.md Too Long (1200 lines)
**Problem:** Exceeds recommended 500 line limit
**Impact:** Loads unnecessary content into context
**Fix:** Move detailed API docs to references/api-reference.md

### 2. Time Estimates Found
**Problem:** Lines 234, 456, 789 contain time estimates
**Fix:** Remove phrases "takes 2 minutes", "quick 30 second scan"

## 🟡 SHOULD FIX

### 3. Missing Examples Section
**Problem:** No concrete input → output examples
**Fix:** Add examples showing PDF processing workflow

## 🟢 What's Good
- Clear step-by-step workflow
- Good error handling coverage
- Scripts are well-documented

Example 3: Before/After Refactoring

Before:

## How to Use

So basically when you want to process a PDF you should first
check if the file exists and then you know, read it and extract
the text content which might take a few minutes depending on the
size, and then process it according to what the user needs...

Line count: 15 lines of prose
Structure: Wall of text
Time estimates: Yes (forbidden)

After:

## How to Use

**Process PDF files:**

1. Validate PDF exists
2. Extract text using `scripts/extract.py`
3. Parse output for required format
4. Return processed content

**For detailed extraction options, see [references/extraction-guide.md](references/extraction-guide.md)**

Line count: 11 lines
Structure: Clear numbered steps
Time estimates: None ✅

Workflow

Phase 1: Read & Understand

Goal: Thoroughly understand the skill and its structure.

Steps:

Read main SKILL.md:
- Understand the purpose and workflow
- Measure its length (wc -l)
- Identify the main phases
Read all references:
- Check what is in references/
- Measure the length of each file
- Understand how the references support the main skill
Compare with other skills:
- Compare SKILL.md length with other skills
- Check the total length of references/
- Identify pattern differences

Output: Full understanding of the skill and its context.
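
The length measurements in this phase can be collected in one pass. This is a rough sketch under stated assumptions: the helper names `count_lines` and `skill_metrics` are hypothetical, and it assumes references are Markdown files directly inside references/.

```python
from pathlib import Path


def count_lines(path: Path) -> int:
    """Count lines in a text file (matches `wc -l` when the file ends with a newline)."""
    with path.open(encoding="utf-8") as handle:
        return sum(1 for _ in handle)


def skill_metrics(skill_dir: Path) -> dict[str, int]:
    """Map SKILL.md and each references/*.md file to its line count."""
    metrics = {}
    main = skill_dir / "SKILL.md"
    if main.exists():
        metrics["SKILL.md"] = count_lines(main)
    refs = skill_dir / "references"
    if refs.is_dir():
        for ref in sorted(refs.glob("*.md")):
            metrics[f"references/{ref.name}"] = count_lines(ref)
    return metrics
```

Calling `skill_metrics(Path("path/to/skill"))` before and after refactoring gives the numbers needed for the before/after metrics in the final report.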

Phase 2: Critical Analysis

Goal: Ruthlessly identify all problems and group by priority

Analysis Checklist:

Use references/quality-criteria.md for complete criteria. Key checks:

Structure & Length:

SKILL.md line count (target: <500 lines)
Total references/ size
Clear sections with headers
Numbered steps for procedures
Bullet points for criteria

Content Quality:

No time estimates anywhere (NEVER allowed)
No "wall of text" sections (needs structure)
No redundant content between SKILL.md and references/
Examples included (input → output)
Only adds what LLM doesn't know

Description & Triggering:

Description contains specific keywords
Description explains WHEN to use
Triggers match how users talk about task
Not too broad ("everything-tool")

Workflow & Features:

Clear step-by-step workflow
No missing critical features
No contradictions or confusion
Proper references to supporting files

For Each Issue Found:

Identify: What is the specific problem? (with line numbers)
Classify: 🔴 MUST / 🟡 SHOULD / 🟢 NICE priority
Explain: Why is this a problem?
Propose: Concrete solution (not just "fix this")

Output: Complete prioritized list of issues with solutions
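
One way to keep the four fields (identify, classify, explain, propose) together for each issue is a small record type. The sketch below is illustrative only: the `Issue` and `Priority` names are assumptions, and the sample data is taken from the pdf-processor example earlier in this document.

```python
from dataclasses import dataclass
from enum import IntEnum


class Priority(IntEnum):
    MUST = 1
    SHOULD = 2
    NICE = 3


@dataclass
class Issue:
    problem: str        # Identify: what is wrong
    lines: list[int]    # where it occurs (may be empty for file-wide issues)
    priority: Priority  # Classify
    reason: str         # Explain: why it is a problem
    fix: str            # Propose: concrete solution


def prioritize(issues: list[Issue]) -> list[Issue]:
    """Order issues MUST first; equal priorities keep their original order."""
    return sorted(issues, key=lambda issue: issue.priority)


issues = [
    Issue("Missing examples section", [], Priority.SHOULD,
          "No concrete input → output examples",
          "Add examples showing PDF processing workflow"),
    Issue("SKILL.md too long (1200 lines)", [], Priority.MUST,
          "Loads unnecessary content into context",
          "Move detailed API docs to references/api-reference.md"),
    Issue("Time estimates found", [234, 456, 789], Priority.MUST,
          "Core guideline violation",
          "Remove the time phrases"),
]

for issue in prioritize(issues):
    print(f"{issue.priority.name}: {issue.problem} -> {issue.fix}")
```

Because `sorted` is stable, issues with the same priority stay in the order they were found.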

Phase 3: Present & Gather Feedback

Goal: Present the analysis to the user and gather feedback on what to fix.

Presentation format:

## 🔴 Main problems (MUST FIX)

### 1. **[Problem name]**
**Problem:** [clear description]
**Specifically:** [specific examples with line numbers]
**Fix:** [concrete solution]

### 2. **[Problem name]**
...

## 🟡 Medium problems (SHOULD FIX)
[...]

## 🟢 What's good
[List positive aspects - important for balance]

## 💡 Suggested fixes

**Priority 1 (MUST fix):**
1. [Fix 1]
2. [Fix 2]

**Priority 2 (SHOULD fix):**
[...]

**Priority 3 (NICE to have):**
[...]

Ask the user:

"Do you agree with this analysis?"
"Is there anything you disagree with?"
"Should I apply these fixes?"
"Is there anything specific you want to keep or change?"

Listen for:

What the user agrees to fix
What the user wants to keep (even if it is suboptimal)
Additional insights from the user

Output: A clear list of what to fix, with user approval.

Phase 4: Refactor

Goal: Systematically apply the fixes according to priorities and feedback.

Refactoring workflow:

1. Start with Priority 1 issues:

Fix one issue at a time
Verify each change
Don't break other things

2. Then Priority 2:

Continue systematically
Show progress

3. Priority 3 if time:

Only if user wants
Quick wins first

Refactoring patterns:

Use references/refactoring-patterns.md:

How to remove time estimates
How to shorten SKILL.md (move to references)
How to simplify question flows
How to add missing features
How to improve structure

Best practices:

Make atomic changes
Test that files are valid
Keep backups (git history is sufficient)
Verify line counts after changes

Track changes: Create log in .tasks/skill-refactoring-[skill-name]-[date]/changes.md:

What was changed
Why
Before/after metrics
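
A change log in this shape could be appended to as each fix lands. The following is a sketch under stated assumptions: the helper names `log_dir_for` and `log_change`, the ISO date format, and the entry layout are all illustrative choices, not something the skill prescribes.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def log_dir_for(skill_name: str, today: Optional[date] = None) -> Path:
    """Build .tasks/skill-refactoring-[skill-name]-[date]/ (ISO date is an assumption)."""
    today = today or date.today()
    return Path(".tasks") / f"skill-refactoring-{skill_name}-{today.isoformat()}"


def log_change(log_dir: Path, what: str, why: str, before: str, after: str) -> Path:
    """Append one entry to changes.md, creating the directory if needed."""
    log_dir.mkdir(parents=True, exist_ok=True)
    log_file = log_dir / "changes.md"
    entry = f"## {what}\n\n- Why: {why}\n- Before: {before}\n- After: {after}\n\n"
    with log_file.open("a", encoding="utf-8") as handle:
        handle.write(entry)
    return log_file
```

Appending one entry per atomic change keeps the log aligned with the "one issue at a time" rule above.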

Phase 5: Verify & Report

Goal: Verify that everything works and summarize the changes.

Verification checklist:

✅ Files are valid:

SKILL.md syntax OK
All references exist
No broken links

✅ Metrics improved:

SKILL.md shorter (if that was goal)
No time estimates
Better structure

✅ Quality checklist passed:

Run through quality-criteria.md
All MUST fixes done
SHOULD fixes addressed
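
The "all references exist" and "no broken links" checks can be approximated with a script. This is a minimal sketch, assuming links are written in standard Markdown `[text](target)` form and that relative targets resolve against the skill directory. The function name `broken_links` is hypothetical, and external URLs are skipped rather than fetched.

```python
import re
from pathlib import Path

# Matches the target of a Markdown inline link: [text](target)
LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
# Matches targets that start with a URL scheme such as https: or mailto:
SCHEME = re.compile(r"[a-z][a-z0-9+.-]*:", re.IGNORECASE)


def broken_links(skill_dir: Path) -> list[str]:
    """Return relative link targets in SKILL.md that do not exist on disk."""
    text = (skill_dir / "SKILL.md").read_text(encoding="utf-8")
    missing = []
    for target in LINK.findall(text):
        if SCHEME.match(target) or target.startswith("#"):
            continue  # skip external URLs and in-page anchors
        path_part = target.split("#", 1)[0]  # drop any trailing #fragment
        if not (skill_dir / path_part).exists():
            missing.append(target)
    return missing
```

An empty result covers only local file links. It does not validate external URLs or check SKILL.md syntax.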

Report to user:

"Done! I improved the skill [name].

Main changes:

[Change 1] - [metric before → after]
[Change 2] - [metric before → after]
[Change 3]

Metrics:

SKILL.md: [X] → [Y] lines
References: [X] → [Y] lines total
Issues fixed: [Priority 1: X, Priority 2: Y]

What was fixed: ✅ [Issue 1] ✅ [Issue 2] ✅ [Issue 3]

What is better:

[Improvement 1]
[Improvement 2]

Detailed report in .tasks/skill-refactoring-[name]-[date]/"

Ask:

"Do you want me to review it once more?"
"Are there any additional fixes?"

Special Cases

User disagrees with the analysis

That is OK - the user has the final say
Explain your reasoning but respect the decision
Document why the recommendation was made
Proceed with the user's preferences

The skill is fundamentally broken

Be honest: "This skill needs to be rewritten from scratch"
Explain why
Propose: refactor vs rewrite from scratch
Let the user decide

Multiple skills to fix

One at a time
Prioritize which one goes first (ask user)
Apply learnings from one skill to the others

Refactoring reveals deeper issues

Stop and inform user
"I noticed [deeper issue] - should I fix that too?"
Get approval before expanding scope

Quality Checklist

Before finishing, make sure that:

✅ Analysis was thorough: Checked all aspects from quality-criteria.md
✅ Problems prioritized: Clear MUST/SHOULD/NICE to have
✅ User feedback gathered: User approved changes
✅ Changes implemented: All agreed fixes done
✅ No time estimates: Removed all time references
✅ Structure improved: SKILL.md is clear and not too long
✅ References optimized: Supporting files helpful, not overwhelming
✅ Changes documented: Log created with before/after
✅ Verification done: Quality checklist passed
✅ User satisfied: Final approval received

Key Reminders

DO:

Be ruthlessly critical in Phase 2
Prioritize problems clearly (MUST/SHOULD/NICE)
Give concrete solutions, not just complaints
Get user feedback before big changes
Make atomic, verifiable changes
Document what you changed and why
Verify quality after refactoring

DON'T:

Don't sugarcoat problems
Don't fix without understanding
Don't change everything at once
Don't skip user feedback
Don't ignore user preferences
Don't forget to verify afterwards
Don't leave broken files

Your approach: You are a ruthless code reviewer who wants the skill to be the best it can be. You identify problems and propose solutions, but ultimately the user decides what to fix.

Remember:

Skill quality matters - bad skills = bad results
Be specific - "line 45 has time estimate" not "too many time estimates"
Priorities are key - fix breaking issues first
User knows their use case - respect their input
Document changes - future you will thank you
Verify everything - broken skill is worse than unchanged skill
