skill-evolver
Skill Evolver
Analyze skill execution traces to discover issues, identify improvement opportunities, and apply fixes to skill files.
Trace Format
Traces are JSON with this structure:
{
"id": "uuid",
"request": "user's original request",
"skills_used": ["skill-name"],
"success": true,
"total_turns": 2,
"total_input_tokens": 5000,
"total_output_tokens": 200,
"duration_ms": 7000,
"steps": [
{"role": "assistant", "content": "...", "tool_name": null},
{"role": "tool", "tool_name": "...", "tool_input": {}, "tool_result": "..."}
],
"llm_calls": [
{"turn": 1, "stop_reason": "tool_use", "input_tokens": 2500, "output_tokens": 50}
]
}
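As a minimal sketch of working with this structure, the snippet below parses one trace and reports any top-level keys that are missing. The key list is copied from the structure above; the helper name `missing_keys` is hypothetical and is not part of the skill's scripts.

```python
import json

# Top-level keys copied from the trace structure documented above.
REQUIRED_KEYS = {
    "id", "request", "skills_used", "success", "total_turns",
    "total_input_tokens", "total_output_tokens", "duration_ms",
    "steps", "llm_calls",
}


def missing_keys(trace: dict) -> list[str]:
    """Return the documented top-level keys absent from a trace, sorted."""
    # Hypothetical helper for illustration; not part of the skill's scripts.
    return sorted(REQUIRED_KEYS - set(trace))


sample = json.loads("""
{
  "id": "uuid",
  "request": "user's original request",
  "skills_used": ["skill-name"],
  "success": true,
  "total_turns": 2,
  "total_input_tokens": 5000,
  "total_output_tokens": 200,
  "duration_ms": 7000,
  "steps": [
    {"role": "assistant", "content": "...", "tool_name": null},
    {"role": "tool", "tool_name": "...", "tool_input": {}, "tool_result": "..."}
  ],
  "llm_calls": [
    {"turn": 1, "stop_reason": "tool_use", "input_tokens": 2500, "output_tokens": 50}
  ]
}
""")

print(missing_keys(sample))
```

A check like this is only a sanity pass on the top-level shape; it does not validate the contents of `steps` or `llm_calls`.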
Workflow
This skill can receive two types of input (at least one required):
- Traces: Execution trace data from real skill runs — provides data-driven problem discovery
- Feedback: User-written improvement suggestions — provides directed guidance for changes
When both are provided, combine insights: use traces to validate/discover issues and feedback to prioritize and guide fixes.
Step 1: Analyze Inputs
If traces are provided, run the analysis script:
scripts/analyze_traces.py <traces.json> [--skill <name>] [--format json|text]
Output includes:
- Success rate
- Average turns, duration, tokens
- Common issues and warnings
- Recommendations
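To make the summary metrics concrete, here is a rough sketch of how they could be derived from the trace fields. It is not the code in `scripts/analyze_traces.py`, whose internals are not shown here. The function name `summarize` is hypothetical, and counting tokens as input plus output is an assumption.

```python
from statistics import mean


def summarize(traces: list[dict]) -> dict:
    """Compute summary metrics over a non-empty list of traces."""
    # Hypothetical helper; the real analysis script may compute these differently.
    if not traces:
        raise ValueError("at least one trace is required")
    return {
        # Fraction of traces whose "success" field is true.
        "success_rate": sum(1 for t in traces if t["success"]) / len(traces),
        "avg_turns": mean(t["total_turns"] for t in traces),
        # Assumption: "tokens" means input plus output tokens per trace.
        "avg_tokens": mean(
            t["total_input_tokens"] + t["total_output_tokens"] for t in traces
        ),
        "avg_duration_ms": mean(t["duration_ms"] for t in traces),
    }
```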
If feedback is provided, identify the user's improvement goals and map them to actionable changes.
If both are provided, cross-reference: does the feedback align with trace-discovered issues? Use feedback to prioritize which trace-identified problems to fix first.
Step 2: Extract Issue Details
For failed or problematic traces, extract full context:
scripts/extract_issue_context.py <traces.json> --failed
scripts/extract_issue_context.py <traces.json> --trace-id <id> --show-llm
scripts/extract_issue_context.py <traces.json> --high-turns
Skip this step if only feedback was provided (no traces).
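For orientation, the kind of selection implied by `--failed` and `--high-turns` can be sketched as below. The actual criteria used by `scripts/extract_issue_context.py` are not documented here, so this is an illustration under stated assumptions: the function names are hypothetical, and the default limit of 4 turns is borrowed from the `avg_turns` warning threshold in this document.

```python
def failed_traces(traces: list[dict]) -> list[dict]:
    """Select traces whose "success" field is false."""
    return [t for t in traces if not t["success"]]


def high_turn_traces(traces: list[dict], limit: int = 4) -> list[dict]:
    """Select traces that took more than `limit` turns."""
    # Assumption: the limit mirrors the avg_turns warning threshold (>4).
    return [t for t in traces if t["total_turns"] > limit]


def tool_calls(trace: dict) -> list[str]:
    """List the tool names invoked in a trace, in order of appearance."""
    return [s["tool_name"] for s in trace["steps"] if s["role"] == "tool"]
```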
Step 3: Identify Root Causes
Map issues to skill components using references/issue-patterns.md:
| Issue Type | Likely Fix Location |
|---|---|
| execution_failure | scripts/, error handling |
| high_turn_count | SKILL.md clarity, add examples |
| tool_errors | scripts/, input validation |
| high_token_usage | SKILL.md verbosity, progressive disclosure |
| repeated_tool_calls | SKILL.md decision trees |
For feedback-only input, map the user's suggestions directly to the appropriate skill components.
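The table above can be treated as a simple lookup. The sketch below encodes it as a dictionary and collects fix locations for a set of issue types. The name `fix_locations` is hypothetical, and silently skipping unknown issue types is an assumption made for brevity.

```python
# Issue types and fix locations copied from the table above.
FIX_LOCATIONS = {
    "execution_failure": ["scripts/", "error handling"],
    "high_turn_count": ["SKILL.md clarity", "add examples"],
    "tool_errors": ["scripts/", "input validation"],
    "high_token_usage": ["SKILL.md verbosity", "progressive disclosure"],
    "repeated_tool_calls": ["SKILL.md decision trees"],
}


def fix_locations(issue_types: list[str]) -> list[str]:
    """Return the distinct fix locations for the given issue types, in order."""
    locations: list[str] = []
    for issue in issue_types:
        # Assumption: unknown issue types contribute no locations.
        for location in FIX_LOCATIONS.get(issue, []):
            if location not in locations:
                locations.append(location)
    return locations


print(fix_locations(["execution_failure", "tool_errors"]))
```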
Step 4: Apply Fixes
Read the target skill and apply changes based on analysis:
- For script errors: Fix scripts, add validation, improve error messages
- For efficiency issues: Add examples, decision trees, clearer instructions
- For token issues: Reduce SKILL.md, move content to references/
- For trigger issues: Update frontmatter description
- For feedback-guided changes: Apply the user's specific suggestions
Scope constraints — strictly follow:
- Only modify the target skill's existing files (SKILL.md, scripts/, references/)
- Do NOT create new reference files, templates, or guides
- Do NOT search the web for domain-specific content
- Do NOT generate CHANGELOG, improvement reports, or other extra deliverables
- The evolved skill files themselves are the sole deliverable
Quick Reference
Issue Severity Levels
- high: Failures, max_tokens, tool errors → Fix immediately
- medium: High turns, high tokens, retries → Optimize
- low: Long duration → Consider optimization
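One way to act on these levels is to sort discovered issues so that high-severity items are handled first. The sketch below assumes each issue is a dictionary with a `severity` field set to `high`, `medium`, or `low`; that record shape and the name `prioritize` are hypothetical.

```python
# Lower rank means the issue should be handled earlier.
SEVERITY_RANK = {"high": 0, "medium": 1, "low": 2}


def prioritize(issues: list[dict]) -> list[dict]:
    """Return issues ordered high, medium, low; ties keep their input order."""
    # sorted() is stable, so issues with equal severity stay in input order.
    return sorted(issues, key=lambda issue: SEVERITY_RANK[issue["severity"]])
```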
Key Metrics Thresholds
| Metric | Warning | Action |
|---|---|---|
| success_rate | <90% | Review failures |
| avg_turns | >4 | Simplify workflow |
| avg_tokens | >30000 | Reduce context |
| duration_ms | >60000 | Optimize scripts |
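The thresholds above translate directly into a small check. In this sketch, `success_rate` is assumed to be a fraction between 0 and 1 (so `<90%` becomes `< 0.90`), and the function name `breached` is hypothetical.

```python
# (metric name, warning condition, suggested action), copied from the table above.
THRESHOLDS = [
    ("success_rate", lambda v: v < 0.90, "Review failures"),
    ("avg_turns", lambda v: v > 4, "Simplify workflow"),
    ("avg_tokens", lambda v: v > 30000, "Reduce context"),
    ("duration_ms", lambda v: v > 60000, "Optimize scripts"),
]


def breached(metrics: dict) -> list[tuple[str, str]]:
    """Return (metric, action) pairs for every threshold the metrics cross."""
    # Metrics missing from the input are skipped rather than treated as errors.
    return [
        (name, action)
        for name, is_warning, action in THRESHOLDS
        if name in metrics and is_warning(metrics[name])
    ]


print(breached({"success_rate": 0.85, "avg_turns": 3, "avg_tokens": 45000}))
```

Values that sit exactly on a threshold (for example a success rate of exactly 0.90) do not trigger a warning, which matches the strict `<` and `>` comparisons in the table.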
Common Fixes
Low success rate:
- Add error handling in scripts
- Add input validation
- Clarify ambiguous instructions
High turn count:
- Add decision tree
- Provide more examples
- Use scripts for multi-step operations
High token usage:
- Keep SKILL.md under 500 lines
- Move details to references/
- Remove redundant examples