skill-creator
Pass
Audited by Gen Agent Trust Hub on Mar 16, 2026
Risk Level: SAFE
Full Analysis
- [SAFE]: The skill implements a robust 'skill-creator' workflow that includes drafting, testing with subagents, human feedback loops, and automated optimization of skill descriptions.
- [SAFE]: External dependencies are managed through standard scripts and utilities provided within the skill's own package (e.g., scripts/aggregate_benchmark.py, scripts/run_loop.py, eval-viewer/generate_review.py).
- [SAFE]: The subagent spawning mechanism is used legitimately for running parallel evaluations (with-skill vs. baseline) to measure performance improvements accurately.
- [SAFE]: Data handling is restricted to the local workspace for storing test outputs, grading results, and benchmark statistics. No sensitive file access or non-whitelisted network operations were found.
- [SAFE]: The description optimization loop uses legitimate calls to the Anthropic API (via standard clients) to propose and test improved skill descriptions based on evaluation failures.
- [SAFE]: Security instructions within the skill body (Principle of Lack of Surprise) explicitly prohibit the creation of malicious skills, exploit code, or tools for unauthorized access.
Audit Metadata