skill-creator
Pass
Audited by Gen Agent Trust Hub on Mar 15, 2026
Risk Level: SAFE
Full Analysis
- [COMMAND_EXECUTION]: The skill orchestrates complex workflows by executing the claude CLI and various bundled Python scripts (e.g., run_loop.py, aggregate_benchmark.py) using the subprocess module. These operations are transparently managed and necessary for its primary purpose of skill development.
- [PROMPT_INJECTION]: The skill is designed to ingest and process user-provided test prompts and feedback strings. This creates an interface for indirect prompt injection (Category 8); however, this risk is inherent to the skill's function as a developer tool for testing AI behavior and is mitigated by the user's active participation in the evaluation loop.
- [DATA_EXFILTRATION]: The generate_review.py script starts a local HTTP server on 127.0.0.1 to provide a browser-based interface for reviewing evaluation results. This server only accesses files within the user-specified workspace and is restricted to local access.
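
The loopback-only pattern described in the last finding can be sketched roughly as follows. This is an illustration using Python's standard library, not the code of generate_review.py, which the audit does not reproduce. The function name make_local_server and its parameters are hypothetical. Binding to 127.0.0.1 keeps the server unreachable from other machines, and passing a directory to the handler limits served files to that workspace. Note that symlinks inside the workspace could still point elsewhere.

```python
import functools
import http.server
from pathlib import Path


def make_local_server(workspace: Path, port: int = 0) -> http.server.ThreadingHTTPServer:
    """Build an HTTP server that serves files from `workspace` on loopback only.

    Hypothetical helper for illustration; not taken from the audited skill.
    """
    # Root the stock file handler at the workspace directory. The handler
    # normalizes request paths and drops ".." components before serving.
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler,
        directory=str(workspace),
    )
    # Binding to 127.0.0.1 (rather than 0.0.0.0) restricts access to the
    # local machine. port=0 asks the OS for any free port.
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
```

A caller would typically run `server.serve_forever()` in a background thread, read the chosen port from `server.server_address`, and open `http://127.0.0.1:<port>/` in a browser.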
Audit Metadata