skill-creator

Pass

Audited by Gen Agent Trust Hub on Mar 15, 2026

Risk Level: SAFE
Full Analysis
  • [COMMAND_EXECUTION]: The skill orchestrates complex workflows by executing the claude CLI and various bundled Python scripts (e.g., run_loop.py, aggregate_benchmark.py) using the subprocess module. These operations are transparently managed and necessary for its primary purpose of skill development.
  • [PROMPT_INJECTION]: The skill is designed to ingest and process user-provided test prompts and feedback strings. This creates an interface for indirect prompt injection (Category 8); however, this risk is inherent to the skill's function as a developer tool for testing AI behavior and is mitigated by the user's active participation in the evaluation loop.
  • [DATA_EXFILTRATION]: The generate_review.py script starts a local HTTP server on 127.0.0.1 to provide a browser-based interface for reviewing evaluation results. This server only accesses files within the user-specified workspace and is restricted to local access.
Audit Metadata
Risk Level
SAFE
Analyzed
Mar 15, 2026, 10:45 AM