skill-creator
Pass
Audited by Gen Agent Trust Hub on Mar 10, 2026
Risk Level: SAFECOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION]: The skill utilizes the
subprocessmodule to execute local Python scripts, theclaudeCLI tool, and system utilities such aslsofandkillto manage the benchmarking lifecycle and the local evaluation viewer server.- [PROMPT_INJECTION]: The skill implements an evaluation workflow where user-provided test prompts are passed directly to subagents for execution. This ingestion of untrusted data without explicit sanitization or boundary markers presents a surface for indirect prompt injection, as malicious test cases could attempt to subvert subagent logic or safety constraints.
Audit Metadata