skill-creator

Pass

Audited by Gen Agent Trust Hub on Mar 29, 2026

Risk Level: SAFECOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill executes external commands via subprocess.Popen in scripts/run_eval.py and subprocess.run in eval-viewer/generate_review.py. These calls invoke the claude CLI to perform evaluations and system utilities like lsof and kill to manage the local web server for the viewer. While integral to the skill's purpose, they grant the agent significant system-level capabilities.- [PROMPT_INJECTION]: The skill exhibits an indirect prompt injection surface by processing external, potentially untrusted data. It incorporates drafted skill instructions, test execution results, and user feedback into prompts for specialized subagents (graders and optimizers).
  • Ingestion points: Data is read from evals/evals.json, current skill files, and the grading.json output of previous runs.
  • Boundary markers: Skill content is interpolated into templates in scripts/improve_description.py and instructions in agents/grader.md without explicit delimiters or instructions for the model to ignore potentially malicious content within the injected text.
  • Capability inventory: The skill environment possesses file writing permissions, the ability to execute shell commands via subprocess, and the capacity to make outbound network requests to the Anthropic API.
  • Sanitization: Skill content and evaluation results are used in their raw form when constructing prompts for the language model.
Audit Metadata
Risk Level
SAFE
Analyzed
Mar 29, 2026, 06:34 AM