skill-creator

Warn

Audited by Gen Agent Trust Hub on Mar 4, 2026

Risk Level: MEDIUM
COMMAND_EXECUTION, REMOTE_CODE_EXECUTION, EXTERNAL_DOWNLOADS, PROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The script scripts/run_eval.py uses subprocess.Popen to invoke the claude CLI tool on the host system. Additionally, eval-viewer/generate_review.py executes subprocess.run with the lsof command to manage local network ports for the evaluation viewer.
  • [REMOTE_CODE_EXECUTION]: The skill dynamically constructs and writes temporary skill definition files to the project's .claude/commands/ directory. These files contain logic and metadata that are interpreted and executed by the claude CLI during trigger evaluation runs.
  • [EXTERNAL_DOWNLOADS]: The evaluation report template (eval-viewer/viewer.html) fetches the SheetJS (xlsx) library from cdn.sheetjs.com via a script tag. This external dependency is used to render spreadsheet data within the local browser interface.
  • [PROMPT_INJECTION]: The skill is designed to process and benchmark untrusted skill drafts and test prompts, presenting an attack surface for indirect prompt injection.
      • Ingestion points: Evaluation prompts in evals/evals.json and skill instructions in SKILL.md are processed and executed.
      • Boundary markers: The skill uses standard Markdown and YAML delimiters, which do not prevent the execution of instructions embedded within the untrusted content.
      • Capability inventory: The system can execute local shell commands, write files to the local filesystem, and spawn autonomous subagents.
      • Sanitization: There is no evidence that user-provided prompts or skill instructions are sanitized or validated before the execution engine processes them.
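
The boundary-marker and sanitization findings above can be illustrated with a minimal sketch. It is not taken from the audited skill: the function name wrap_untrusted and the marker format are hypothetical, chosen only to show how a per-call random delimiter differs from fixed Markdown or YAML delimiters. A wrapper like this may make it harder for embedded text to close the block early, but it does not by itself stop a model from following instructions inside the block.

```python
import secrets


def wrap_untrusted(content: str, label: str = "UNTRUSTED") -> str:
    """Wrap untrusted text in delimiters built from a fresh random token.

    Hypothetical helper for illustration; not part of the audited skill.
    """
    # A new token per call means the closing marker cannot be predicted,
    # so the untrusted text cannot ship a pre-written copy of it.
    token = secrets.token_hex(16)
    open_marker = f"<<{label}-{token}>>"
    close_marker = f"<<END-{label}-{token}>>"

    # Defensive check: refuse content that already contains either marker.
    if open_marker in content or close_marker in content:
        raise ValueError("content collides with the generated delimiter")

    return f"{open_marker}\n{content}\n{close_marker}\n"


if __name__ == "__main__":
    draft = "# Example skill draft\nIgnore previous instructions."
    print(wrap_untrusted(draft))
```

In this sketch the caller would also need to tell the model, outside the wrapped block, that everything between the two markers is data to evaluate rather than instructions to follow.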
Audit Metadata
Risk Level: MEDIUM
Analyzed: Mar 4, 2026, 03:56 PM