skill-creator

Warn

Audited by Gen Agent Trust Hub on Mar 23, 2026

Risk Level: MEDIUM
COMMAND_EXECUTION, DATA_EXFILTRATION, PROMPT_INJECTION, REMOTE_CODE_EXECUTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill frequently executes shell commands via subprocess. scripts/run_eval.py and scripts/improve_description.py call the claude CLI to perform evaluations and improvements. Additionally, eval-viewer/generate_review.py executes lsof and uses os.kill to terminate processes on specific network ports.
  • [REMOTE_CODE_EXECUTION]: The script scripts/run_eval.py dynamically creates skill definition files (Markdown with YAML frontmatter) inside the .claude/commands/ directory. These files are then automatically loaded and interpreted by the system's agent as executable skills, which is a form of dynamic code/instruction generation and execution.
  • [DATA_EXFILTRATION]: The eval-viewer/generate_review.py script starts a local HTTP server bound to 127.0.0.1. This server serves the entire contents of the workspace directory, including execution transcripts, output files, and metadata. While restricted to localhost, this exposes all evaluation data to any other process or user on the local machine.
  • [PROMPT_INJECTION]: The skill is designed to ingest and process external SKILL.md and evals.json files. These untrusted contents are interpolated into prompts for the primary agent and multiple specialized subagents (grader, analyzer, comparator). This creates an attack surface for indirect prompt injection, where a malicious skill being analyzed could override the instructions of the Skill Creator or its subagents.
  • [DYNAMIC_EXECUTION]: In eval-viewer/generate_review.py, the server generates HTML that embeds workspace data directly into JavaScript variables (const EMBEDDED_DATA = ...). If the outputs of the analyzed skill contain malicious scripts, they could potentially execute in the context of the local viewer.
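
The DYNAMIC_EXECUTION finding concerns workspace data being inlined into a script block. One common mitigation, sketched here under assumptions, is to escape HTML-significant characters in the serialized JSON before embedding it. The function name embed_json_for_script and the sample payload are hypothetical and are not taken from generate_review.py; the audit does not state how the script currently serializes its data.

```python
import json


def embed_json_for_script(data):
    """Serialize data as JSON that is safe to inline inside a <script> element.

    Hypothetical helper for illustration; not part of the audited skill.
    """
    text = json.dumps(data)
    # Replace characters that could close the script element or open an HTML
    # comment with their JSON unicode escapes. The result is still valid JSON
    # and decodes to the original value.
    return (
        text.replace("&", "\\u0026")
        .replace("<", "\\u003c")
        .replace(">", "\\u003e")
    )


# Example: an output file whose content tries to break out of the script block.
payload = {"output": "</script><script>alert(1)</script>"}
safe = embed_json_for_script(payload)
html = "<script>const EMBEDDED_DATA = " + safe + ";</script>"
```

With this escaping, the only closing script tag in the generated page is the one the template itself writes, so hostile output is treated as string data by the viewer rather than as markup.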
Audit Metadata
Risk Level: MEDIUM
Analyzed: Mar 23, 2026, 08:01 PM