skill-creator

Warn

Audited by Gen Agent Trust Hub on Mar 20, 2026

Risk Level: MEDIUMCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The scripts run_eval.py, improve_description.py, and run_loop.py execute the system's claude CLI tool using the subprocess module. This capability is used to automate the testing and optimization of skill descriptions by running queries against the model.
  • [COMMAND_EXECUTION]: The script run_eval.py dynamically writes new skill definition files (Markdown with YAML frontmatter) to the .claude/commands/ directory. This is a sensitive system path where Claude Code discovers and loads available skills, meaning the script can modify the agent's available toolset at runtime.
  • [COMMAND_EXECUTION]: The scripts run_eval.py and improve_description.py explicitly manipulate the environment of their subprocesses to remove the CLAUDECODE environment variable. This is a deliberate bypass of a built-in safety mechanism in the Claude CLI designed to prevent recursive or nested agent execution.
  • [PROMPT_INJECTION]: The skill exhibits a surface for indirect prompt injection (Category 8). It ingests untrusted data from evals/evals.json and user-provided test prompts (Ingestion points) and interpolates them into instructions for subagents. While it uses XML-style tags as delimiters (Boundary markers), it lacks robust sanitization for the raw query strings (Sanitization). The skill possesses high-privilege capabilities including file writes to configuration paths and CLI execution (Capability inventory), which could be exploited if malicious instructions are embedded in the evaluation data.
Audit Metadata
Risk Level
MEDIUM
Analyzed
Mar 20, 2026, 07:39 AM