skill-creator

Warn

Audited by Gen Agent Trust Hub on Apr 13, 2026

Risk Level: MEDIUMCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONEXTERNAL_DOWNLOADSDATA_EXFILTRATION
Full Analysis
  • [COMMAND_EXECUTION]: Several scripts, including scripts/run_eval.py and scripts/improve_description.py, utilize the subprocess module to execute the platform CLI and other python scripts, which are driven by dynamically generated parameters.
  • [REMOTE_CODE_EXECUTION]: The skill features an automated optimization loop that programmatically generates skill descriptions and executes them via the claude -p interface, representing the execution of dynamically created logic.
  • [INDIRECT_PROMPT_INJECTION]: The skill ingests untrusted data from evaluation sets and feedback files, which is then used to construct prompts for secondary AI agents, creating a significant injection surface.
  • Ingestion points: Evaluation queries from evals/evals.json and user reviews from feedback.json.
  • Boundary markers: The skill uses XML-style tags to encapsulate content but does not provide instructions to the agent to ignore potentially malicious directions within that content.
  • Capability inventory: The skill can execute shell commands, write to the platform-sensitive .claude/commands/ directory, and host a local web server.
  • Sanitization: No sanitization is performed on user queries or feedback strings before they are interpolated into agent prompts.
  • [EXTERNAL_DOWNLOADS]: The HTML viewer component (eval-viewer/viewer.html) loads an external spreadsheet processing library from cdn.sheetjs.com at runtime.
  • [COMMAND_EXECUTION]: The scripts/run_eval.py script explicitly removes the CLAUDECODE environment variable to bypass platform-enforced safety guards against recursive agent execution.
  • [DATA_EXFILTRATION]: The eval-viewer/generate_review.py script starts a local HTTP server that encodes and serves files from the skill's workspace to the browser, which could potentially expose sensitive data generated during evaluations if the workspace is misconfigured.
Audit Metadata
Risk Level
MEDIUM
Analyzed
Apr 13, 2026, 11:41 PM