skill-creator

Warn

Audited by Gen Agent Trust Hub on Mar 7, 2026

Risk Level: MEDIUMCOMMAND_EXECUTIONEXTERNAL_DOWNLOADSPROMPT_INJECTIONDATA_EXFILTRATION
Full Analysis
  • [COMMAND_EXECUTION]: The scripts scripts/run_eval.py and scripts/run_loop.py use subprocess.Popen to call the claude command-line tool. This execution incorporates user-defined evaluation queries directly as arguments to subagents, creating a surface where external data influences CLI operations.
  • [EXTERNAL_DOWNLOADS]: The skill integrates the anthropic Python package for description optimization. Additionally, the eval-viewer/viewer.html component fetches the SheetJS library from a public CDN (cdn.sheetjs.com) to enable spreadsheet rendering within the local review interface.
  • [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection because it ingests and processes test cases from evals/evals.json and evals/trigger-eval.json. These queries are passed to the claude CLI without sanitization or boundary markers.
  • Ingestion points: evals/evals.json, evals/trigger-eval.json
  • Boundary markers: Absent
  • Capability inventory: CLI execution via subprocess, file writing to local command directories, and local web server hosting
  • Sanitization: None
  • [COMMAND_EXECUTION]: The scripts/run_eval.py script dynamically writes skill definitions to the .claude/commands/ directory. These definitions are subsequently loaded as active skills by the environment during evaluation runs.
  • [COMMAND_EXECUTION]: To manage port availability, eval-viewer/generate_review.py executes the lsof utility and uses os.kill to terminate existing processes on the target port.
  • [DATA_EXFILTRATION]: The eval-viewer/generate_review.py script initiates a local HTTP server (defaulting to port 3117) to serve a review UI. It exposes files from the workspace directory and provides a POST endpoint that writes user feedback to local JSON files.
Audit Metadata
Risk Level
MEDIUM
Analyzed
Mar 7, 2026, 03:53 AM