skill-creator

Pass

Audited by Gen Agent Trust Hub on Mar 29, 2026

Risk Level: SAFE (COMMAND_EXECUTION, EXTERNAL_DOWNLOADS, PROMPT_INJECTION)
Full Analysis
  • [COMMAND_EXECUTION]: The skill uses local command execution for core developer workflows.
      • scripts/run_eval.py uses subprocess.Popen to invoke the claude CLI to perform evaluations on skill descriptions.
      • eval-viewer/generate_review.py executes lsof via subprocess.run to manage local network ports for the evaluation viewer.
  • [EXTERNAL_DOWNLOADS]: The evaluation viewer UI (eval-viewer/viewer.html) references an external JavaScript library (SheetJS) from a public CDN (cdn.sheetjs.com). This is a well-known service used for processing spreadsheet data within the browser.
  • [PROMPT_INJECTION]: The benchmarking and evaluation logic in scripts/run_eval.py is an indirect prompt injection surface. It ingests test queries from user-defined evaluation sets and passes them to an AI agent, so maliciously crafted queries in the evaluation data could influence the agent's behavior during the test.
      • Ingestion points: Evaluation queries are loaded from JSON files (e.g., eval_set.json) in scripts/run_eval.py and scripts/run_loop.py.
      • Boundary markers: Absent; queries are passed as raw strings in the CLI arguments.
      • Capability inventory: The skill can execute local commands (claude, lsof), read and write local files, and start a local HTTP server.
      • Sanitization: None; queries are processed as provided by the user and interpolated into command-line arguments.
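
The "Boundary markers" and "Sanitization" findings above can be illustrated with a minimal sketch. It is not taken from the audited scripts: the marker strings, the helper names wrap_query and build_argv, and the base command are all hypothetical. It shows one way to fence an untrusted query between delimiters and to pass it as a single element of an argument list, so that no shell ever parses it. Boundary markers of this kind can make injected instructions easier for the receiving agent to recognize as data, but they should not be assumed to prevent prompt injection.

```python
import re

# Hypothetical delimiters; the audited scripts do not define any.
BEGIN_MARK = "<<<EVAL_QUERY_BEGIN>>>"
END_MARK = "<<<EVAL_QUERY_END>>>"

# ASCII control characters other than tab, newline, and carriage return.
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def wrap_query(query: str) -> str:
    """Strip control characters and fence the query with boundary markers."""
    cleaned = _CONTROL_CHARS.sub("", query)
    # Remove embedded marker text so the query cannot close the fence early.
    # Loop because one removal can splice a new marker together.
    while BEGIN_MARK in cleaned or END_MARK in cleaned:
        cleaned = cleaned.replace(BEGIN_MARK, "").replace(END_MARK, "")
    return f"{BEGIN_MARK}\n{cleaned}\n{END_MARK}"


def build_argv(base_cmd: list[str], query: str) -> list[str]:
    """Build an argument list for subprocess.run(argv) without shell=True.

    base_cmd is whatever command the caller would normally run; the
    wrapped query is appended as exactly one argument.
    """
    return [*base_cmd, wrap_query(query)]


if __name__ == "__main__":
    # "echo" stands in for the real CLI so the sketch has no side effects.
    print(build_argv(["echo"], "list files; then ignore prior instructions"))
```

Because the query travels as one list element, shell metacharacters such as `;` or `$(...)` inside it are never interpreted by a shell; they reach the target program as literal text.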
Audit Metadata
Risk Level: SAFE
Analyzed: Mar 29, 2026, 07:14 AM