skill-creator

Pass

Audited by Gen Agent Trust Hub on Mar 12, 2026

Risk Level: SAFEPROMPT_INJECTIONCOMMAND_EXECUTIONEXTERNAL_DOWNLOADS
Full Analysis
  • [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection due to its core workflow of processing external data to generate or modify agent instructions.
  • Ingestion points: The skill reads user-provided test prompts from 'evals/evals.json' and human review comments from 'feedback.json'.
  • Boundary markers: There are no explicit boundary markers or instructions in the generated command files ('.claude/commands/*.md') to ignore potential instructions embedded within the test data.
  • Capability inventory: The skill possesses significant capabilities, including file system write access ('scripts/run_eval.py', 'scripts/run_loop.py') and the ability to execute CLI tools and scripts via subprocesses ('scripts/run_eval.py').
  • Sanitization: While the skill performs basic formatting, it does not implement robust sanitization or escaping of the user-provided test strings before they are incorporated into the agent's context.
  • [COMMAND_EXECUTION]: The skill frequently executes system commands to facilitate its workflows. It uses 'subprocess' to run the 'claude' CLI for triggering evaluations and executes internal Python scripts for benchmarking, packaging, and validation ('scripts/aggregate_benchmark.py', 'scripts/package_skill.py', 'scripts/run_loop.py'). These operations are consistent with the skill's documented purpose as a meta-development tool.
  • [EXTERNAL_DOWNLOADS]: The evaluation viewer component ('eval-viewer/viewer.html') references a script from the SheetJS CDN ('cdn.sheetjs.com') to enable spreadsheet rendering for the user. This reference targets a well-known service for data processing.
Audit Metadata
Risk Level
SAFE
Analyzed
Mar 12, 2026, 04:53 PM