skill-creator

Pass

Audited by Gen Agent Trust Hub on Apr 7, 2026

Risk Level: SAFE
Full Analysis
  • [COMMAND_EXECUTION]: The skill executes local commands using Python's subprocess module in scripts like run_eval.py, improve_description.py, and generate_review.py. These calls are used to invoke the claude CLI for testing, manage background processes, and check system status (e.g., using lsof to manage the viewer's port). This is a standard pattern for the skill's documented purpose.
  • [EXTERNAL_DOWNLOADS]: The viewer.html file includes a reference to the SheetJS library via a public CDN (cdn.sheetjs.com). This is used to render spreadsheet files within the evaluation viewer and is a well-known service for this purpose.
  • [INDIRECT_PROMPT_INJECTION]: The skill acts as an orchestration layer that processes untrusted inputs from user-defined evals.json and feedback files.
  • Ingestion points: scripts/run_eval.py reads user task prompts from the eval set; scripts/run_loop.py and scripts/improve_description.py process evaluation results and feedback to refine descriptions.
  • Boundary markers: The skill uses XML-style tags (e.g., <skill_content>, <new_description>) to isolate untrusted content when communicating with the agent during the improvement loop.
  • Capability inventory: The skill can execute shell commands, perform local file operations, and manage network ports (localhost only).
  • Sanitization: While the skill primarily executes the provided prompts for testing purposes, it employs structured delimiters to maintain context separation and prevent unintended instruction following.
Audit Metadata
Risk Level
SAFE
Analyzed
Apr 7, 2026, 07:55 PM