skill-creator

Pass

Audited by Gen Agent Trust Hub on Mar 11, 2026

Risk Level: SAFE
Full Analysis
  • [SAFE]: The skill serves as a development tool for AI agent skills. It includes several Python scripts for aggregating benchmark results, generating HTML review reports, and optimizing skill descriptions through iterative testing.
  • [COMMAND_EXECUTION]: The skill uses subprocess.run and subprocess.Popen in several scripts (e.g., run_eval.py, run_loop.py, improve_description.py) to execute the claude CLI. This is the intended mechanism for running evaluations and generating content using the agent's own capabilities.
  • [EXTERNAL_DOWNLOADS]: The eval-viewer/viewer.html file includes a reference to a well-known CDN for the SheetJS library (cdn.sheetjs.com) to enable spreadsheet rendering in the browser-based review tool. This is a standard practice for web-based data visualization.
  • [REMOTE_CODE_EXECUTION]: The skill spawns subagents to run test prompts, and it gives the user explicit warnings and instructions for reviewing those runs. No evidence of arbitrary or unauthorized remote code execution was found; all execution occurs within the user-initiated skill development workflow.
  • [DATA_EXFILTRATION]: The skill handles local files (e.g., feedback.json, evals.json) within the project workspace to manage the development state. It does not perform any unauthorized external network operations.
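
The [COMMAND_EXECUTION] finding above describes scripts that call the claude CLI through subprocess.run. The following is a minimal sketch of that general pattern, not the skill's actual code: the helper names run_cli and build_claude_command are hypothetical, and the use of the CLI's -p (print mode) flag is an assumption about how such a script might invoke it. The command is passed as an argument list rather than a shell string, so the prompt text is never re-parsed by a shell.

```python
import subprocess
import sys
from typing import List, Sequence


def build_claude_command(prompt: str, executable: str = "claude") -> List[str]:
    """Build an argument list for a non-interactive CLI call (hypothetical helper).

    Assumption: the audited scripts may use different flags; `-p` is shown
    only as an illustrative way to pass a single prompt.
    """
    return [executable, "-p", prompt]


def run_cli(argv: Sequence[str], timeout_s: float = 60.0) -> str:
    """Run a command given as an argument list and return its stdout.

    No shell is involved (shell=True is not set), a timeout bounds the run,
    and check=True raises CalledProcessError on a non-zero exit code.
    """
    result = subprocess.run(
        list(argv),
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Demonstrate with the current Python interpreter instead of the real CLI,
    # so the sketch runs even where `claude` is not installed.
    print(run_cli([sys.executable, "-c", "print('ok')"]).strip())
```

Keeping command construction separate from execution makes it straightforward to log or review exactly what would be run before it is run, which matches the audit's emphasis on user-visible, user-initiated execution.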
Audit Metadata
Risk Level: SAFE
Analyzed: Mar 11, 2026, 07:34 AM