skill-creator

Pass

Audited by Gen Agent Trust Hub on Mar 5, 2026

Risk Level: SAFECOMMAND_EXECUTIONEXTERNAL_DOWNLOADSDATA_EXFILTRATIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill invokes external CLI tools to perform its tasks.
  • scripts/run_eval.py and scripts/run_loop.py execute the claude CLI tool using subprocess.Popen to test skill triggering behavior.
  • eval-viewer/generate_review.py executes lsof via subprocess.run to identify and manage local network ports.
  • [EXTERNAL_DOWNLOADS]: The web-based evaluation viewer fetches external resources at runtime.
  • eval-viewer/viewer.html loads the SheetJS library from a public CDN (cdn.sheetjs.com) to render spreadsheet outputs within the browser.
  • [DATA_EXFILTRATION]: Local skill data is transmitted to external services for optimization.
  • scripts/improve_description.py and scripts/run_loop.py send the contents of the skill being developed, along with evaluation results, to the Anthropic API using the anthropic Python SDK. This is a primary function of the skill.
  • [INDIRECT_PROMPT_INJECTION]: The skill processes untrusted data which could contain malicious instructions designed to influence the agent's behavior during the skill creation process.
  • Ingestion points: The skill reads user-provided test cases from evals.json, skill definitions from SKILL.md files, and qualitative feedback from feedback.json.
  • Boundary markers: Instructions use XML-style delimiters (e.g., <skill_content>, <new_description>) to isolate untrusted data within prompts sent to the optimization engine.
  • Capability inventory: The skill has the ability to read and write files, execute local CLI commands, and perform network requests to the Anthropic API.
  • Sanitization: The skill employs html.escape in its reporting scripts and utilizes YAML block scalars in generated configurations to minimize formatting-based exploits.
Audit Metadata
Risk Level
SAFE
Analyzed
Mar 5, 2026, 05:19 PM