skill-creator

Pass

Audited by Gen Agent Trust Hub on Mar 27, 2026

Risk Level: SAFE
Findings: COMMAND_EXECUTION, EXTERNAL_DOWNLOADS, PROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The script scripts/run_eval.py uses subprocess.Popen to execute the claude CLI for running evaluation queries against skill drafts. Additionally, eval-viewer/generate_review.py uses subprocess.run to call lsof for managing local network ports when starting the evaluation viewer server.
  • [EXTERNAL_DOWNLOADS]: The skill uses the anthropic Python client in scripts/improve_description.py and scripts/run_loop.py to communicate with Anthropic's API for generating and refining skill descriptions. The eval-viewer/viewer.html file also loads the SheetJS library from a public CDN (cdn.sheetjs.com) to render spreadsheet outputs.
  • [PROMPT_INJECTION]: The skill passes untrusted user-provided test prompts and execution transcripts to the agents defined in agents/grader.md and analyzer.md. This creates an indirect prompt injection surface: malicious content in a test case could attempt to influence the grading logic. As a mitigation, the agents/grader.md instructions include specific defensive guidelines to ensure that evidence for passing grades is substantive rather than superficial.
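
For context on the COMMAND_EXECUTION finding, the sketch below illustrates why passing a command to subprocess.Popen as an argument list, without a shell, is generally considered low risk even when one argument is untrusted text. It is not taken from scripts/run_eval.py; the helper name run_argv and the sample prompt are hypothetical, and the Python interpreter stands in for the claude CLI so the example runs anywhere.

```python
import subprocess
import sys


def run_argv(argv: list[str], timeout: float = 30.0) -> tuple[int, str]:
    """Run a command given as an argument list (no shell) and return (exit code, stdout).

    Hypothetical helper for illustration only; not the skill's actual code.
    """
    proc = subprocess.Popen(
        argv,                      # list form: each item is passed as one literal argument
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    try:
        out, _err = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Kill the child and collect whatever output it produced before the timeout.
        proc.kill()
        out, _err = proc.communicate()
    return proc.returncode, out


# Assumed example of an untrusted test prompt containing shell metacharacters.
untrusted = "hello; echo INJECTED"

# The child process simply echoes back its first argument.
code, out = run_argv(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", untrusted]
)
print(code, repr(out))
```

Because no shell parses the argument list, the `;` in the prompt reaches the child process as literal text rather than acting as a command separator. The same text passed through `shell=True` with string concatenation would be interpreted by the shell.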
Audit Metadata
Risk Level: SAFE
Analyzed: Mar 27, 2026, 07:31 PM