skills-factory

Pass

Audited by Gen Agent Trust Hub on Apr 2, 2026

Risk Level: SAFECOMMAND_EXECUTIONPROMPT_INJECTIONEXTERNAL_DOWNLOADS
Full Analysis
  • [COMMAND_EXECUTION]: Several scripts provided with the skill (run_eval.py, run_loop.py, improve_description.py) utilize the Python subprocess module to execute the claude CLI and other system tools (e.g., lsof to manage local ports). These operations are part of the skill's intended functionality as an automated evaluation and optimization framework.
  • [PROMPT_INJECTION]: The skill possesses a vulnerability surface for indirect prompt injection. It is designed to ingest and process untrusted data from evals/evals.json and evals/trigger-eval.json and interpolate these strings directly into prompts for the claude CLI.
  • Ingestion points: Test prompts and trigger queries are read from the evals/ directory.
  • Boundary markers: The subagent prompt templates documented in SKILL.md (Step 6a) lack explicit boundary markers or specific instructions for the model to ignore embedded malicious instructions within the test cases.
  • Capability inventory: The skill's scripts can execute arbitrary shell commands via subprocess, access the filesystem, and start a local HTTP server.
  • Sanitization: No input validation or sanitization is performed on the content of the prompt or query fields before they are passed to the model execution context.
  • [EXTERNAL_DOWNLOADS]: The eval-viewer/viewer.html component includes a script tag that loads the sheetjs library from cdn.sheetjs.com. This is a well-known service used to provide spreadsheet rendering capabilities in the local review interface.
Audit Metadata
Risk Level
SAFE
Analyzed
Apr 2, 2026, 03:00 PM