canvas-design

Pass

Audited by Gen Agent Trust Hub on Mar 6, 2026

Risk Level: SAFEPROMPT_INJECTIONEXTERNAL_DOWNLOADS
Full Analysis
  • [PROMPT_INJECTION]: The skill incorporates a simulated user feedback mechanism ("The user ALREADY said 'It isn't perfect enough...'") in its final refinement step. This technique is used to override the agent's current state and force it into a specific behavioral mode regardless of the actual user interaction history.
  • [PROMPT_INJECTION]: The instructions use repetitive, high-pressure directives (e.g., "MUST stress multiple times," "EMPHASIZE... REPEATEDLY," "non-negotiable") to coerce the agent into prioritizing specific aesthetic outcomes over its standard safety or quality guidelines.
  • [PROMPT_INJECTION]: The skill processes untrusted user input to deduce conceptual threads for design philosophies, creating a surface for indirect prompt injection where malicious instructions in the user's prompt could influence the agent's subsequent file-writing operations.
  • Ingestion points: User input is interpolated in the "Design Philosophy Creation" and "Deducing the Subtle Reference" sections of SKILL.md.
  • Boundary markers: Absent; user input is treated as a foundational instruction without delimiters or "ignore embedded instructions" warnings.
  • Capability inventory: The skill can write .md, .pdf, and .png files, search the local ./canvas-fonts directory, and potentially perform network downloads.
  • Sanitization: Absent; no validation or escaping of user-provided conceptual input is mentioned.
  • [EXTERNAL_DOWNLOADS]: The skill instructs the agent to "Download and use whatever fonts are needed," which lacks domain restrictions and could lead the agent to fetch assets from untrusted external sources if interpreted as permission for arbitrary network access.
Audit Metadata
Risk Level
SAFE
Analyzed
Mar 6, 2026, 01:57 PM