NYC

code-no-test

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: HIGH (PROMPT_INJECTION, COMMAND_EXECUTION)
Full Analysis
  • PROMPT_INJECTION (HIGH): The skill exhibits a significant vulnerability to Indirect Prompt Injection (Category 8) by processing untrusted data to drive the workflow sequence.
  • Ingestion points: The skill reads implementation plans from the $ARGUMENTS variable and searches the ./plans/ directory for plan.md files to extract actionable tasks.
  • Boundary markers: While it uses <plan> tags for arguments, these provide no protection against instructions embedded within the plan content that aim to override the agent's behavior.
  • Capability inventory: The skill possesses extensive write and execute capabilities, including modifying source code files, calling other subagents, executing imagemagick commands, and performing automated git commits.
  • Sanitization: No sanitization or validation is performed on the plan's content. The skill blindly trusts the 'tasks' it extracts during Step 1 and initializes them in the TodoWrite system.
  • COMMAND_EXECUTION (MEDIUM): The skill executes shell commands for plan discovery (ls -t) and code verification (compile to verify). These operations are susceptible to manipulation if a malicious plan successfully injects content that influences file paths or build instructions.
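
The missing sanitization and the path-manipulation risk described above could be partly addressed by treating extracted tasks and plan paths as untrusted data before they reach the task list or a shell. The following is a minimal sketch, not the skill's actual code: the names `PLANS_DIR`, `MAX_TASK_LENGTH`, `SUSPICIOUS_PATTERNS`, `is_plan_path_allowed`, and `filter_tasks` are hypothetical, and the patterns are illustrative only. Pattern matching of this kind reduces exposure but is not a complete defense against prompt injection.

```python
import re
from pathlib import Path

# Hypothetical policy values; none of these come from the audited skill.
PLANS_DIR = Path("./plans")
MAX_TASK_LENGTH = 200
SUSPICIOUS_PATTERNS = [
    # Phrases that try to override the agent's instructions.
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.IGNORECASE),
    # Shell metacharacters that could alter a command line.
    re.compile(r"[;&|`$]"),
    # Path traversal sequences.
    re.compile(r"\.\./"),
]


def is_plan_path_allowed(candidate: str, plans_dir: Path = PLANS_DIR) -> bool:
    """Return True only if candidate resolves to a plan.md inside plans_dir."""
    root = plans_dir.resolve()
    # Joining with an absolute candidate discards root, so it fails the check below.
    resolved = (root / candidate).resolve()
    return resolved.name == "plan.md" and resolved.is_relative_to(root)


def filter_tasks(tasks):
    """Split extracted task strings into (accepted, rejected) lists."""
    accepted, rejected = [], []
    for task in tasks:
        text = task.strip()
        flagged = (
            not text
            or len(text) > MAX_TASK_LENGTH
            or any(pattern.search(text) for pattern in SUSPICIOUS_PATTERNS)
        )
        (rejected if flagged else accepted).append(text)
    return accepted, rejected
```

In this sketch, rejected tasks would be surfaced to a human for review rather than silently dropped, and only accepted tasks would be passed on to the task list. `Path.is_relative_to` requires Python 3.9 or later.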
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level: HIGH
Analyzed: Feb 16, 2026, 09:33 AM