receiving-code-review

Fail

Audited by Gen Agent Trust Hub on Feb 13, 2026

Risk Level: HIGHPROMPT_INJECTIONCOMMAND_EXECUTION
Full Analysis
  • [PROMPT_INJECTION] (HIGH): The skill is highly vulnerable to Indirect Prompt Injection (Category 8) as it is designed to process untrusted external content and perform high-privilege actions based on that content.
  • Ingestion points: External reviewer suggestions, GitHub PR comments, and terminal-based feedback as described in the 'Source-Specific Handling' and 'GitHub Thread Replies' sections.
  • Boundary markers: Absent. There are no instructions to wrap external input in delimiters or to treat the text as data rather than instructions.
  • Capability inventory: The skill explicitly authorizes 'Implementation' (file modification/write) and network operations via 'gh api' calls to GitHub repositories.
  • Sanitization: The skill relies on behavioral 'verification' and 'skepticism' rather than technical sanitization. While it instructs the agent to 'Verify before implementing' and 'Push back with reasoning,' an adversarial reviewer could still craft instructions that bypass these logic checks or exploit the agent's decision-making process.
  • [COMMAND_EXECUTION] (MEDIUM): The skill directs the agent to execute system commands such as 'grep' and 'gh api' based on the evaluation of external feedback. The path to execution is mediated by the agent's interpretation of the feedback, which increases the risk of malicious command injection if the agent is successfully manipulated.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 13, 2026, 10:38 PM