code-review

Warn

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: MEDIUMPROMPT_INJECTION
Full Analysis
  • Prompt Injection (MEDIUM): High risk of Indirect Prompt Injection (Category 8). 1. Ingestion points: The skill is designed to process external, attacker-controlled content including source code and pull request descriptions. 2. Boundary markers: None present in the instructions; the agent lacks specific guidance to treat embedded comments or strings as data rather than instructions. 3. Capability inventory: The skill produces external outputs in the form of PR approvals ('Approval means I would maintain this') and feedback, which significantly influences the development lifecycle. 4. Sanitization: There is no evidence of sanitization or filtering of input data. The persona's instruction to 'know when to let things slide' could be leveraged by an attacker using social engineering or embedded prompts to bypass security standards.
Audit Metadata
Risk Level
MEDIUM
Analyzed
Feb 16, 2026, 02:22 AM