review-verification-protocol
Fail
Audited by Gen Agent Trust Hub on Feb 15, 2026
Risk Level: HIGHPROMPT_INJECTIONNO_CODE
Full Analysis
- Prompt Injection (HIGH): The skill employs high-priority directives ('MUST', 'Mandatory', 'CRITICAL') to override the agent's behavior. It explicitly commands the agent to 'ignore informational items entirely' and 'remove' findings from its report if uncertain, which disables the agent's independent security evaluation. Evidence: 'The Verdict should ignore informational items entirely', 'If uncertain about any finding... Remove it from the review', 'Load this skill before reporting ANY code review findings'.\n- Indirect Prompt Injection (HIGH): The skill defines the logic for processing untrusted codebase content without boundary markers or sanitization. It provides a predefined list of 'Valid Patterns' that must not be flagged, creating a mechanism where an attacker could mask malicious logic by mimicking these 'safe' patterns. \n * Ingestion points: Source code files, diffs, and project configuration files (e.g., CLAUDE.md).\n * Boundary markers: Absent.\n * Capability inventory: Reporting findings and influencing 'Verdicts' for code approval/merge decisions.\n * Sanitization: Absent.
Recommendations
- AI detected serious security threats
Audit Metadata