code-review
Fail
Audited by Gen Agent Trust Hub on Feb 15, 2026
Risk Level: HIGHPROMPT_INJECTIONCOMMAND_EXECUTION
Full Analysis
- [PROMPT_INJECTION] (HIGH): The skill is highly vulnerable to Indirect Prompt Injection. It requires the agent to ingest feedback from 'External Reviewers' in
references/code-review-reception.mdand subsequently 'IDENTIFY' and 'RUN' verification commands inreferences/verification-before-completion.md. An attacker acting as a reviewer could provide malicious commands disguised as verification steps, which the agent would execute without sanitization. - [COMMAND_EXECUTION] (MEDIUM): The skill explicitly commands the agent to execute arbitrary shell commands for tests and builds, as well as git operations (
git rev-parse,git log). This provides the necessary primitive for an Indirect Prompt Injection attack to escalate to remote code execution within the agent's environment. - [PROMPT_INJECTION] (LOW): The skill utilizes forceful 'Iron Law' steering and specific 'canary' signaling phrases like 'Strange things are afoot at the Circle K' to override standard agent behavior. While intended for quality control, these establish a framework that could be co-opted for stealthy exfiltration or behavioral manipulation.
Recommendations
- AI detected serious security threats
Audit Metadata