review-pr
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: HIGHPROMPT_INJECTIONCOMMAND_EXECUTION
Full Analysis
- [PROMPT_INJECTION] (HIGH): The skill contains explicit instructions to override safety constraints and human-in-the-loop controls. Multiple directives such as 'no pausing for user confirmation', 'immediately reply... and resolve', and 'Don't wait for user input' are used to force the agent into an autonomous, unverified state.\n- [COMMAND_EXECUTION] (HIGH): The skill is directed to 'Implement the fix' autonomously. This grants the agent the power to modify files and execute git/shell commands based on PR comment content. Because PR comments are untrusted external data, this allows for the execution of arbitrary code or malicious repository modifications.\n- [PROMPT_INJECTION] (HIGH): Indirect Prompt Injection Surface Detection:\n
- Ingestion points: External Pull Request comments fetched via
gh apiin Step 2.\n - Boundary markers: Absent. The agent processes raw comment bodies without any isolation or 'ignore instructions' delimiters.\n
- Capability inventory: Ability to read/write files, execute
ghCLI for API interactions, and commit/push code changes.\n - Sanitization: Absent. There is no logic to filter or validate that the PR comment content does not contain instructions targeting the agent.
Recommendations
- AI detected serious security threats
Audit Metadata