gh-address-comments
Fail
Audited by Gen Agent Trust Hub on Feb 27, 2026
Risk Level: HIGHPROMPT_INJECTIONCOMMAND_EXECUTIONDATA_EXFILTRATION
Full Analysis
- [PROMPT_INJECTION]: The
SKILL.mdfile contains instructions that direct the agent to bypass security constraints. Specifically, it tells the agent to rerun commands withsandbox_permissions=require_escalatedif initial attempts are blocked by sandboxing, which acts as an override marker for safety filters. - [COMMAND_EXECUTION]: The script
scripts/fetch_comments.pyexecutes the GitHub CLI (gh) viasubprocess.run. The skill instructions command the agent to use "elevated network access" for these operations, acquiring more permissions than typically required for basic API interactions. - [DATA_EXFILTRATION]: The skill fetches and displays content from GitHub Pull Requests, including comments and reviews. If used on private repositories, this exposes potentially sensitive code and discussion to the agent's context.
- [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection because it processes untrusted content from external sources (GitHub PR comments).
- Ingestion points: Pull request comments, reviews, and discussion threads are fetched in
scripts/fetch_comments.pyand presented to the agent. - Boundary markers: No boundary markers or "ignore embedded instructions" warnings are used when presenting the external data.
- Capability inventory: The skill instructions include a step to "Apply fixes," suggesting the agent has the capability to modify the local filesystem and execute code or git commands based on the comment content.
- Sanitization: There is no evidence of sanitization, filtering, or validation of the fetched comments before they are processed by the agent.
Recommendations
- AI detected serious security threats
Audit Metadata