gh-address-comments

Fail

Audited by Gen Agent Trust Hub on Feb 27, 2026

Risk Level: HIGHPROMPT_INJECTIONCOMMAND_EXECUTIONDATA_EXFILTRATION
Full Analysis
  • [PROMPT_INJECTION]: The SKILL.md file contains instructions that direct the agent to bypass security constraints. Specifically, it tells the agent to rerun commands with sandbox_permissions=require_escalated if initial attempts are blocked by sandboxing, which acts as an override marker for safety filters.
  • [COMMAND_EXECUTION]: The script scripts/fetch_comments.py executes the GitHub CLI (gh) via subprocess.run. The skill instructions command the agent to use "elevated network access" for these operations, acquiring more permissions than typically required for basic API interactions.
  • [DATA_EXFILTRATION]: The skill fetches and displays content from GitHub Pull Requests, including comments and reviews. If used on private repositories, this exposes potentially sensitive code and discussion to the agent's context.
  • [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection because it processes untrusted content from external sources (GitHub PR comments).
  • Ingestion points: Pull request comments, reviews, and discussion threads are fetched in scripts/fetch_comments.py and presented to the agent.
  • Boundary markers: No boundary markers or "ignore embedded instructions" warnings are used when presenting the external data.
  • Capability inventory: The skill instructions include a step to "Apply fixes," suggesting the agent has the capability to modify the local filesystem and execute code or git commands based on the comment content.
  • Sanitization: There is no evidence of sanitization, filtering, or validation of the fetched comments before they are processed by the agent.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 27, 2026, 05:59 PM