finishing-a-development-branch

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: HIGHCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • Prompt Injection (HIGH): Vulnerable to Indirect Prompt Injection (Category 8) due to the combination of high-privilege capabilities and untrusted data processing.
  • Ingestion points: Reads repository code and the stdout/stderr of test commands (e.g., npm test, pytest) in SKILL.md Step 1 and Step 4.
  • Boundary markers: Missing. There are no delimiters or instructions to ignore embedded commands when the agent summarizes changes for the PR body or evaluates test failures.
  • Capability inventory: Includes remote code pushing (git push), PR creation with metadata control (gh pr create), and local command execution (npm test).
  • Sanitization: None. Data from the repository is directly interpolated into shell commands and PR descriptions.
  • Command Execution (MEDIUM): The skill frequently executes shell commands to perform git operations and run test suites. While expected for a development tool, running commands like npm test executes arbitrary project-defined scripts, which poses a risk if the agent is working on an untrusted or compromised codebase.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 16, 2026, 12:40 AM