finishing-a-development-branch
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: HIGHCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONPROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION] (HIGH): The skill explicitly instructs the agent to run project-specific test suites such as
npm test,cargo test, andpytest. These commands often execute arbitrary code defined in configuration files (likepackage.jsonortox.ini) within the repository being worked on. - [PROMPT_INJECTION] (HIGH): The skill is highly susceptible to indirect prompt injection. It processes untrusted data (source code and test outputs) and uses that information to make decisions in Step 3. A malicious codebase could generate crafted test failures or include instructions in code comments that hijack the agent's decision-making process.
- Ingestion points: Project source code, test definitions, and shell command outputs in SKILL.md (Step 1 and Step 4).
- Boundary markers: Absent. The agent is not instructed to ignore embedded instructions within the data it processes.
- Capability inventory: Subprocess execution (test commands), file system modification (git merge/branch), and network operations (git push, gh pr create).
- Sanitization: Absent. There is no escaping or validation of the content processed from the repository.
- [DATA_EXFILTRATION] (MEDIUM): The skill performs network operations via
git pushandgh pr create. While these are standard developer tools, they provide a mechanism to exfiltrate data to remote servers if the agent is directed to a malicious origin.
Recommendations
- AI detected serious security threats
Audit Metadata