branch-evaluator

Fail

Audited by Gen Agent Trust Hub on Mar 13, 2026

Risk Level: HIGH (COMMAND_EXECUTION, REMOTE_CODE_EXECUTION, PROMPT_INJECTION, EXTERNAL_DOWNLOADS)
Full Analysis
  • [COMMAND_EXECUTION]: Phase 3b of the evaluation workflow in SKILL.md directs the agent to check out candidate branches and run 'the project's test command'. Because the test command and its environment are defined by the branch content (e.g., scripts in package.json, a Makefile, or test configuration files), an attacker can supply a malicious branch that executes arbitrary shell commands on the host system.
  • [REMOTE_CODE_EXECUTION]: The skill facilitates the execution of code from remote, potentially untrusted git branches. If a user points the skill at a malicious repository or branch, the agent will execute any code contained in the branch's test suite during the evaluation process.
  • [PROMPT_INJECTION]: The skill ingests untrusted data from external URLs used for 'Reference implementation plans' and from the content of candidate branches. This data is used to generate logic, checklists, and reports without proper sanitization or boundary markers.
      1. Ingestion points: Reference implementation plan (URL or file) and branch contents (diffs and files).
      2. Boundary markers: Absent.
      3. Capability inventory: Shell execution for git operations and project-defined test scripts.
      4. Sanitization: Absent.
  • [EXTERNAL_DOWNLOADS]: The skill performs network operations to fetch git data and retrieve reference plans from user-provided URLs. Fetching content from unverified external sources to guide agent behavior increases the risk of processing malicious instructions or payloads.
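To illustrate the COMMAND_EXECUTION and REMOTE_CODE_EXECUTION findings above, the following minimal sketch shows how the test command a branch would run can be read for review without checking the branch out or executing anything. It is not part of the audited skill. The function names read_file_from_branch and list_npm_scripts are hypothetical, and the sketch assumes an npm-style project whose test command is defined in package.json; a Makefile or other test configuration would need its own handling.

```python
import json
import subprocess


def read_file_from_branch(repo_dir: str, branch: str, path: str) -> str:
    """Return a file's content at a branch tip without checking the branch out.

    Hypothetical helper: uses `git show <branch>:<path>`, which only reads
    the object database and does not run any code from the branch.
    """
    # Refuse names that git could interpret as command-line options.
    if branch.startswith("-"):
        raise ValueError(f"suspicious branch name: {branch!r}")
    result = subprocess.run(
        ["git", "-C", repo_dir, "show", f"{branch}:{path}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def list_npm_scripts(package_json_text: str) -> dict[str, str]:
    """Parse package.json text and return its string-valued "scripts" entries."""
    data = json.loads(package_json_text)
    if not isinstance(data, dict):
        return {}
    scripts = data.get("scripts", {})
    if not isinstance(scripts, dict):
        return {}
    # Keep only well-formed entries; anything else is ignored, not executed.
    return {name: cmd for name, cmd in scripts.items() if isinstance(cmd, str)}
```

For example, list_npm_scripts(read_file_from_branch(".", "candidate-a", "package.json")) would return the scripts that a candidate branch defines, so a human can inspect the "test" entry before anything runs (the branch name candidate-a is illustrative). Reading the command does not make it safe to execute; it only makes visible that the command comes from branch content.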
Recommendations
  • AI detected serious security threats
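The audit does not prescribe a specific fix. As one possible illustration of the boundary markers that the PROMPT_INJECTION finding reports as absent, the sketch below wraps untrusted text (a fetched reference plan or a branch diff) in randomized delimiters before it is placed in a prompt. The function name wrap_untrusted and the marker format are assumptions made for this example, not part of the skill or of any library. Markers of this kind may reduce the risk of injected instructions being followed, but they do not eliminate it.

```python
import secrets


def wrap_untrusted(text: str, source: str) -> str:
    """Wrap untrusted text in randomized boundary markers.

    Hypothetical helper: the random token makes it impractical for the
    untrusted text to contain a matching closing marker in advance.
    """
    token = secrets.token_hex(8)
    # Regenerate in the (extremely unlikely) case the text contains the token.
    while token in text:
        token = secrets.token_hex(8)
    header = f"<<UNTRUSTED {token} source={source!r}>>"
    footer = f"<<END UNTRUSTED {token}>>"
    return f"{header}\n{text}\n{footer}"
```

A caller would pass the wrapped string to the model together with an instruction to treat everything between the two markers as data only.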
Audit Metadata
Risk Level: HIGH
Analyzed: Mar 13, 2026, 01:37 PM