adversarial-tester

Fail

Audited by Gen Agent Trust Hub on Mar 12, 2026

Risk Level: HIGHCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill explicitly directs the agent to generate and then execute code locally (SKILL.md Step 5 and break-it-prompt.md Step 6). This workflow allows for arbitrary command execution on the host system via the generated test scripts.
  • [PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection because it ingests untrusted git diff output and uses it to drive reasoning and code generation.
  • Ingestion points: Implementation changes (git diffs) are ingested into the agent context via the break-it-prompt.md template.
  • Boundary markers: No delimiters or instructions are used to ensure the agent ignores malicious instructions embedded within the code changes or comments in the diff.
  • Capability inventory: The skill has the capability to write to the filesystem and execute scripts locally (SKILL.md Step 4 and 5).
  • Sanitization: The ingested diff data is not validated or sanitized before being processed to generate executable code.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Mar 12, 2026, 10:34 PM