adversarial-tester
Fail
Audited by Gen Agent Trust Hub on Mar 12, 2026
Risk Level: HIGHCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION]: The skill explicitly directs the agent to generate and then execute code locally (SKILL.md Step 5 and break-it-prompt.md Step 6). This workflow allows for arbitrary command execution on the host system via the generated test scripts.
- [PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection because it ingests untrusted
git diffoutput and uses it to drive reasoning and code generation. - Ingestion points: Implementation changes (git diffs) are ingested into the agent context via the break-it-prompt.md template.
- Boundary markers: No delimiters or instructions are used to ensure the agent ignores malicious instructions embedded within the code changes or comments in the diff.
- Capability inventory: The skill has the capability to write to the filesystem and execute scripts locally (SKILL.md Step 4 and 5).
- Sanitization: The ingested diff data is not validated or sanitized before being processed to generate executable code.
Recommendations
- AI detected serious security threats
Audit Metadata