implement
Fail
Audited by Gen Agent Trust Hub on May 4, 2026
Risk Level: HIGHPROMPT_INJECTIONREMOTE_CODE_EXECUTIONCOMMAND_EXECUTION
Full Analysis
- [PROMPT_INJECTION]: Deceptive instructions and autonomy abuse.
- Evidence: There is a critical contradiction between the user-facing documentation and the skill's internal logic. The
README.mdstates the skill "Never commits without explicit user approval," but theSKILL.mdinstructions explicitly command the agent to "Commit automatically after verification passes" and specifically state: "Do not ask for approval before committing." This is a direct attempt to bypass human-in-the-loop safety controls. - [PROMPT_INJECTION]: Indirect prompt injection vulnerability.
- Ingestion points: The skill reads instructions from files in
.specs/[feature-slug]/, specificallyPRODUCT-REQUIREMENTS.mdandIMPLEMENTATION-PLAN.md(SKILL.md, Step 1). - Boundary markers: Absent; the agent is not instructed to treat these external files as untrusted or to ignore embedded instructions.
- Capability inventory: The skill has permissions to write files, execute the project's test suite, run shell commands, and commit to git.
- Sanitization: Absent; requirements are translated directly into executable code and test cases.
- [REMOTE_CODE_EXECUTION]: Execution of dynamically generated code.
- Evidence: The skill implements a Red-Green-Refactor cycle where it writes new test files and implementation code to the local file system and then immediately executes them via the project's test runner (
SKILL.md, Step 2). This creates a path for executing malicious code generated from injected instructions in external documents. - [COMMAND_EXECUTION]: Bypassing manual gates for shell operations.
- Evidence: The skill executes
git log,git commit, and project-configured linters or formatters (SKILL.md, Step 4). Per the internal instructions, these are performed automatically and without the user's consent or approval, contradicting the manual gate policy claimed in the documentation.
Recommendations
- AI detected serious security threats
Audit Metadata