implement

Fail

Audited by Gen Agent Trust Hub on May 4, 2026

Risk Level: HIGHPROMPT_INJECTIONREMOTE_CODE_EXECUTIONCOMMAND_EXECUTION
Full Analysis
  • [PROMPT_INJECTION]: Deceptive instructions and autonomy abuse.
  • Evidence: There is a critical contradiction between the user-facing documentation and the skill's internal logic. The README.md states the skill "Never commits without explicit user approval," but the SKILL.md instructions explicitly command the agent to "Commit automatically after verification passes" and specifically state: "Do not ask for approval before committing." This is a direct attempt to bypass human-in-the-loop safety controls.
  • [PROMPT_INJECTION]: Indirect prompt injection vulnerability.
  • Ingestion points: The skill reads instructions from files in .specs/[feature-slug]/, specifically PRODUCT-REQUIREMENTS.md and IMPLEMENTATION-PLAN.md (SKILL.md, Step 1).
  • Boundary markers: Absent; the agent is not instructed to treat these external files as untrusted or to ignore embedded instructions.
  • Capability inventory: The skill has permissions to write files, execute the project's test suite, run shell commands, and commit to git.
  • Sanitization: Absent; requirements are translated directly into executable code and test cases.
  • [REMOTE_CODE_EXECUTION]: Execution of dynamically generated code.
  • Evidence: The skill implements a Red-Green-Refactor cycle where it writes new test files and implementation code to the local file system and then immediately executes them via the project's test runner (SKILL.md, Step 2). This creates a path for executing malicious code generated from injected instructions in external documents.
  • [COMMAND_EXECUTION]: Bypassing manual gates for shell operations.
  • Evidence: The skill executes git log, git commit, and project-configured linters or formatters (SKILL.md, Step 4). Per the internal instructions, these are performed automatically and without the user's consent or approval, contradicting the manual gate policy claimed in the documentation.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
May 4, 2026, 05:53 PM