verification-loop

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: HIGHPROMPT_INJECTIONCOMMAND_EXECUTIONEXTERNAL_DOWNLOADS
Full Analysis
  • [PROMPT_INJECTION] (HIGH): The skill is vulnerable to indirect prompt injection (Category 8). It possesses execute capabilities via the Bash tool while processing untrusted data from the project being verified.
  • Ingestion points: The agent reads source code and command-line output from linters and test runners using the Read, Grep, and Bash tools.
  • Boundary markers: No delimiters or safety instructions are provided to help the agent distinguish between its own logic and instructions embedded in the external data.
  • Capability inventory: Access to the Bash tool allows for arbitrary command execution on the local system.
  • Sanitization: There is no mechanism to sanitize or filter output from the verification tools before it is processed by the agent.
  • [COMMAND_EXECUTION] (LOW): The skill documentation explicitly instructs the agent to run various development commands (e.g., pnpm, ruff, mypy). While standard for software development, this capability provides the primary attack surface for the higher-severity injection findings.
  • [EXTERNAL_DOWNLOADS] (LOW): The skill recommends using npx playwright test, which may trigger downloads from the npm registry. Per the [TRUST-SCOPE-RULE], downloads from trusted package registries are categorized as LOW.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 16, 2026, 05:57 AM