healtests
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: HIGH (COMMAND_EXECUTION, PROMPT_INJECTION)
Full Analysis
- Command Execution (MEDIUM): The skill executes `just testq` in the user's local environment. While intended for testing, this grants the agent the ability to run arbitrary code defined in the project's task runner.
- Indirect Prompt Injection (HIGH): The skill is designed to ingest and analyze untrusted data (test failure output) and has significant write and execute capabilities. Evidence Chain:
  1. Ingestion point: Step 3 instructs the agent to 'Analyze the test failure output'.
  2. Boundary markers: Absent; there are no delimiters or instructions to ignore embedded commands within the test results.
  3. Capability inventory: The skill can execute shell commands via `just` and perform planning for file modifications using `EnterPlanMode`.
  4. Sanitization: Absent; the agent directly processes the raw output of the failing tests.

  An attacker could craft a test that, upon failure, emits instructions that the agent would then follow.
Recommendations
- AI detected serious security threats
Audit Metadata