skills/frizzle-chan/mudd/healtests/Gen Agent Trust Hub

healtests

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: HIGHCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • Command Execution (MEDIUM): The skill executes just testq in the user's local environment. While intended for testing, this grants the agent the ability to run arbitrary code defined in the project's task runner.
  • Indirect Prompt Injection (HIGH): The skill is designed to ingest and analyze untrusted data (test failure output) and has significant write and execute capabilities. Evidence Chain: 1. Ingestion point: Step 3 instructs the agent to 'Analyze the test failure output'. 2. Boundary markers: Absent; there are no delimiters or instructions to ignore embedded commands within the test results. 3. Capability inventory: The skill can execute shell commands via just and perform planning for file modifications using EnterPlanMode. 4. Sanitization: Absent; the agent directly processes the raw output of the failing tests. An attacker could craft a test that, upon failure, emits instructions that the agent would then follow.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 16, 2026, 01:33 PM