test-triage

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: HIGHCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION] (HIGH): The workflow explicitly instructs the agent to "Run the requested test command". This allows for arbitrary shell command execution within the environment.
  • [PROMPT_INJECTION] (HIGH): The skill is highly vulnerable to Indirect Prompt Injection (Category 8).
  • Ingestion points: The agent reads "failure output" and "relevant code paths" (SKILL.md).
  • Boundary markers: No delimiters or instructions to ignore embedded commands in the external data are present.
  • Capability inventory: The agent can run subprocesses via test commands and has the authority to "Implement the smallest sensible fix", which involves file writes (SKILL.md).
  • Sanitization: No sanitization or validation of the ingested test output or code content is mentioned.
  • Risk: An attacker could craft a malicious test failure message or code comment that, when processed by the agent, triggers unauthorized command execution or data exfiltration.
  • [DATA_EXPOSURE] (MEDIUM): By design, the skill inspects local source code and command outputs, which can inadvertently lead to the exposure of sensitive environment variables or hardcoded secrets to the agent's context during the triage process.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 16, 2026, 12:38 PM