testing-patterns

Fail

Audited by Gen Agent Trust Hub on Feb 15, 2026

Risk Level: HIGHCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONPROMPT_INJECTION
Full Analysis
  • PROMPT_INJECTION (HIGH): The skill is highly vulnerable to Indirect Prompt Injection. It ingests YAML specifications from the workspace that are treated as direct instructions for a sub-agent. Ingestion point: The run-tests.md command reads all YAML files in tests/specs/ using the find command. Boundary markers: None are used to separate test data from instructions; the agent is simply told to 'Read the spec' and 'Execute'. Capability inventory: The generated test-runner agent is granted high-privilege tools including Bash, WebFetch, and Read. Sanitization: No validation or escaping is performed on the command field within the test specs before passing it to the shell.
  • COMMAND_EXECUTION (HIGH): The system is designed to execute arbitrary shell commands provided in YAML files. The test-agent.md template uses forceful language ('CRITICAL', 'MUST') to compel the agent to execute these commands directly, which can override the agent's internal safety guidelines.
  • REMOTE_CODE_EXECUTION (HIGH): Since the agent is explicitly encouraged to use Bash and WebFetch to interact with 'live systems', an attacker can easily include a spec that downloads a malicious payload or exfiltrates sensitive environment variables (like API keys) to an external server.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 15, 2026, 08:25 PM