run-test-plan
Pass
Audited by Gen Agent Trust Hub on Apr 9, 2026
Risk Level: SAFECOMMAND_EXECUTIONDATA_EXFILTRATION
Full Analysis
- [COMMAND_EXECUTION]: The skill reads a YAML test plan from
docs/testing/test-plan.yamland executes shell commands specified in thesetup.commandsfield. This allows for the execution of arbitrary commands defined within a local configuration file. - [COMMAND_EXECUTION]: During the test execution phase (Step 4b), the skill runs
curlcommands andagent-browserCLI actions using parameters (methods, URLs, headers, and bodies) sourced directly from the YAML test plan. - [DATA_EXFILTRATION]: The skill performs
curlrequests to URLs defined in the test plan. A maliciously crafted test plan could point these requests to external attacker-controlled servers to exfiltrate data. - [DATA_EXFILTRATION]: Upon test failure (Step 6), the skill executes
git diffto identify changes and captures the output, along with potentially sensitive API responses, into a failure report file (docs/testing/evidence/<test.id>-failure.md). - [COMMAND_EXECUTION]: The skill has a significant attack surface for indirect prompt injection via the
docs/testing/test-plan.yamlfile. - Ingestion points: The YAML test plan is read in Step 1.
- Boundary markers: None; the skill lacks delimiters or instructions to ignore embedded commands in the YAML.
- Capability inventory: Execution of shell commands, background process management (
nohup), network requests (curl), and browser automation (agent-browser). - Sanitization: While the skill uses
yaml.safe_loadto parse the file, it performs no validation or sanitization of the commands or URLs contained within it before execution.
Audit Metadata