generate-tests
Fail
Audited by Gen Agent Trust Hub on Apr 8, 2026
Risk Level: HIGH (COMMAND_EXECUTION, REMOTE_CODE_EXECUTION)
Full Analysis
- [COMMAND_EXECUTION]: The skill executes shell commands using `npx playwright test` in `SKILL.md` and `fix-prompt.md`. The path to the test file is derived from `test-plan.md`, which is a local file that could be modified by previous agent steps or external processes.
- [COMMAND_EXECUTION]: In `fix-prompt.md`, the skill uses `curl [BASE_URL]` to check the status of an application. This involves network interaction based on an interpolated variable.
- [REMOTE_CODE_EXECUTION]: The skill implements a dynamic code generation pattern in Phase 1 of `SKILL.md`. It creates a TypeScript Playwright test file by interpolating selectors and URLs from `docs/playwright-spec-testing/exploration/[SCENARIO_SLUG].md` directly into a code template. The instructions explicitly state to use selectors 'EXACTLY' as they appear in the report without paraphrasing. If the exploration data (captured from an external web application) contains malicious payload strings designed to break out of TypeScript string literals (e.g., `'); require('child_process').exec('...'); //`), the injected code will be executed with the privileges of the agent during Phase 2.
- [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection (Category 8).
  - Ingestion points: Data enters the context from `docs/playwright-spec-testing/exploration/[SCENARIO_SLUG].md` and `docs/playwright-spec-testing/test-plan.md`.
  - Boundary markers: None are present; the skill treats content from these files as authoritative instructions for selector names and test steps.
  - Capability inventory: The agent has the capability to write files (Phase 1) and execute them via the shell (Phase 2).
  - Sanitization: There is no evidence of sanitization or escaping of the strings read from the exploration reports before they are interpolated into the test script.
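The string-literal breakout described in the REMOTE_CODE_EXECUTION finding can be illustrated with a minimal sketch. The helper names `toTsStringLiteral` and `renderClickStep` are hypothetical and do not come from the skill; the sketch assumes the generator emits one `page.locator(...)` line per selector and uses `JSON.stringify` to encode each untrusted selector as a single quoted literal.

```typescript
// Hypothetical helpers for illustration only; not part of the audited skill.

// Encode an untrusted string as a double-quoted JS/TS string literal.
// JSON.stringify escapes double quotes, backslashes and control characters,
// so the value cannot terminate the literal it is placed in.
function toTsStringLiteral(untrusted: string): string {
  return JSON.stringify(untrusted);
}

// Render one generated test step for a selector taken from an exploration report.
function renderClickStep(selector: string): string {
  return `await page.locator(${toTsStringLiteral(selector)}).click();`;
}

// The payload quoted in the audit finding.
const payload = "'); require('child_process').exec('...'); //";

// Naive interpolation into a single-quoted literal: the payload closes the
// string early and the remainder becomes executable code in the test file.
const unsafeStep = `await page.locator('${payload}').click();`;

// Escaped interpolation: the payload stays inside one string literal.
const safeStep = renderClickStep(payload);

console.log(unsafeStep);
console.log(safeStep);
```

Escaping of this kind only addresses the literal breakout; it does not make the selector itself trustworthy, so the prompt-injection concerns listed above would still apply.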
Recommendations
- AI detected serious security threats
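One possible mitigation for the COMMAND_EXECUTION finding on the test path read from `test-plan.md` is to validate the path against an allowlist and pass it as a discrete argument rather than as part of a shell string. The sketch below is an assumption, not the skill's behavior: `SAFE_TEST_PATH` and `buildPlaywrightArgs` are hypothetical names, and the allowed character set and `.spec.ts` suffix are illustrative choices.

```typescript
// Hypothetical allowlist: a relative path made of conservative characters
// that ends in ".spec.ts". Adjust to the project's real layout.
const SAFE_TEST_PATH = /^[A-Za-z0-9_\-.\/]+\.spec\.ts$/;

// Build an argument vector for `npx` from an untrusted test path.
// Throws instead of returning arguments when the path looks unexpected.
function buildPlaywrightArgs(testPath: string): string[] {
  const segments = testPath.split("/");
  const rejected =
    !SAFE_TEST_PATH.test(testPath) || // shell metacharacters, spaces, wrong suffix
    testPath.startsWith("/") ||       // absolute paths
    testPath.startsWith("-") ||       // values that would parse as CLI options
    segments.includes("..");          // directory traversal
  if (rejected) {
    throw new Error(`Refusing unexpected test path: ${JSON.stringify(testPath)}`);
  }
  // Each element is a separate argument, intended for a non-shell spawn API.
  return ["playwright", "test", testPath];
}

console.log(buildPlaywrightArgs("tests/login.spec.ts"));
```

The returned array is meant to be handed to an API that does not invoke a shell, such as Node's `child_process.execFile("npx", args)`, so that the path is never re-parsed as shell syntax.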
Audit Metadata