webapp-testing
Warn
Audited by Gen Agent Trust Hub on Feb 27, 2026
Risk Level: MEDIUM (COMMAND_EXECUTION, PROMPT_INJECTION, REMOTE_CODE_EXECUTION)
Full Analysis
- [PROMPT_INJECTION]: The skill documentation explicitly instructs the agent not to read the source code of the provided scripts and to treat them as "black boxes" to avoid context pollution. This instruction overrides the standard security practice of verifying the logic of an executable script before running it.
- [COMMAND_EXECUTION]: The helper script scripts/with_server.py is designed to execute arbitrary shell commands provided via the --server flag. This provides a direct mechanism for executing shell-level commands such as starting web servers, but it could be abused to execute malicious system commands if the input is manipulated.
- [REMOTE_CODE_EXECUTION]: The skill's intended use case involves the agent writing and executing custom Python scripts using the Playwright library. While necessary for the skill's purpose, this represents a high-privilege capability for dynamic code execution in the host environment.
- [PROMPT_INJECTION]: The reconnaissance-then-action pattern requires the agent to read and identify selectors from rendered web content. This creates an indirect prompt injection surface where a malicious or compromised web application could include hidden instructions or deceptive elements in its HTML or console logs to influence the agent's behavior.
  - Ingestion points: The agent ingests data via page.content(), page.locator().all(), and browser console logs.
  - Boundary markers: There are no explicit boundary markers or instructions provided to the agent to treat ingested web content as untrusted data.
  - Capability inventory: The agent has access to arbitrary command execution via with_server.py and local file writing via Playwright's screenshot functionality.
  - Sanitization: No sanitization or validation logic is specified for the data extracted from the target web applications before it is used to generate further automation logic.
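
Two of the gaps noted above, the unvalidated --server command and the absence of boundary markers around ingested page content, could be narrowed with small helpers along the following lines. This is a minimal sketch and not part of the audited skill: the function names, the allowlist contents, and the marker strings are all hypothetical and would need tuning for a real deployment.

```python
import shlex

# Hypothetical allowlist of executables the agent may launch as servers.
ALLOWED_SERVER_BINARIES = {"npm", "node", "python", "python3"}

# Characters that would let one shell command chain into another.
SHELL_METACHARACTERS = set(";&|`$<>\n")

# Hypothetical boundary markers for content scraped from the target page.
BEGIN_MARKER = "<<<UNTRUSTED_WEB_CONTENT source={source}>>>"
END_MARKER = "<<<END_UNTRUSTED_WEB_CONTENT>>>"


def validate_server_command(command: str) -> list[str]:
    """Return an argv list for an allowlisted command, or raise ValueError."""
    if any(ch in SHELL_METACHARACTERS for ch in command):
        raise ValueError(f"shell metacharacters are not allowed: {command!r}")
    argv = shlex.split(command)  # raises ValueError on unbalanced quotes
    if not argv or argv[0] not in ALLOWED_SERVER_BINARIES:
        raise ValueError(f"executable is not allowlisted: {command!r}")
    return argv


def wrap_untrusted(content: str, source: str, max_chars: int = 20_000) -> str:
    """Clean scraped content and fence it with explicit boundary markers."""
    # Drop non-printable control characters, keeping newlines and tabs.
    cleaned = "".join(ch for ch in content if ch.isprintable() or ch in "\n\t")
    # Prevent the page from forging the closing marker to escape the fence.
    cleaned = cleaned.replace(END_MARKER, "[marker removed]")
    truncated = cleaned[:max_chars]
    return f"{BEGIN_MARKER.format(source=source)}\n{truncated}\n{END_MARKER}"
```

The argv list returned by validate_server_command is meant to be passed to subprocess.Popen without shell=True, so the command is never interpreted by a shell. Under this sketch, compound commands that rely on shell operators are rejected outright, so a working directory would have to be supplied separately (for example through the cwd argument of subprocess.Popen). Wrapping output from page.content() or console logs with wrap_untrusted does not make the content safe by itself; it only gives the agent an explicit, unforgeable boundary it can be told to treat as data rather than instructions.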
Audit Metadata