webapp-testing
Fail
Audited by Gen Agent Trust Hub on Feb 21, 2026
Risk Level: HIGH (COMMAND_EXECUTION, DATA_EXFILTRATION, PROMPT_INJECTION)
Full Analysis
- [COMMAND_EXECUTION] (HIGH): The helper script scripts/with_server.py utilizes subprocess.Popen with shell=True to execute server commands and subprocess.run to execute automation commands. This provides a direct path for arbitrary shell execution if the agent constructs commands based on untrusted user input.
- [DATA_EXFILTRATION] (MEDIUM): The skill leverages Playwright to capture screenshots, DOM content, and console logs. Specifically, examples/static_html_automation.py demonstrates using the file:// protocol to access local files, which could be exploited to read sensitive system configuration or data files.
- [PROMPT_INJECTION] (LOW): The skill is designed to ingest and process untrusted external data (web page content) during its 'Reconnaissance-Then-Action' phase. Malicious instructions embedded in HTML could trigger indirect prompt injection, although the impact is mitigated by the local nature of the testing toolkit.
- [METADATA_POISONING] (MEDIUM): In SKILL.md, the instructions explicitly tell the agent 'DO NOT read the source' of the helper scripts until after execution. This adversarial instruction attempts to bypass the agent's ability to inspect potentially malicious logic within its own tools.
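
To make the COMMAND_EXECUTION finding concrete, the sketch below contrasts shell=True with passing an argument list. It is an illustration only and is not the code in scripts/with_server.py. The helper name run_without_shell and its timeout parameter are hypothetical, and POSIX-style quoting is assumed.

```python
import shlex
import subprocess
import sys


def run_without_shell(command: str, timeout: float = 30.0) -> subprocess.CompletedProcess:
    """Run a command string as an argument list, never through a shell.

    Hypothetical helper for illustration; it is not part of the audited skill.
    """
    # shlex.split tokenizes the string with POSIX quoting rules, so characters
    # such as ';' or '&&' become plain arguments rather than shell operators.
    argv = shlex.split(command)
    if not argv:
        raise ValueError("empty command")
    # shell=False is the default; passing a list means no shell parses the input.
    return subprocess.run(
        argv, capture_output=True, text=True, timeout=timeout, check=False
    )


if __name__ == "__main__":
    # Under shell=True the trailing '; echo INJECTED' would run as a second
    # command. Here it only shows up as extra arguments to the child process.
    python = shlex.quote(sys.executable)
    result = run_without_shell(
        f"{python} -c 'import sys; print(sys.argv[1:])' ; echo INJECTED"
    )
    print(result.stdout.strip())
```

Avoiding the shell removes only shell interpretation. It does not make an untrusted command safe to run, so an allowlist of permitted executables would likely still be needed.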
Recommendations
- AI detected serious security threats
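
For the file:// exposure described in the DATA_EXFILTRATION finding, one possible mitigation is to check each URL against an allowed directory before a browser is asked to load it. The following is a minimal sketch under stated assumptions: is_allowed_file_url and allowed_root are hypothetical names that do not appear in the skill, and POSIX paths are assumed.

```python
from pathlib import Path
from urllib.parse import unquote, urlparse


def is_allowed_file_url(url: str, allowed_root: Path) -> bool:
    """Return True only for file:// URLs that resolve inside allowed_root.

    Hypothetical helper for illustration; it is not part of the audited skill.
    """
    parsed = urlparse(url)
    if parsed.scheme != "file":
        return False
    # Reject file URLs that name a remote host.
    if parsed.netloc not in ("", "localhost"):
        return False
    # Decode percent-escapes, then resolve '..' segments and symlinks so a
    # traversal such as fixtures/../../etc/passwd cannot escape the root.
    target = Path(unquote(parsed.path)).resolve()
    root = allowed_root.resolve()
    return target == root or root in target.parents


if __name__ == "__main__":
    root = Path("/srv/fixtures")  # assumed location of test HTML files
    for candidate in (
        "file:///srv/fixtures/index.html",
        "file:///srv/fixtures/../../etc/passwd",
        "http://example.com/",
    ):
        print(candidate, is_allowed_file_url(candidate, root))
```

A caller would run this check before handing the URL to Playwright and refuse to navigate when it returns False.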
Audit Metadata