webapp-testing
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: HIGHCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONPROMPT_INJECTION
Full Analysis
- [PROMPT_INJECTION] (MEDIUM): The skill contains a directive telling the agent 'DO NOT read the source until you try running the script first' and to treat scripts as 'black boxes'. This is an obfuscation-by-instruction tactic that discourages the agent from performing security audits of the code it executes.
- [COMMAND_EXECUTION] (HIGH): The
scripts/with_server.pyhelper is designed to take arbitrary strings and execute them as shell commands (e.g., via the--serverflag). This provides a direct path for executing unsanitized commands. - [REMOTE_CODE_EXECUTION] (HIGH): The core workflow requires the agent to 'write native Python Playwright scripts' and execute them. Any compromise of the agent's logic or input would lead to arbitrary Python code execution on the host.
- [DATA_EXPOSURE] (MEDIUM): The skill allows the agent to capture screenshots (
/tmp/inspect.png) and DOM content (page.content()) from local services. If the agent is directed to a sensitive local administrative panel, this data could be exposed or exfiltrated. - [INDIRECT_PROMPT_INJECTION] (HIGH):
- Ingestion points: The agent reads untrusted data via
page.content(),page.locator().all(), and browser console logs (console_logging.py). - Boundary markers: Absent. There are no instructions to ignore embedded commands in the web pages being tested.
- Capability inventory: The agent has the capability to execute shell commands (
with_server.py) and run generated Python scripts. - Sanitization: Absent. Data read from the browser is used to 'identify selectors' and 'execute actions' without validation.
- Risk: A malicious web page could contain hidden instructions that trick the agent into executing destructive shell commands or exfiltrating local files.
Recommendations
- AI detected serious security threats
Audit Metadata