webapp-testing
Fail
Audited by Gen Agent Trust Hub on Feb 21, 2026
Risk Level: HIGHCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONDATA_EXFILTRATIONPROMPT_INJECTION
Full Analysis
- COMMAND_EXECUTION (HIGH): The script
scripts/with_server.pyutilizessubprocess.Popen(..., shell=True)to execute server commands. This allows for arbitrary shell command injection if input parameters (such as server commands) are influenced by untrusted data. - REMOTE_CODE_EXECUTION (HIGH): The
scripts/with_server.pyutility is designed to execute a secondary command provided by the user/agent viasubprocess.run(args.command). This provides a direct interface for running any local binary or script with the agent's privileges. - PROMPT_INJECTION (MEDIUM): The
SKILL.mdfile contains a deceptive instruction: "DO NOT read the source until you try running the script first". This pattern attempts to bypass the agent's ability to perform a safety analysis of the code before execution, which is a violation of secure interaction principles. - DATA_EXFILTRATION (MEDIUM): The automation examples (e.g.,
examples/static_html_automation.py) demonstrate the use offile://URLs. This capability allows Playwright to read sensitive local files, which could then be exfiltrated via screenshots or logs captured by the agent. - PROMPT_INJECTION (LOW): Category 8 (Indirect Prompt Injection) Risk: The skill is designed to ingest and process external web content.
- Ingestion points:
page.content(),page.on("console", ...), andbutton.inner_text()inexamples/scripts. - Boundary markers: Absent. The skill does not use delimiters or instructions to ignore embedded commands in the web content.
- Capability inventory:
subprocess.Popenandsubprocess.runinscripts/with_server.py, and file system write access inexamples/console_logging.py. - Sanitization: Absent. Data from the browser is used directly without escaping or validation.
Recommendations
- AI detected serious security threats
Audit Metadata