webapp-testing
Fail
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: HIGHCOMMAND_EXECUTIONREMOTE_CODE_EXECUTION
Full Analysis
- COMMAND_EXECUTION (HIGH): The script
scripts/with_server.pyusessubprocess.Popenwithshell=Trueto execute strings passed to the--serverargument. This is a classic command injection vulnerability if the input is influenced by untrusted data. - REMOTE_CODE_EXECUTION (HIGH): The skill is designed to have the agent write and execute arbitrary Python Playwright scripts. Combined with the
with_server.pyutility, this provides a direct path for the agent to execute arbitrary system commands and interact with the network. - PROMPT_INJECTION (MEDIUM): The
SKILL.mdfile contains a directive: "DO NOT read the source until you try running the script first". This instruction discourages the agent from performing security inspections of its own tools before execution, which is an adversarial pattern intended to hide malicious logic in scripts likewith_server.py. - INDIRECT_PROMPT_INJECTION (LOW): The skill's core function is web reconnaissance (using
page.content()andpage.screenshot()). This creates a vulnerability surface where a malicious web application could inject instructions into the agent's context. - Ingestion points:
page.goto(url)andpage.content()are used to bring external web data into the agent's reasoning loop (found inSKILL.mdandexamples/element_discovery.py). - Boundary markers: Absent. No instructions are provided to the agent to treat website content as untrusted or to use delimiters.
- Capability inventory: The agent can execute shell commands via
with_server.pyand perform file/network operations via Playwright and standard Python libraries. - Sanitization: Absent. There is no evidence of HTML sanitization or instruction filtering before the agent processes the DOM content.
- DATA_EXFILTRATION (LOW): The skill includes examples like
examples/console_logging.pythat write logs to/mnt/user-data/outputs/, which could be used to store sensitive information captured from local web applications.
Recommendations
- AI detected serious security threats
Audit Metadata