webapp-testing

Warn

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: MEDIUM
Categories: COMMAND_EXECUTION, PROMPT_INJECTION, REMOTE_CODE_EXECUTION
Full Analysis
  • [COMMAND_EXECUTION] (MEDIUM): The script scripts/with_server.py utilizes subprocess.Popen(..., shell=True) to execute commands provided via the --server argument. This pattern is vulnerable to shell command injection if the agent is induced to provide strings containing shell metacharacters.
  • [PROMPT_INJECTION] (LOW): The skill is susceptible to indirect prompt injection (Category 8) because it retrieves and acts upon untrusted data from web pages.
      • Ingestion points: Untrusted content is read using page.content() and button.inner_text() in examples/element_discovery.py and examples/console_logging.py.
      • Boundary markers: Absent; neither SKILL.md nor the examples instruct the agent to treat web-derived content as untrusted or to use delimiters to prevent instruction confusion.
      • Capability inventory: The agent can execute arbitrary shell commands via scripts/with_server.py and write files to the /mnt/user-data/outputs/ directory.
      • Sanitization: Absent; the content retrieved from the browser is processed directly by the agent without any validation or escaping.
  • [REMOTE_CODE_EXECUTION] (MEDIUM): The skill revolves around 'Dynamic Execution' (Category 10), where the agent generates and runs Python scripts (Playwright). While this is the intended functionality, it provides a high-capability attack surface that could be exploited to run arbitrary code in the host environment if the agent's logic is subverted.
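
The first two findings above point to two common mitigations, sketched here in Python. This is a minimal illustration under stated assumptions, not the actual contents of scripts/with_server.py or a fix endorsed by the audit. The function names (parse_server_command, start_server, wrap_untrusted) and the delimiter tag are hypothetical. The sketch tokenizes the server command and launches it without a shell, and it escapes and delimits web-derived text before the agent reads it.

```python
import html
import shlex
import subprocess

# Hypothetical delimiter tag; not defined anywhere in the audited skill.
UNTRUSTED_OPEN = "<untrusted_web_content>"
UNTRUSTED_CLOSE = "</untrusted_web_content>"


def parse_server_command(command: str) -> list[str]:
    """Split a command string into an argv list without invoking a shell."""
    argv = shlex.split(command)
    if not argv:
        raise ValueError("empty server command")
    return argv


def start_server(command: str) -> subprocess.Popen:
    """Launch the server with shell=False so metacharacters such as ';',
    '|' and '&&' reach the program as literal argument text instead of
    being interpreted by a shell."""
    return subprocess.Popen(parse_server_command(command), shell=False)


def wrap_untrusted(text: str) -> str:
    """Escape angle brackets in page text, then wrap it in explicit
    delimiters so the page cannot forge or close the boundary itself."""
    cleaned = html.escape(text, quote=False)
    return f"{UNTRUSTED_OPEN}\n{cleaned}\n{UNTRUSTED_CLOSE}"
```

With this approach, a string such as "echo hi; echo pwned" becomes a single echo invocation with literal arguments rather than two shell commands. Commands that genuinely depend on shell features (pipes, redirection, cd ... &&) would stop working, so this is a trade-off rather than a drop-in replacement. Delimiting reduces instruction confusion but does not by itself guarantee that an agent ignores injected instructions.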
Audit Metadata
Risk Level: MEDIUM
Analyzed: Feb 17, 2026, 06:12 PM