webapp-testing

Fail

Audited by Gen Agent Trust Hub on Feb 19, 2026

Risk Level: HIGHCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • PROMPT_INJECTION (HIGH): The documentation in SKILL.md contains a directive explicitly telling the agent: "DO NOT read the source until you try running the script first... They exist to be called directly as black-box scripts." This is a deceptive instruction that attempts to bypass the agent's internal safety reasoning and code-audit capabilities, encouraging it to execute code blindly.
  • COMMAND_EXECUTION (HIGH): The utility script scripts/with_server.py uses subprocess.Popen(..., shell=True) to execute strings passed directly from command-line arguments. This implementation facilitates arbitrary command execution on the host system without any sanitization or validation.
  • INDIRECT_PROMPT_INJECTION (LOW): The skill implements a "Reconnaissance-Then-Action" pattern where the agent reads the DOM of potentially untrusted local web applications via page.content() and page.locator().all(). An attacker-controlled web page could embed hidden instructions in HTML comments or attributes to manipulate the agent's subsequent actions.
  • Ingestion points: SKILL.md (page.content(), page.locator()), examples/element_discovery.py
  • Boundary markers: Absent; no instructions provided to ignore content within the DOM.
  • Capability inventory: Arbitrary command execution via scripts/with_server.py and file-write access via Playwright screenshots.
  • Sanitization: Absent; the agent is instructed to use the raw discovered selectors for actions.
  • DATA_EXFILTRATION (LOW): Multiple scripts (e.g., examples/console_logging.py, examples/static_html_automation.py) write data, including browser logs and screenshots, to shared directories like /mnt/user-data/outputs/ and /tmp/. While not exfiltration in itself, it creates a surface for staging sensitive data for later removal.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 19, 2026, 03:55 PM