webapp-testing

Warn

Audited by Gen Agent Trust Hub on Mar 5, 2026

Risk Level: MEDIUMCOMMAND_EXECUTIONREMOTE_CODE_EXECUTION
Full Analysis
  • [COMMAND_EXECUTION]: The script scripts/with_server.py uses subprocess.Popen(shell=True) and subprocess.run() to execute server commands and automation scripts provided as command-line arguments. While intended for local development and testing, this pattern allows for the execution of any system command.
  • [REMOTE_CODE_EXECUTION]: The skill is designed to write and execute native Python Playwright scripts at runtime. While these scripts are generated by the agent based on the user's local files, the ability to dynamically create and run executable code is a high-privilege operation.
  • [DATA_EXFILTRATION]: The skill captures browser screenshots and console logs, saving them to /tmp/ and /mnt/user-data/outputs/. If the agent were directed to navigate to a site containing sensitive information, these files would contain that data.
  • [INDIRECT_PROMPT_INJECTION]: The skill's primary function is to browse and interact with web applications. It implements a 'Reconnaissance-Then-Action' pattern where it reads the DOM content of a page to identify selectors. This provides a surface where malicious instructions embedded in a web page (e.g., in hidden HTML elements or console logs) could potentially influence the agent's subsequent actions.
  • Ingestion points: page.content(), page.locator().all(), and page.on("console", ...) in examples/element_discovery.py and examples/console_logging.py.
  • Boundary markers: None identified in the provided scripts or instructions.
  • Capability inventory: Full shell command execution via scripts/with_server.py and file writing capabilities.
  • Sanitization: None identified; the skill directly processes page content and console output to determine the next steps in automation.
Audit Metadata
Risk Level
MEDIUM
Analyzed
Mar 5, 2026, 01:13 AM