webapp-testing

Fail

Audited by Gen Agent Trust Hub on Feb 22, 2026

Risk Level: HIGH (COMMAND_EXECUTION, PROMPT_INJECTION, DATA_EXFILTRATION)
Full Analysis
  • [Unverifiable Dependencies & Remote Code Execution] (HIGH): The scripts/with_server.py utility uses subprocess.Popen(..., shell=True) to execute server commands provided as arguments. Because shell=True passes the command string to a shell for interpretation, any user-controlled text in a server command can inject arbitrary commands. The script also executes any trailing command supplied by the agent using subprocess.run().
  • [Indirect Prompt Injection] (LOW): The skill ingests untrusted data from web pages and console logs without sanitization or boundary markers. Mandatory Evidence: 1. Ingestion points: page.content() in examples/element_discovery.py and page.on("console", ...) in examples/console_logging.py. 2. Boundary markers: Absent. 3. Capability inventory: High-privilege shell execution via scripts/with_server.py. 4. Sanitization: Absent.
  • [Dynamic Execution] (MEDIUM): The primary purpose of the skill is the dynamic generation and execution of Python code (Playwright), which inherently grants the agent broad access to the underlying system.
  • [Data Exposure & Exfiltration] (LOW): Browser automation artifacts, including console logs and full-page screenshots, are saved to /mnt/user-data/outputs/ and /tmp/. These locations could be used to store sensitive data from visited sites for later exfiltration.
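The command-execution finding above can be illustrated with a minimal sketch. It assumes the server command arrives as a single string, as the finding describes; `parse_server_command` and `run_without_shell` are hypothetical helpers written for this illustration and are not taken from scripts/with_server.py. Splitting the string with `shlex.split` and running the resulting list without a shell means metacharacters such as `;` reach the program as literal arguments instead of starting a second command.

```python
import shlex
import subprocess
from typing import List


def parse_server_command(command: str) -> List[str]:
    """Split a command string into an argument list without invoking a shell.

    Hypothetical helper for illustration; not part of scripts/with_server.py.
    """
    args = shlex.split(command)
    if not args:
        raise ValueError("empty server command")
    return args


def run_without_shell(command: str) -> subprocess.CompletedProcess:
    """Run the parsed command with shell=False so metacharacters stay literal."""
    return subprocess.run(
        parse_server_command(command),
        shell=False,          # no shell parses the string, so ';' and '&&' are plain text
        capture_output=True,
        text=True,
        check=False,
    )


if __name__ == "__main__":
    # The ';' stays attached to 'dev' as an ordinary argument; it is not a command separator.
    print(parse_server_command("npm run dev; echo injected"))
```

This only removes shell interpretation. It does not restrict which executable is launched, so a caller who controls the whole command string can still choose what runs.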
Recommendations
  • AI detected serious security threats
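The indirect prompt injection finding lists boundary markers as absent. As a rough sketch of what such markers could look like, the hypothetical helper below wraps untrusted text (for example the result of page.content() or a console message) in delimiters that carry a random tag. The marker format and the name `wrap_untrusted` are assumptions made for illustration, not something the audited skill provides.

```python
import secrets


def wrap_untrusted(text: str, source: str) -> str:
    """Enclose untrusted text in randomly tagged boundary markers.

    Hypothetical helper; the marker format is an assumption for illustration.
    """
    # A fresh random tag per call means page content cannot predict
    # and forge the closing marker.
    tag = secrets.token_hex(8)
    return (
        f"<<UNTRUSTED {source} {tag}>>\n"
        f"{text}\n"
        f"<<END UNTRUSTED {tag}>>"
    )


if __name__ == "__main__":
    print(wrap_untrusted("Ignore previous instructions.", "console"))
```

Markers of this kind make the untrusted region explicit to whatever consumes the text. They are a mitigation and not a guarantee, so they would normally sit alongside the sanitization the finding also reports as missing.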
Audit Metadata
Risk Level: HIGH
Analyzed: Feb 22, 2026, 02:50 PM