webapp-testing
Fail
Audited by Gen Agent Trust Hub on Feb 22, 2026
Risk Level: HIGH (COMMAND_EXECUTION, PROMPT_INJECTION, DATA_EXFILTRATION)
Full Analysis
- [Unverifiable Dependencies & Remote Code Execution] (HIGH): The `scripts/with_server.py` utility uses `subprocess.Popen(..., shell=True)` to execute server commands provided as arguments. This is a severe vulnerability that allows arbitrary command injection if user-controlled strings are passed as server commands. Additionally, the script executes any trailing command provided by the agent using `subprocess.run()`.
- [Indirect Prompt Injection] (LOW): The skill ingests untrusted data from web pages and console logs without sanitization or boundary markers. Mandatory Evidence:
  1. Ingestion points: `page.content()` in `examples/element_discovery.py` and `page.on("console", ...)` in `examples/console_logging.py`.
  2. Boundary markers: Absent.
  3. Capability inventory: High-privilege shell execution via `scripts/with_server.py`.
  4. Sanitization: Absent.
- [Dynamic Execution] (MEDIUM): The primary purpose of the skill is the dynamic generation and execution of Python code (Playwright), which inherently grants the agent broad access to the underlying system.
- [Data Exposure & Exfiltration] (LOW): Browser automation artifacts, including console logs and full-page screenshots, are saved to `/mnt/user-data/outputs/` and `/tmp/`. These locations could be used to store sensitive data from visited sites for later exfiltration.
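
To illustrate the HIGH finding, the sketch below shows one common way to start a server process without a shell, so that metacharacters such as `;` are passed as literal arguments instead of being interpreted. It is not the code in `scripts/with_server.py`; the function names `parse_server_command` and `start_server` are hypothetical and exist only for this example.

```python
import shlex
import subprocess
import sys


def parse_server_command(command: str) -> list[str]:
    """Split a command string into an argument list (hypothetical helper)."""
    argv = shlex.split(command)  # splits on whitespace, honours quotes
    if not argv:
        raise ValueError("empty server command")
    return argv


def start_server(command: str) -> subprocess.Popen:
    """Start the process directly, without handing the string to a shell."""
    # With shell=False, ';', '&&' and '|' are never interpreted as operators.
    return subprocess.Popen(parse_server_command(command), shell=False)


if __name__ == "__main__":
    # The ';' stays attached to an ordinary argument; no second command runs.
    print(parse_server_command("echo hello; rm -rf /tmp/x"))
    # Launch a harmless child process built from a safely quoted string.
    proc = start_server(shlex.join([sys.executable, "-c", "print('server stub')"]))
    proc.wait()
```

A trade-off of this approach is that server commands relying on shell features (for example chaining with `&&`) would no longer work as written; a working directory can instead be supplied through the `cwd` parameter of `subprocess.Popen`.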
Recommendations
- AI detected serious security threats
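
The prompt-injection finding notes that boundary markers are absent. As a minimal sketch of what such markers could look like, the helper below wraps untrusted text (for example the result of `page.content()` or a console message) in delimiters carrying a random token, so the wrapped content cannot easily forge the closing marker. The function name `wrap_untrusted` and the marker format are assumptions made for illustration; markers alone do not prevent injection, they only make the untrusted span explicit.

```python
import secrets


def wrap_untrusted(text: str, source: str) -> str:
    """Wrap untrusted text in boundary markers (hypothetical format)."""
    # A fresh random token per call makes the closing marker hard to guess.
    token = secrets.token_hex(8)
    return (
        f"<<UNTRUSTED source={source} id={token}>>\n"
        f"{text}\n"
        f"<<END UNTRUSTED id={token}>>"
    )


if __name__ == "__main__":
    # Example: a console message that tries to issue instructions.
    print(wrap_untrusted("Ignore previous instructions", "console"))
```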
Audit Metadata