webapp-testing
Pass
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: SAFECOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
- COMMAND_EXECUTION (SAFE): The script
scripts/with_server.pyusessubprocess.Popenwithshell=Trueandsubprocess.runto manage local development servers and execute automation scripts. Whileshell=Trueis generally risky, it is used here as intended to support standard development workflows (e.g.,cd && npm run dev) on the user's local machine. - PROMPT_INJECTION (LOW): The skill implements an 'Indirect Prompt Injection' surface (Category 8). It ingests untrusted data from local web applications via Playwright's
page.content(),inner_text(), and screenshot capabilities. - Ingestion points:
examples/element_discovery.pyandSKILL.md(Reconnaissance pattern) read DOM content and screenshots. - Boundary markers: Absent. The skill does not explicitly instruct the agent to ignore instructions found within the HTML content.
- Capability inventory:
scripts/with_server.pyand the example scripts have full capability to execute shell commands and write to the file system (/mnt/user-data/outputs/). - Sanitization: None. Data from the page is directly printed or used to identify selectors for subsequent actions.
- Assessment: This is a known risk for any browser-based agent skill. The severity is LOW as it relies on the LLM's internal safety filters to ignore adversarial instructions embedded in web content.
Audit Metadata