webapp-testing

Warn

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: MEDIUM (COMMAND_EXECUTION, REMOTE_CODE_EXECUTION)
Full Analysis
  • COMMAND_EXECUTION (MEDIUM): The helper script scripts/with_server.py calls subprocess.Popen with shell=True to run server commands, which allows arbitrary shell commands to execute. Because the agent generates these commands, malicious prompts or web content could influence them. The severity is reduced from HIGH to MEDIUM because launching server commands is a core requirement for the skill's primary purpose of testing local web apps.
  • REMOTE_CODE_EXECUTION (MEDIUM): The skill relies on the agent dynamically writing and executing Python Playwright scripts. This execution model is inherently risky if the logic within those scripts is derived from untrusted external data.
  • INDIRECT_PROMPT_INJECTION (LOW): The skill creates a significant attack surface by reading untrusted content from web pages to identify UI elements and determine actions. Mandatory Evidence:
      1. Ingestion points: page.content() in SKILL.md and inner_text() in examples/element_discovery.py.
      2. Boundary markers: Absent; the skill gives no guidance for delimiting web content or for ignoring instructions embedded in it.
      3. Capability inventory: Execution of shell commands via with_server.py and Python script execution.
      4. Sanitization: Absent; the agent is encouraged to use raw discovered text for selectors.
Audit Metadata
Risk Level: MEDIUM
Analyzed: Feb 17, 2026, 06:39 PM