webapp-testing

Warn

Audited by Gen Agent Trust Hub on Mar 1, 2026

Risk Level: MEDIUM (COMMAND_EXECUTION, PROMPT_INJECTION)
Full Analysis
  • [COMMAND_EXECUTION]: The helper script scripts/with_server.py uses subprocess.Popen with shell=True to execute arbitrary strings provided as server commands. Because the string is handed to a shell, chaining and redirection operators are interpreted, so an attacker who can influence the command string can run additional commands.
  • [COMMAND_EXECUTION]: The core design of the skill encourages the agent to write and execute "native Python Playwright scripts." This gives the agent the capability to execute any arbitrary Python code, effectively bypassing typical constraints on fixed-tooling skills.
  • [PROMPT_INJECTION]: The "Reconnaissance-Then-Action" pattern in SKILL.md creates a significant surface for Indirect Prompt Injection. A malicious web application under test could serve content containing hidden instructions that the agent might extract and follow during its discovery phase.
  • [PROMPT_INJECTION]: Evidence Chain for Indirect Injection:
      • Ingestion points: page.goto(), page.content(), and DOM inspection methods found in SKILL.md, element_discovery.py, and console_logging.py.
      • Boundary markers: Absent. The agent is not instructed to use delimiters or ignore instructions found within the application content.
      • Capability inventory: Extensive capabilities including arbitrary shell command execution via with_server.py and arbitrary Python execution via script generation.
      • Sanitization: Absent. No filtering or validation is performed on the content retrieved from the web page before the agent uses it to decide its next action.
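
The two gaps noted above (shell=True in the server helper, and no boundary markers around page content) could be narrowed along the following lines. This is a minimal sketch and not the skill's actual code: start_server, wrap_untrusted, and the marker strings are hypothetical names introduced here, and it assumes server commands are plain program invocations that do not need shell features.

```python
import shlex
import subprocess
import sys

# Hypothetical marker strings for illustration; they do not come from SKILL.md.
BEGIN = "<<<UNTRUSTED_PAGE_CONTENT>>>"
END = "<<<END_UNTRUSTED_PAGE_CONTENT>>>"


def start_server(command: str) -> subprocess.Popen:
    """Start a server command without a shell (hypothetical helper)."""
    argv = shlex.split(command)  # POSIX-style tokenization, no shell involved
    if not argv:
        raise ValueError("empty server command")
    # shell=False is the default: "&&", ";", "|" and ">" reach the program
    # as literal arguments instead of being interpreted by a shell.
    return subprocess.Popen(argv)


def wrap_untrusted(page_text: str) -> str:
    """Delimit page content so the agent can be told to treat it as data."""
    # Replace any copies of the markers inside the page text so the content
    # cannot close the block early or open a second one.
    cleaned = page_text.replace(BEGIN, "[marker removed]")
    cleaned = cleaned.replace(END, "[marker removed]")
    return f"{BEGIN}\n{cleaned}\n{END}"
```

Dropping the shell means commands such as "cd backend && python server.py" no longer work as written; the cwd argument of subprocess.Popen is one way to cover that case. The delimiters do not stop injection on their own. They only help if the agent's instructions also say that anything between the markers is data and must not be followed as instructions.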
Audit Metadata
Risk Level: MEDIUM
Analyzed: Mar 1, 2026, 07:31 AM