webapp-testing

Warn

Audited by Gen Agent Trust Hub on Mar 3, 2026

Risk Level: MEDIUM (COMMAND_EXECUTION)
Full Analysis
  • [COMMAND_EXECUTION]: The helper script scripts/with_server.py executes arbitrary shell commands supplied through its command-line interface.
      • Evidence: The script uses subprocess.Popen(server['cmd'], shell=True) to start background servers and subprocess.run(args.command) to execute the primary automation command.
      • The use of shell=True increases the risk of command injection if the input strings (e.g., server start commands) are derived from untrusted sources.
  • [INDIRECT_PROMPT_INJECTION]: The skill is designed to have the agent ingest and process data from local web applications, which creates a significant attack surface.
      • Ingestion points: SKILL.md and examples/element_discovery.py instruct the agent to read the rendered DOM (page.content(), page.locator().all()), and examples/console_logging.py instructs it to capture and process browser console logs.
      • Boundary markers: No explicit boundary markers or 'ignore' instructions are provided to separate the untrusted web content from the agent's internal instructions.
      • Capability inventory: The agent can execute arbitrary shell commands via the scripts/with_server.py utility.
      • Sanitization: There is no evidence that content retrieved from the web applications is sanitized or validated before the agent processes it.
      • Risk: An attacker-controlled or compromised web application could embed malicious instructions within HTML elements or console logs that the agent might interpret as valid commands.
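The two findings above can be illustrated with a minimal Python sketch of possible mitigations. It is not taken from scripts/with_server.py or SKILL.md: the function names (split_command, run_without_shell, wrap_untrusted) and the marker format are hypothetical, chosen only for illustration. The first part assumes the command can be tokenized with shlex and run without a shell; the second assumes the agent's prompt can carry delimited text. Delimiting untrusted content does not by itself guarantee that an agent will ignore instructions inside it.

```python
import secrets
import shlex
import subprocess
import sys
from typing import List, Optional


def split_command(cmd: str) -> List[str]:
    """Tokenize a command string into an argument list without invoking a shell."""
    return shlex.split(cmd)


def run_without_shell(cmd: str) -> subprocess.CompletedProcess:
    """Run a command with shell=False (the default for a list of arguments).

    Shell metacharacters such as ';' or '&&' are passed to the program as
    literal arguments instead of being interpreted by a shell.
    """
    return subprocess.run(
        split_command(cmd), capture_output=True, text=True, check=False
    )


def wrap_untrusted(content: str, nonce: Optional[str] = None) -> str:
    """Surround web-derived text with boundary markers (hypothetical format).

    A random nonce makes it hard for the content itself to forge a matching
    end marker.
    """
    nonce = nonce or secrets.token_hex(8)
    return (
        f"<<<UNTRUSTED_WEB_CONTENT {nonce}>>>\n"
        f"{content}\n"
        f"<<<END_UNTRUSTED_WEB_CONTENT {nonce}>>>"
    )
```

With this approach, a string such as "echo hello; touch pwned" is split into the arguments echo, hello;, touch, and pwned, so the second command is never executed separately. Text returned by page.content() or collected from console logs could be passed through wrap_untrusted before being shown to the agent.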
Audit Metadata
Risk Level: MEDIUM
Analyzed: Mar 3, 2026, 12:06 PM