webapp-testing

Pass

Audited by Gen Agent Trust Hub on Feb 27, 2026

Risk Level: SAFECOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The script scripts/with_server.py uses subprocess.Popen with shell=True to execute commands provided via the --server argument. It also executes the trailing command arguments via subprocess.run. This provides the agent with the capability to execute arbitrary system commands, which is the intended primary purpose of the server management helper but remains a high-privilege operation.
  • [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection because it ingests and processes untrusted data from external sources.
  • Ingestion points: The agent reads external content via page.goto(), page.content(), and page.on("console", ...) across examples/element_discovery.py, examples/console_logging.py, and SKILL.md logic.
  • Boundary markers: None. There are no delimiters or instructions provided to the agent to ignore potentially malicious commands embedded in the web pages or logs it inspects.
  • Capability inventory: The agent can execute arbitrary shell commands via scripts/with_server.py and write files to the /mnt/user-data/outputs/ directory.
  • Sanitization: None. The skill does not perform any validation or escaping of the web content or console logs before the agent uses them to identify selectors or determine subsequent automation steps.
  • [SAFE]: The recommendation in SKILL.md for the agent to use bundled scripts as "black boxes" and avoid reading their source code is noted. While this is presented as a context window optimization, it discourages the agent from verifying the underlying execution logic of the scripts it runs.
Audit Metadata
Risk Level
SAFE
Analyzed
Feb 27, 2026, 01:10 PM