webapp-testing

Fail

Audited by Gen Agent Trust Hub on Feb 27, 2026

Risk Level: HIGHCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The script scripts/with_server.py uses subprocess.Popen with shell=True to execute commands provided via the --server argument. This is a high-risk pattern that allows for shell injection if the input contains shell metacharacters, potentially enabling arbitrary command execution on the host system.
  • [COMMAND_EXECUTION]: The skill uses the agent-browser CLI tool through a broad Bash(agent-browser:*) permission. This grants the agent extensive capabilities to interact with the system and network based on commands that may be influenced by untrusted external data.
  • [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection due to its core function of processing external web content.
  • Ingestion points: The agent-browser snapshot and agent-browser open commands read the Document Object Model (DOM) and interactive element descriptions from potentially untrusted web pages into the agent's working context.
  • Boundary markers: The skill does not implement delimiters or provide explicit instructions to the agent to treat the retrieved web content as untrusted data or to ignore embedded instructions.
  • Capability inventory: The agent has the ability to execute shell commands via the scripts/with_server.py wrapper and perform complex, stateful browser interactions that could be subverted by an attacker.
  • Sanitization: No sanitization, validation, or filtering of the retrieved HTML/DOM content is performed before it is presented to the language model for analysis and decision-making.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 27, 2026, 04:45 AM