webapp-testing

Fail

Audited by Gen Agent Trust Hub on Feb 20, 2026

Risk Level: HIGHCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • COMMAND_EXECUTION (HIGH): The helper script scripts/with_server.py is designed to execute arbitrary shell commands passed via the --server and trailing arguments. It uses subprocess.Popen with shell=True, which allows for command injection if the arguments are not strictly controlled. The skill instructions in SKILL.md further exacerbate this risk by explicitly discouraging the agent from auditing the script's source code before execution.\n- PROMPT_INJECTION (LOW): The skill is highly susceptible to indirect prompt injection because it reads and processes content from external, potentially untrusted web pages and browser console logs.\n
  • Ingestion points: Content is read into the agent's context using page.content(), locator().inner_text(), and page.on('console') listeners in files like element_discovery.py and console_logging.py.\n
  • Boundary markers: No delimiters or safety instructions are used to distinguish between system instructions and data ingested from the web.\n
  • Capability inventory: The agent has access to powerful capabilities including arbitrary command execution (with_server.py) and filesystem access (os.makedirs, open().write()).\n
  • Sanitization: The skill performs no sanitization or validation of the text extracted from the browser, which could contain instructions intended to hijack the agent's behavior.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 20, 2026, 02:48 PM