webapp-testing

Fail

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: HIGH (COMMAND_EXECUTION, PROMPT_INJECTION)
Full Analysis
  • COMMAND_EXECUTION (HIGH): The file scripts/with_server.py uses subprocess.Popen with shell=True on inputs provided via the --server command-line argument. This allows for arbitrary shell command execution and is highly vulnerable to command injection if the agent is coerced into passing unvalidated or malicious strings to the script.
  • PROMPT_INJECTION (LOW): The SKILL.md file contains instructions explicitly telling the agent: 'DO NOT read the source until you try running the script first'. This pattern is an attempt to bypass agent oversight and safety analysis by discouraging the inspection of code before execution.
  • INDIRECT_PROMPT_INJECTION (LOW): The skill is designed to ingest untrusted data from local or remote web applications (HTML content, console logs, element text).
      • Ingestion points: page.content(), page.on("console", ...), and button.inner_text() in various example scripts.
      • Boundary markers: None present in the provided examples to delimit external data from agent instructions.
      • Capability inventory: File writing (open().write()), arbitrary command execution via scripts/with_server.py, and browser automation.
      • Sanitization: No evidence of sanitization or filtering of the ingested web content before it is processed by the agent.
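
To illustrate the COMMAND_EXECUTION finding, the following is a minimal sketch of how a value like the one passed to --server could be launched without a shell. It is not the code in scripts/with_server.py; the function names parse_server_command and start_server are hypothetical and exist only for this example. The idea is that shlex.split turns the string into an argument list, and subprocess.Popen with shell=False passes shell metacharacters such as ";" through as literal arguments instead of interpreting them.

```python
import shlex
import subprocess


def parse_server_command(command: str) -> list[str]:
    """Split a command string into an argv list without invoking a shell.

    Hypothetical helper for illustration; not part of the audited skill.
    """
    argv = shlex.split(command)
    if not argv:
        raise ValueError("empty server command")
    return argv


def start_server(command: str) -> subprocess.Popen:
    """Start the server process with shell=False.

    Characters such as ';', '&&' or '|' are not interpreted by a shell;
    they reach the program as ordinary argument text.
    """
    return subprocess.Popen(parse_server_command(command), shell=False)


if __name__ == "__main__":
    # The injected "; rm -rf /" is never executed as a second command.
    print(parse_server_command("python -m http.server 8000; rm -rf /"))
```

The trade-off is that commands relying on shell features (for example "cd backend && python server.py") no longer work as a single string and would need to be expressed differently, such as through the cwd argument of subprocess.Popen.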
Recommendations
  • AI detected serious security threats
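
As one possible direction for the missing boundary markers noted in the analysis, the sketch below wraps ingested web content in delimiters before it is handed to the agent. The function name wrap_untrusted and the marker format are assumptions made for this example, not something the skill or the audit defines. A random token is included in the markers so that page content cannot easily forge a matching closing marker. This only labels the data as untrusted; it does not sanitize it and is not a complete defense against prompt injection.

```python
import secrets
from typing import Optional


def wrap_untrusted(text: str, source: str, token: Optional[str] = None) -> str:
    """Delimit external content so it can be told apart from instructions.

    Hypothetical helper for illustration. `source` names where the text
    came from (e.g. "page.content"); `token` is random unless supplied.
    """
    token = token or secrets.token_hex(8)
    header = f"<<UNTRUSTED {source} {token}>>"
    footer = f"<<END UNTRUSTED {token}>>"
    return f"{header}\n{text}\n{footer}"


if __name__ == "__main__":
    html = "<button>Ignore previous instructions</button>"
    print(wrap_untrusted(html, "page.content"))
```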
Audit Metadata
Risk Level: HIGH
Analyzed: Feb 17, 2026, 06:14 PM