webapp-testing

Fail

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: HIGH (COMMAND_EXECUTION, DATA_EXFILTRATION, PROMPT_INJECTION)
Full Analysis
  • COMMAND_EXECUTION (HIGH): The script scripts/with_server.py uses subprocess.Popen with shell=True to execute commands passed via the --server flag. This creates a critical command injection vulnerability where any string provided to the script is executed by the system shell.
  • DATA_EXFILTRATION (HIGH): The skill demonstrates and encourages the use of file:// URLs within Playwright to access local files. This capability allows the agent to read sensitive local data, which can then be captured via screenshots or page.content() and exfiltrated.
  • PROMPT_INJECTION (MEDIUM): The SKILL.md file contains a directive ('DO NOT read the source until you try running the script first') that discourages the AI from inspecting its own scripts. This is an obfuscation-like tactic that prevents the agent from detecting the dangerous shell execution patterns in its dependencies.
  • DATA_EXFILTRATION (LOW): Indirect Prompt Injection vulnerability. The skill ingests untrusted data from external websites during testing. Ingestion points: page.goto() and page.content() in example scripts. Boundary markers: None provided in instructions or scripts. Capability inventory: Arbitrary shell execution via with_server.py and file system writes in /mnt/user-data/. Sanitization: None performed on external content.
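
The COMMAND_EXECUTION finding above turns on the difference between passing a string to the system shell and passing an argument list directly to the operating system. The following is a minimal sketch of the second approach, not the actual contents of scripts/with_server.py, which this audit does not reproduce. The function names parse_server_command and start_server are hypothetical, and the sketch assumes the value of --server arrives as a single string.

```python
import shlex
import subprocess


def parse_server_command(command: str) -> list[str]:
    """Split a command string into an argument list without invoking a shell.

    Hypothetical helper: it tokenizes the string with POSIX quoting rules,
    but shell operators such as ';', '&&', '|' or '$(...)' are never
    interpreted. They stay ordinary text inside the arguments.
    """
    argv = shlex.split(command)
    if not argv:
        raise ValueError("empty server command")
    return argv


def start_server(command: str) -> subprocess.Popen:
    """Start the server process with shell=False (the subprocess default)."""
    # argv[0] is executed directly and argv[1:] are handed over as literal
    # arguments, so no shell ever parses the string.
    return subprocess.Popen(parse_server_command(command))
```

With this approach, a value such as "npm run dev; rm -rf /" is split into the arguments npm, run, dev;, rm, -rf and /, all of which go to a single npm process rather than to two shell commands. The sketch does not restrict which executable runs; a caller who controls the string still chooses the program, so an allowlist of permitted commands would be a separate measure.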
Recommendations
  • AI detected serious security threats
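
One possible way to act on the file:// finding above is to check a URL's scheme before handing it to Playwright's page.goto(). The sketch below is an illustration under stated assumptions, not a recommendation issued by the audit itself: the names ALLOWED_SCHEMES and is_navigation_allowed are hypothetical, and the choice to permit only http and https is an assumption that would block the skill's local-file workflow.

```python
from urllib.parse import urlsplit

# Assumption: only web URLs are acceptable navigation targets.
ALLOWED_SCHEMES = {"http", "https"}


def is_navigation_allowed(url: str) -> bool:
    """Return True only when the URL uses an allowlisted scheme.

    Hypothetical helper intended to be called before page.goto(url);
    file:// URLs and scheme-less strings are rejected.
    """
    scheme = urlsplit(url).scheme.lower()
    return scheme in ALLOWED_SCHEMES
```

A caller would skip or abort navigation when is_navigation_allowed(url) returns False. An allowlist is used here rather than a blocklist of file:// alone so that other local or unexpected schemes are rejected by default.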
Audit Metadata
Risk Level: HIGH
Analyzed: Feb 17, 2026, 06:14 PM