webapp-testing

Fail

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: HIGH (COMMAND_EXECUTION, CREDENTIALS_UNSAFE)
Full Analysis
  • [COMMAND_EXECUTION] (HIGH): The script 'scripts/with_server.py' uses 'subprocess.Popen(shell=True)' to execute strings provided via the '--server' argument, which enables arbitrary shell command injection (Evidence: scripts/with_server.py, line 85).
  • [CREDENTIALS_UNSAFE] (MEDIUM): 'test_links_flow.py' contains a hardcoded password '123456' used for automated login (Evidence: test_links_flow.py, line 38).
  • [PROMPT_INJECTION] (MEDIUM): The SKILL.md file contains a metadata poisoning attempt by instructing the agent 'DO NOT read the source until you try running the script first', discouraging security inspection of the scripts.
  • [DATA_EXFILTRATION] (LOW): Multiple scripts save screenshots and console logs to predictable paths (/tmp/, /mnt/user-data/outputs/) and print session cookies to the console (Evidence: test_links_flow.py, examples/console_logging.py).
  • [INDIRECT_PROMPT_INJECTION] (LOW): The skill processes untrusted web data using Playwright, creating an attack surface that could be exploited via indirect prompt injection to trigger the existing command execution capabilities. Ingestion points: page.content(), page.locator().all() in element_discovery.py and test_links_flow.py. Capability inventory: subprocess.Popen (shell=True) and subprocess.run in scripts/with_server.py. Boundary markers: Absent. Sanitization: Absent.
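To illustrate the COMMAND_EXECUTION finding, the following is a minimal sketch of launching a server command without a shell. It is not the code in 'scripts/with_server.py'; the helper names build_argv and start_server are hypothetical, and the sketch assumes POSIX-style quoting in the command string.

```python
import shlex
import subprocess


def build_argv(server_command: str) -> list[str]:
    """Split a command string into an argument list without invoking a shell."""
    argv = shlex.split(server_command)
    if not argv:
        raise ValueError("empty server command")
    return argv


def start_server(server_command: str) -> subprocess.Popen:
    """Start the server process (hypothetical helper, not from the audited skill)."""
    # With shell=False (the default), the argument list is passed directly to
    # the operating system, so shell metacharacters such as ';', '&&', or
    # '$(...)' are treated as literal text instead of being interpreted.
    return subprocess.Popen(build_argv(server_command))
```

With this approach a value such as "echo hi; rm -rf /" becomes a single 'echo' invocation with literal arguments rather than two shell commands. The trade-off is that commands relying on shell features (for example 'cd dir && npm start') would no longer work and would need an alternative such as the 'cwd' parameter of subprocess.Popen.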
Recommendations
  • AI detected serious security threats
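One possible way to address the CREDENTIALS_UNSAFE and DATA_EXFILTRATION findings is sketched below. This is an illustration under stated assumptions, not part of the audit or the audited skill: the environment variable name WEBAPP_TEST_PASSWORD and the helpers get_test_password and make_output_dir are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def get_test_password(env_var: str = "WEBAPP_TEST_PASSWORD") -> str:
    """Read the login password from the environment instead of hardcoding it."""
    password = os.environ.get(env_var)
    if not password:
        raise RuntimeError(f"set {env_var} before running the login test")
    return password


def make_output_dir() -> Path:
    """Create a fresh directory for screenshots and console logs."""
    # tempfile.mkdtemp creates a uniquely named directory that is accessible
    # only to the creating user, unlike a fixed path under /tmp/.
    return Path(tempfile.mkdtemp(prefix="webapp-test-"))
```

A test script could then call get_test_password() when filling the login form and write screenshots under make_output_dir(), while avoiding printing session cookies to the console.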
Audit Metadata
Risk Level: HIGH
Analyzed: Feb 17, 2026, 06:28 PM