webapp-testing

Warn

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: MEDIUM (COMMAND_EXECUTION, REMOTE_CODE_EXECUTION, PROMPT_INJECTION)
Full Analysis
  • [COMMAND_EXECUTION] (MEDIUM): The script scripts/with_server.py passes server commands supplied as command-line arguments to subprocess.Popen with shell=True. Because the shell interprets the entire string, this is a common vector for command injection if inputs are not strictly validated.
  • [COMMAND_EXECUTION] (MEDIUM): scripts/with_server.py executes arbitrary trailing commands using subprocess.run(args.command), allowing the agent or an attacker to run any system command.
  • [REMOTE_CODE_EXECUTION] (MEDIUM): The skill's core design relies on the agent writing and executing its own Python scripts using Playwright to interact with local servers, which constitutes dynamic code generation and execution.
  • [PROMPT_INJECTION] (LOW): SKILL.md includes meta-instructions ("DO NOT read the source until you try running the script first") that attempt to influence the agent's investigative behavior and prioritize execution over inspection.
  • [DATA_EXPOSURE] (SAFE): The example scripts (examples/console_logging.py, examples/element_discovery.py, examples/static_html_automation.py) write screenshots and log files to /tmp and /mnt/user-data/outputs/. This is expected behavior for a testing toolkit and is considered safe within the intended environment.
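
To illustrate the shell=True finding, the following is a minimal sketch, not the actual code of scripts/with_server.py. The function names run_with_shell and run_without_shell and the sample command string are hypothetical, and the example assumes a POSIX system with an echo binary on the PATH. It contrasts handing a whole string to the shell with splitting it into an argument list first.

```python
import shlex
import subprocess


def run_with_shell(command: str) -> str:
    """Flagged pattern: the shell interprets the whole string,
    so metacharacters such as ';' start additional commands."""
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout


def run_without_shell(command: str) -> str:
    """Hedged alternative: split into an argument list and run without a shell,
    so ';' reaches the program as literal text instead of a command separator."""
    argv = shlex.split(command)
    result = subprocess.run(argv, capture_output=True, text=True)
    return result.stdout


# Hypothetical untrusted input that smuggles in a second command.
untrusted = "echo server-started; echo INJECTED"

shell_output = run_with_shell(untrusted)
safe_output = run_without_shell(untrusted)

print("shell=True :", shell_output.splitlines())
print("argv list  :", safe_output.splitlines())
```

With shell=True both echo commands run; with the argument list, a single echo receives the semicolon as part of its arguments. Note that avoiding the shell only narrows the injection surface: a script that is designed to run arbitrary trailing commands still does so, which is why the second finding stands regardless.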
Audit Metadata
Risk Level: MEDIUM
Analyzed: Feb 17, 2026, 06:27 PM