webapp_testing
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: HIGHCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONPROMPT_INJECTION
Full Analysis
- COMMAND_EXECUTION (HIGH): The script
scripts/with_server.pyutilizessubprocess.Popenwithshell=Trueto execute strings provided via the--servercommand-line argument. It also usessubprocess.runto execute the trailing command sequence. This allows for arbitrary shell command execution with the privileges of the agent process. Evidence:scripts/with_server.pylines 86-90 and line 105. - REMOTE_CODE_EXECUTION (HIGH): The skill is designed to have the AI agent 'write native Python Playwright scripts' and execute them at runtime. This dynamic code generation and execution, when paired with the system-level command execution capabilities of the provided scripts, constitutes a remote code execution risk. Evidence:
SKILL.mdinstruction block. - PROMPT_INJECTION (HIGH): The skill is highly vulnerable to Indirect Prompt Injection (Category 8). It lacks sanitization and boundary markers for external data.
- Ingestion points: Untrusted data enters the agent context via
page.goto()inexamples/console_logging.py,examples/element_discovery.py, andexamples/static_html_automation.py(which supports localfile://URLs). - Boundary markers: No delimiters or instructions to ignore embedded commands are present in the prompts.
- Capability inventory: The skill has the capability to execute shell commands (
subprocess.Popeninwith_server.py), write files (open().write()inexamples/console_logging.py), and capture system state (screenshots). - Sanitization: No escaping or validation of web content is performed before processing.
- PROMPT_INJECTION (MEDIUM): The
SKILL.mdfile contains a deceptive instruction: 'DO NOT read the source until you try running the script first'. This pattern attempts to discourage the agent or an auditor from inspecting the underlying code, potentially hiding malicious behavior in the large scripts it describes. Evidence:SKILL.mdline 14.
Recommendations
- AI detected serious security threats
Audit Metadata