webapp-testing
Fail
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: HIGH (COMMAND_EXECUTION, REMOTE_CODE_EXECUTION, PROMPT_INJECTION)
Full Analysis
- [COMMAND_EXECUTION] (HIGH): The script `scripts/with_server.py` uses `subprocess.Popen(shell=True)` to execute strings provided via the `--server` argument and `subprocess.run()` for trailing command arguments. This design provides a direct interface for arbitrary shell command execution without any sanitization or validation of the input strings.
- [PROMPT_INJECTION] (MEDIUM): The `SKILL.md` file contains instructions that explicitly tell the AI agent, 'DO NOT read the source until you try running the script first'. This is a deceptive pattern that attempts to bypass the agent's ability to audit executable code before invocation, potentially hiding the high-risk nature of the `shell=True` implementation.
- [REMOTE_CODE_EXECUTION] (HIGH): By combining the ability to execute arbitrary shell commands with the primary function of the skill (navigating to and interacting with web applications), the skill creates a significant surface for Remote Code Execution. An attacker-controlled web page could provide instructions (Indirect Prompt Injection) that trigger the use of `with_server.py` with malicious payloads.
- [INDIRECT_PROMPT_INJECTION] (LOW): The skill possesses a complete evidence chain for indirect injection vulnerabilities:
  - Ingestion points: `page.goto()` in all example scripts and `file://` URL handling in `static_html_automation.py`.
  - Boundary markers: None present to distinguish between trusted instructions and data from web pages.
  - Capability inventory: Full shell access via `scripts/with_server.py` and file system writes via Playwright screenshots and log output.
  - Sanitization: No input validation is performed on command strings before execution.
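To illustrate the COMMAND_EXECUTION finding, the sketch below contrasts passing a command string to `subprocess.run()` with `shell=True` against splitting it into an argument list first. It is a minimal illustration, not the code of `with_server.py` and not a remediation proposed by the audit: the helper names `run_with_shell` and `run_without_shell` and the `payload` string are hypothetical, and the behavior described assumes a POSIX shell.

```python
import shlex
import subprocess
import sys


def run_with_shell(command: str) -> subprocess.CompletedProcess:
    """Hypothetical helper: hand the raw string to the shell (the risky pattern)."""
    return subprocess.run(
        command, shell=True, capture_output=True, text=True, check=False
    )


def run_without_shell(command: str) -> subprocess.CompletedProcess:
    """Hypothetical helper: tokenize the string and execute it without a shell."""
    argv = shlex.split(command)  # POSIX-style splitting; no shell is started
    return subprocess.run(argv, capture_output=True, text=True, check=False)


# Illustrative payload (an assumption, not taken from the audit): a harmless
# command followed by a second command smuggled in after a ';' separator.
payload = f"{shlex.quote(sys.executable)} -c 'print(1)' ; echo INJECTED"

if __name__ == "__main__":
    # With a shell, ';' is a command separator, so the trailing echo also runs.
    print(repr(run_with_shell(payload).stdout))
    # Without a shell, ';', 'echo' and 'INJECTED' are passed to the Python
    # interpreter as plain arguments and are never executed as a command.
    print(repr(run_without_shell(payload).stdout))
```

Avoiding the shell only stops metacharacters such as `;`, `|`, and `&&` from being interpreted. The first token of the string is still executed as a program, so this does not replace validating or allow-listing what the caller may run.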
Recommendations
- AI detected serious security threats
Audit Metadata