webapp-testing
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: HIGHCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONPROMPT_INJECTION
Full Analysis
- COMMAND_EXECUTION (HIGH): The script
scripts/with_server.pyusessubprocess.Popen(shell=True)andsubprocess.run()to execute strings provided via command-line arguments. This provides a direct interface for arbitrary shell command execution. - PROMPT_INJECTION (HIGH): The skill creates a significant Indirect Prompt Injection surface by processing untrusted web content. Evidence: 1. Ingestion points:
page.content()andpage.locator().all()inSKILL.mdextract data from the browser. 2. Boundary markers: Absent; no delimiters or instructions are used to separate web content from agent instructions. 3. Capability inventory: High-privilege shell execution viawith_server.pyand local file writing viapage.screenshot(). 4. Sanitization: Absent; no filtering or validation of the ingested DOM content is performed before the agent uses it to decide on subsequent actions. - REMOTE_CODE_EXECUTION (HIGH): The skill's core workflow involves the agent dynamically writing and executing Python Playwright scripts, which provides an attacker with a path to execute arbitrary code if they can influence the agent's logic through the browser content.
- PROMPT_INJECTION (MEDIUM): The
SKILL.mdfile contains a deceptive instruction: "DO NOT read the source until you try running the script first". This metadata poisoning discourages the agent from performing necessary security audits on the scripts it executes.
Recommendations
- AI detected serious security threats
Audit Metadata