browsing-with-playwright
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: HIGHPROMPT_INJECTIONEXTERNAL_DOWNLOADSREMOTE_CODE_EXECUTIONCOMMAND_EXECUTION
Full Analysis
- Indirect Prompt Injection (HIGH): The skill is designed to process untrusted data from the web, creating a significant attack surface for indirect instructions. * Ingestion points: Tools like
browser_navigateandbrowser_snapshot(inSKILL.md) bring external, attacker-controlled content into the agent's reasoning context. * Boundary markers: Absent. There are no delimiters or specific instructions provided to the agent to ignore or isolate commands found within web pages. * Capability inventory: The skill provides dangerous capabilities includingbrowser_click,browser_type,browser_fill_form, andbrowser_run_code(JS execution). * Sanitization: Absent. Web content is processed without escaping or filtering, allowing embedded instructions to potentially hijack the agent's workflow. - Unverifiable Dependencies (LOW): The
scripts/start-server.shscript executesnpx @playwright/mcp@latest. While the@playwrightscope is associated with Microsoft (a trusted source), fetching@latestat runtime introduces a risk of supply chain shifts. Per [TRUST-SCOPE-RULE], this download is downgraded to LOW/INFO due to the trusted provider, but remains a notable behavior. - Remote Code Execution (HIGH): The
browser_run_codeandbrowser_evaluatetools documented inSKILL.mdallow the agent to execute arbitrary JavaScript within the browser context. When combined with navigation to untrusted sites, this allows an attacker to potentially influence the agent to execute malicious scripts locally within the browser session. - Command Execution (LOW): The skill relies on several local shell scripts (
start-server.sh,stop-server.sh) and subprocess calls inverify.pyto manage the server lifecycle. These are standard operational requirements but represent a local execution footprint.
Recommendations
- AI detected serious security threats
Audit Metadata