webapp-testing
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: HIGH (COMMAND_EXECUTION, REMOTE_CODE_EXECUTION, PROMPT_INJECTION)
Full Analysis
- [Command Execution] (HIGH): The skill provides a helper script, scripts/with_server.py, that accepts arbitrary shell commands via the --server flag (e.g., python scripts/with_server.py --server "npm run dev"). This allows arbitrary command execution on the host system.
- [Indirect Prompt Injection] (HIGH): The skill's core purpose is to interact with web applications. This creates a large attack surface where malicious content within a web page (e.g., hidden HTML comments or visible text) could trick the agent into performing unintended actions or exfiltrating data, especially given the agent's high-privilege capabilities (file system access, browser control).
- [Prompt Injection] (MEDIUM): The instructions explicitly tell the agent: 'DO NOT read the source until you try running the script first' and 'These scripts exist to be called directly as black-box scripts rather than ingested into your context window.' This discourages the agent from verifying the safety of the code it executes, which is a common obfuscation tactic to hide malicious behavior within the provided scripts.
- [Dynamic Execution] (HIGH): The 'Decision Tree' and 'Example' sections guide the agent to dynamically generate and execute Python Playwright scripts. Executing dynamically generated code based on potentially untrusted web content (selectors, page structure) significantly increases the risk of exploitation.
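To make the Command Execution finding concrete, here is a minimal sketch of how a wrapper could vet a --server value before launching it. It does not describe how scripts/with_server.py works. The function name validate_server_command, the ALLOWED_EXECUTABLES set, and the rejected character list are all hypothetical choices for illustration. An allowlist like this only narrows the attack surface: a permitted command such as npm run dev still runs whatever the project's package.json defines.

```python
import shlex
from typing import List

# Hypothetical allowlist for illustration; not taken from the audited skill.
ALLOWED_EXECUTABLES = {"npm", "node"}

# Characters that let a single string chain or redirect commands in a shell.
SHELL_METACHARACTERS = set(";&|`$<>\n")


def validate_server_command(command: str) -> List[str]:
    """Split a --server value into argv, rejecting anything off the allowlist.

    Raises ValueError if the command is empty, contains shell
    metacharacters, or starts with an executable that is not allowed.
    """
    if any(ch in SHELL_METACHARACTERS for ch in command):
        raise ValueError("shell metacharacters are not permitted")

    argv = shlex.split(command)
    if not argv:
        raise ValueError("empty command")

    if argv[0] not in ALLOWED_EXECUTABLES:
        raise ValueError(f"executable not allowed: {argv[0]!r}")

    return argv


if __name__ == "__main__":
    # The example command quoted in the finding passes the check.
    print(validate_server_command("npm run dev"))
```

The returned list is meant to be passed to subprocess.Popen as an argument vector with shell=False (the default), so the string is never interpreted by a shell.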
Recommendations
- AI detected serious security threats
Audit Metadata