webapp-testing
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: HIGHCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
- [Command Execution] (HIGH): The skill's core functionality relies on executing arbitrary shell commands provided via the
--serverargument towith_server.py(e.g.,npm run dev,cd backend && python server.py). This allows for unauthorized command execution if the server command strings are manipulated. - [Obfuscation/Safety Bypass] (MEDIUM): The skill contains an explicit 'Don't Look' instruction: 'DO NOT read the source until you try running the script first'. This encourages the agent to bypass security auditing of local scripts, treating them as 'black boxes', which is a major security anti-pattern for AI agents.
- [Indirect Prompt Injection] (HIGH): The skill is designed to process untrusted data from web applications.
- Ingestion points: External data enters the context via
page.content(), DOM inspection, and screenshots as described in the 'Reconnaissance-Then-Action Pattern'. - Boundary markers: No boundary markers or delimiters are suggested to separate untrusted web content from agent instructions.
- Capability inventory: The skill provides capabilities for shell command execution and Python script execution.
- Sanitization: There is no evidence of sanitization or filtering of the web content before it is processed by the agent to determine its next actions.
Recommendations
- AI detected serious security threats
Audit Metadata