webapp-testing
Fail
Audited by Gen Agent Trust Hub on Feb 19, 2026
Risk Level: HIGHCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION] (HIGH): The script 'scripts/with_server.py' uses 'subprocess.Popen(shell=True)' to execute commands provided via the '--server' argument. This is a high-risk pattern that allows for arbitrary shell command injection if the input strings are not strictly validated.
- [PROMPT_INJECTION] (MEDIUM): 'SKILL.md' includes instructions ('DO NOT read the source until you try running the script first') that attempt to override the agent's standard behavior of analyzing code for safety before execution. This serves as an evasion tactic to prevent the agent from identifying potentially risky code logic.
- [INDIRECT_PROMPT_INJECTION] (LOW): The skill processes untrusted web content which can manipulate agent behavior. 1. Ingestion points: 'page.content()' and 'button.inner_text()' in 'examples/element_discovery.py'. 2. Boundary markers: Absent. 3. Capability inventory: Arbitrary shell execution via 'scripts/with_server.py'. 4. Sanitization: Absent.
- [DATA_EXPOSURE] (LOW): 'examples/console_logging.py' captures and writes browser console logs to '/mnt/user-data/outputs/console.log', which could inadvertently leak sensitive session information or application data to a persistent file.
Recommendations
- AI detected serious security threats
Audit Metadata