webapp-testing
Fail
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: HIGH (COMMAND_EXECUTION, REMOTE_CODE_EXECUTION)
Full Analysis
- COMMAND_EXECUTION (HIGH): The script `scripts/with_server.py` uses `subprocess.Popen` with the `shell=True` parameter to execute commands provided via the `--server` argument. This allows arbitrary shell command execution and is highly vulnerable to shell injection attacks.
- SOCIAL_ENGINEERING (MEDIUM): The documentation in `SKILL.md` explicitly instructs the agent to avoid reading the source code of scripts before running them. This is an adversarial pattern intended to prevent the agent from identifying the dangerous execution logic within the helper scripts.
- INDIRECT_PROMPT_INJECTION (LOW): The skill is designed to browse untrusted web pages and inspect their DOM content, which presents an attack surface where malicious websites could attempt to influence the agent's actions. Evidence Chain:
  1. Ingestion point: `page.goto()` and `page.content()` in `examples/element_discovery.py`.
  2. Boundary markers: None present.
  3. Capability inventory: Arbitrary shell execution via `with_server.py` and file system writes.
  4. Sanitization: No sanitization of web content before it is used to decide on subsequent actions.
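The COMMAND_EXECUTION finding can be illustrated with a minimal sketch of one common mitigation: splitting the command string into an argument list and launching it without a shell. The function names `build_argv` and `start_server` are hypothetical and are not taken from `with_server.py`; this shows the general technique, not the script's actual code or a verified patch for it.

```python
import shlex
import subprocess


def build_argv(command: str) -> list[str]:
    """Split a command string into an argument list (hypothetical helper)."""
    argv = shlex.split(command)
    if not argv:
        raise ValueError("empty server command")
    return argv


def start_server(command: str) -> subprocess.Popen:
    """Launch the command without a shell (hypothetical helper).

    With shell=False (the default), characters such as ';' or '&&'
    are passed to the program as literal arguments and are never
    interpreted by a shell.
    """
    return subprocess.Popen(build_argv(command))
```

The trade-off is that commands relying on shell features (for example `cd dir && npm start`) no longer work as a single string; a working directory would instead have to be passed through the `cwd` parameter of `subprocess.Popen`.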
Recommendations
- AI detected serious security threats
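For the "Boundary markers: None present" item in the evidence chain, the following is a minimal sketch of what a boundary marker could look like: untrusted page text is wrapped in randomized delimiters before it reaches the agent. The function name `wrap_untrusted` and the marker format are assumptions made for illustration; they are not part of the audited skill. Markers of this kind reduce, but do not eliminate, the risk of indirect prompt injection.

```python
import secrets


def wrap_untrusted(content: str) -> str:
    """Enclose untrusted text in randomized boundary markers.

    Hypothetical helper: the random token makes it hard for page
    content to forge a matching closing marker in advance.
    """
    token = secrets.token_hex(8)
    return (
        f"<<UNTRUSTED_CONTENT {token}>>\n"
        f"{content}\n"
        f"<<END_UNTRUSTED_CONTENT {token}>>"
    )
```

In use, the string returned by `page.content()` would be passed through such a wrapper, and the agent would be instructed to treat everything between the markers as data rather than as instructions.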
Audit Metadata