webapp-testing
Fail
Audited by Gen Agent Trust Hub on Mar 1, 2026
Risk Level: HIGHCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION]: The script
scripts/with_server.pyis designed to execute arbitrary shell commands provided via the--serverargument usingsubprocess.Popenwithshell=True. It also executes a secondary command provided as a positional argument usingsubprocess.run. This pattern allows an attacker who can influence the agent's parameters to execute unauthorized commands on the host system. - [PROMPT_INJECTION]: The
SKILL.mdfile contains instructions that explicitly tell the agent not to read the source code of the scripts before running them ("DO NOT read the source until you try running the script first"). This discourages the model from performing security checks on the dangerous command execution logic within the helper scripts. - [PROMPT_INJECTION]: The skill is highly vulnerable to indirect prompt injection because its primary function is to scrape and interact with web application data which is then processed by the agent.
- Ingestion points: Data is ingested from the browser via
page.content(),button.inner_text(), and browser console logs inexamples/element_discovery.pyandexamples/console_logging.py. - Boundary markers: There are no boundary markers or specific instructions to ignore malicious content found within the web pages or logs being tested.
- Capability inventory: The agent has access to powerful capabilities, including arbitrary shell execution via
scripts/with_server.pyand the ability to write files to the local filesystem. - Sanitization: No sanitization or validation is performed on the data retrieved from the web browser before it is used by the agent to make decisions or identify selectors.
Recommendations
- AI detected serious security threats
Audit Metadata