webapp-testing
Fail
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: HIGH (COMMAND_EXECUTION, PROMPT_INJECTION)
Full Analysis
- [PROMPT_INJECTION] (HIGH): The documentation in SKILL.md contains a deceptive directive instructing the agent to 'DO NOT read the source until you try running the script first'. This is an attempt to bypass security analysis by treating scripts as 'black boxes', effectively preventing the model from auditing the code for malicious behavior before execution.
- [COMMAND_EXECUTION] (HIGH): The scripts/with_server.py utility uses subprocess.Popen with shell=True to execute commands passed via the --server argument. This allows arbitrary shell command injection on the host environment.
- [PROMPT_INJECTION] (LOW): The 'Reconnaissance-Then-Action' pattern constitutes a significant surface for Indirect Prompt Injection (Category 8). The agent ingests untrusted data from the web application's DOM and console logs to determine its next actions.
  - Ingestion points: page.content(), page.locator().all(), and handle_console_message.
  - Boundary markers: None.
  - Capability inventory: Arbitrary shell command execution via with_server.py and file writing in examples/console_logging.py.
  - Sanitization: None; the agent is encouraged to use discovered content directly as selectors or logic inputs.
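
To illustrate the COMMAND_EXECUTION finding, the following is a minimal sketch of launching a command without a shell. It is not the code of scripts/with_server.py, which this report does not reproduce. The helper names build_argv and run_without_shell are hypothetical, and the sketch assumes a POSIX-style command string, since shlex follows POSIX shell quoting rules.

```python
import shlex
import subprocess
from typing import List


def build_argv(command: str) -> List[str]:
    """Split a command string into an argv list using POSIX quoting rules.

    Hypothetical helper: no shell is involved, so characters such as ';'
    or '&&' stay inside ordinary argument strings.
    """
    return shlex.split(command)


def run_without_shell(command: str, timeout: float = 10.0) -> str:
    """Run the command with shell=False (the default) and return its stdout.

    Because the argv list is passed straight to the program, a trailing
    '; echo injected' is delivered as literal arguments instead of being
    run as a second command, which is what shell=True would do.
    """
    proc = subprocess.Popen(
        build_argv(command),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    out, _err = proc.communicate(timeout=timeout)
    return out
```

With this approach a string such as "echo hi; rm -rf /" is split into the arguments echo, hi;, rm, -rf and /, all of which are handed to a single echo process. This narrows the injection surface but does not remove the risk of running an attacker-chosen executable, so an allowlist of permitted server commands may still be needed.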
Recommendations
- AI detected serious security threats
Audit Metadata