webapp-testing
Fail
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: HIGHCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONPROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION] (HIGH): The skill utilizes a wrapper script
scripts/with_server.pythat takes raw strings as input for the--serverargument and executes them. This allows for arbitrary shell command execution (e.g.,npm run dev,python server.py). - [PROMPT_INJECTION] (HIGH): The instructions include an anti-analysis pattern: 'DO NOT read the source until you try running the script first'. This explicitly directs the agent to execute code without performing a security review, which could be used to hide malicious logic in
scripts/with_server.pyor other local scripts. - [INDIRECT_PROMPT_INJECTION] (HIGH): The skill has a high-risk capability tier for indirect injection.
- Ingestion points: The agent is instructed to use
page.content()andpage.locator()to read data from potentially untrusted local or remote web applications inSKILL.md. - Boundary markers: None. There are no instructions to treat web content as data rather than instructions.
- Capability inventory: The agent can execute shell commands via
with_server.pyand write/run new Python Playwright scripts. - Sanitization: None. The agent directly uses discovered selectors and content to inform its next actions, including command-line operations.
- [COMMAND_EXECUTION] (MEDIUM): The metadata field
scope: [root]suggests the skill is intended to run with elevated privileges, which exacerbates the risk of the command execution patterns found.
Recommendations
- AI detected serious security threats
Audit Metadata